From david.holmes at oracle.com Mon Aug 1 00:01:40 2016
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 1 Aug 2016 10:01:40 +1000
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <1c8198e3-0080-4f52-aa24-ef44751d7e71@default>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default>
Message-ID:

Hi Shafi,

On 30/07/2016 1:10 AM, Shafi Ahmad wrote:
> Hi All,
>
> Could I have 2nd Reviewer's review for this change, please?

I didn't see you respond to Coleen's query re JDK9. If this is not applicable to JDK9 please add a 9-na label to the bug report.

Looking at the code:

+   void clear_method(Method* m) {
+     for (JNIMethodBlock* b = this; b != NULL; b = b->_next) {
+       for (int i = 0; i< number_of_methods; i++) {
+         if (b->_methods[i] == m) {
+           b->_methods[i] = NULL;
+         }
+       }
+     }
+     // not found
+   }

Based on the "not found" comment I assume you intended to do a return after NULLing out the method?

Nit: need space after i in i<

---

1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) {
1883   loader_data->jmethod_ids()->clear_method(this);
1884 }

Not sure why you felt the need to add a new member function for this instead of just doing line #1883 directly at line #113

---

>>> After this change I am seeing Method::is_method_id() is getting
>>> called with NULL and I have done below change to avoid crash
>>>
>>> - assert(m != NULL, "should be called with non-null method");
>>> + if (m == NULL) {
>>> +   return false;
>>> + }

This is the only call I can see to is_method_id:

1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) {
1871   if (mid == NULL) return NULL;
1872   if (!Method::is_method_id(mid)) {
1873     return NULL;
1874   }

So I don't see how it can be being passed NULL. If it is then you have a problem!
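As a later reply in this thread explains, `mid` itself is not NULL at this call; it is the `Method*` obtained from `resolve_jmethod_id(mid)` that can be NULL once the entry has been cleared. The sketch below illustrates that situation in isolation. It is not the HotSpot code: `Method` is a placeholder, and treating a `jmethodID` as a plain pointer to a slot holding a `Method*` is a simplifying assumption made only for this example.

```cpp
#include <cstddef>

// Simplified stand-ins, not the HotSpot definitions.
struct Method { int id; };
typedef Method** jmethodID;  // assumption: an ID points at a slot in a block

// Resolving reads the slot; a slot that was cleared yields NULL.
static Method* resolve_jmethod_id(jmethodID mid) {
  return *mid;
}

// Mirrors the proposed change: return false instead of asserting on NULL.
static bool is_method_id(jmethodID mid) {
  Method* m = resolve_jmethod_id(mid);
  if (m == NULL) {
    return false;
  }
  return true;  // any further validation is omitted in this sketch
}

static Method* checked_resolve_jmethod_id(jmethodID mid) {
  if (mid == NULL) return NULL;  // guards the ID, not the slot contents
  if (!is_method_id(mid)) {
    return NULL;
  }
  return *mid;
}
```

With a slot that has been set to NULL, `mid` passes the first check and the NULL only shows up inside `is_method_id`, which is why the assert there could fire.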
Thanks, David ----- > Regards, > Shafi > >> -----Original Message----- >> From: Coleen Phillimore >> Sent: Monday, July 25, 2016 5:52 PM >> To: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in >> Method::checked_resolve_jmethod_id(_jmethodID*) >> >> >> This looks good. Was this a backport or is it still broken in 9? >> thanks, >> Coleen >> >> On 7/25/16 7:53 AM, Shafi Ahmad wrote: >>> Hi, >>> >>> Please review the small code change for bug: "JDK-8161144: Fix for JDK- >> 8147451 failed: Crash in >> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev >>> >>> Summary: >>> Method::deallocate_contents() should clear 'this' from list of Methods in >> JNIMethodBlock, similarly to clear_all_methods() does it, when class is >> unloaded. >>> After this change I am seeing Method::is_method_id() is getting called with >> NULL and I have done below change to avoid crash >>> >>> - assert(m != NULL, "should be called with non-null method"); >>> + if (m == NULL) { >>> + return false; >>> + } >>> >>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 >>> >>> Test: Run jprt >>> >>> Regards, >>> Shafi >> From shafi.s.ahmad at oracle.com Mon Aug 1 06:48:16 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Sun, 31 Jul 2016 23:48:16 -0700 (PDT) Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*) In-Reply-To: References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> Message-ID: <6d84a70a-ff39-47de-884b-3be87d494e1d@default> Hi David, Thanks for the review. 
> -----Original Message-----
> From: David Holmes
> Sent: Monday, August 01, 2016 5:32 AM
> To: Shafi Ahmad; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in
> Method::checked_resolve_jmethod_id(_jmethodID*)
>
> Hi Shafi,
>
> On 30/07/2016 1:10 AM, Shafi Ahmad wrote:
> > Hi All,
> >
> > Could I have 2nd Reviewer's review for this change, please?
>
> I didn't see you respond to Coleen's query re JDK9. If this is not applicable to
> JDK9 please add a 9-na label to the bug report.

I replied to her mail; this is not reproducible in JDK 9. I have updated the bug with the 9-na label.

>
> Looking at the code:
>
> +   void clear_method(Method* m) {
> +     for (JNIMethodBlock* b = this; b != NULL; b = b->_next) {
> +       for (int i = 0; i< number_of_methods; i++) {
> +         if (b->_methods[i] == m) {
> +           b->_methods[i] = NULL;
> +         }
> +       }
> +     }
> +     // not found
> +   }
>
> Based on the "not found" comment I assume you intended to do a return
> after NULLing out the method?

Yes, you are right. I assume the array b->_methods will not contain duplicate entries.

>
> Nit: need space after i in i<

I will send an updated webrev.
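The early return agreed on above can be sketched in isolation. This is a simplified, self-contained stand-in rather than the HotSpot code: `Method`, the layout of `JNIMethodBlock`, and the block size of 8 are placeholders chosen for illustration, and `clear_method` returns a `bool` here only so the outcome can be checked (the reviewed version returns `void`). It relies on the same assumption stated above, namely that a given `Method*` appears at most once.

```cpp
#include <cstddef>

// Hypothetical stand-in; the real HotSpot Method is far richer.
struct Method { int id; };

class JNIMethodBlock {
 public:
  static const int number_of_methods = 8;  // assumed block size for the sketch

  JNIMethodBlock() : _next(NULL) {
    for (int i = 0; i < number_of_methods; i++) _methods[i] = NULL;
  }

  // Clears the first slot that holds m and stops, assuming no duplicates.
  bool clear_method(Method* m) {
    for (JNIMethodBlock* b = this; b != NULL; b = b->_next) {
      for (int i = 0; i < number_of_methods; i++) {
        if (b->_methods[i] == m) {
          b->_methods[i] = NULL;
          return true;
        }
      }
    }
    return false;  // not found
  }

  Method*         _methods[number_of_methods];
  JNIMethodBlock* _next;
};
```

Returning as soon as the entry is found makes the trailing "not found" comment accurate and avoids scanning the remaining blocks.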
Regards, Shafi > > --- > > 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) { > 1883 loader_data->jmethod_ids()->clear_method(this); > 1884 } > > Not sure why you felt the need to add a new member function for this > instead of just doing line #1883 directly at line #113 > > --- > > >>> After this change I am seeing Method::is_method_id() is getting >>> > called with NULL and I have done below change to avoid crash >>> >>> - > assert(m != NULL, "should be called with non-null method"); >>> + if (m == > NULL) { > >>> + return false; > >>> + } > > This is the only call I can see to is_method_id: > > 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) { > 1871 if (mid == NULL) return NULL; > 1872 if (!Method::is_method_id(mid)) { > 1873 return NULL; > 1874 } > > So I don't see how it can be being passed NULL. If it is then you have a > problem! > > Thanks, > David > ----- > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Coleen Phillimore > >> Sent: Monday, July 25, 2016 5:52 PM > >> To: hotspot-runtime-dev at openjdk.java.net > >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash > >> in > >> Method::checked_resolve_jmethod_id(_jmethodID*) > >> > >> > >> This looks good. Was this a backport or is it still broken in 9? > >> thanks, > >> Coleen > >> > >> On 7/25/16 7:53 AM, Shafi Ahmad wrote: > >>> Hi, > >>> > >>> Please review the small code change for bug: "JDK-8161144: Fix for > >>> JDK- > >> 8147451 failed: Crash in > >> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev > >>> > >>> Summary: > >>> Method::deallocate_contents() should clear 'this' from list of > >>> Methods in > >> JNIMethodBlock, similarly to clear_all_methods() does it, when class > >> is unloaded. 
> >>> After this change I am seeing Method::is_method_id() is getting
> >>> called with NULL and I have done below change to avoid crash
> >>>
> >>> - assert(m != NULL, "should be called with non-null method");
> >>> + if (m == NULL) {
> >>> +   return false;
> >>> + }
> >>>
> >>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/
> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144
> >>>
> >>> Test: Run jprt
> >>>
> >>> Regards,
> >>> Shafi
> >>

From shafi.s.ahmad at oracle.com Mon Aug 1 07:01:35 2016
From: shafi.s.ahmad at oracle.com (Shafi Ahmad)
Date: Mon, 1 Aug 2016 00:01:35 -0700 (PDT)
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
Message-ID:

Hi,

Please find the updated webrev. I have incorporated both of David's comments.

http://cr.openjdk.java.net/~shshahma/8161144/webrev.01/

Regards,
Shafi

> -----Original Message-----
> From: Shafi Ahmad
> Sent: Monday, August 01, 2016 12:18 PM
> To: David Holmes; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in
> Method::checked_resolve_jmethod_id(_jmethodID*)
>
> Hi David,
>
> Thanks for the review.
>
> > -----Original Message-----
> > From: David Holmes
> > Sent: Monday, August 01, 2016 5:32 AM
> > To: Shafi Ahmad; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in
> > Method::checked_resolve_jmethod_id(_jmethodID*)
> >
> > Hi Shafi,
> >
> > On 30/07/2016 1:10 AM, Shafi Ahmad wrote:
> > > Hi All,
> > >
> > > Could I have 2nd Reviewer's review for this change, please?
> >
> > I didn't see you respond to Coleen's query re JDK9. If this is not
> > applicable to JDK9 please add a 9-na label to the bug report.
>
> I replied to her mail, this is not reproducible in jdk9. I have updated the bug
> with 9-na label.
> > > > > Looking at the code: > > > > + void clear_method(Method* m) { > > + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { > > + for (int i = 0; I< number_of_methods; i++) { > > + if (b->_methods[i] == m) { > > + b->_methods[i] = NULL; > > + } > > + } > > + } > > + // not found > > + } > > > > Based on the "not found" comment I assume you intended to do a return > > after NULLing out the method? > > Yes, you are right. I assume the array b->_methods will not have duplicate > entry. > > > > > Nit: need space after i in i< > > I will send updated webrev. > > Regards, > Shafi > > > > > --- > > > > 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) { > > 1883 loader_data->jmethod_ids()->clear_method(this); > > 1884 } > > > > Not sure why you felt the need to add a new member function for this > > instead of just doing line #1883 directly at line #113 > > > > --- > > > > >>> After this change I am seeing Method::is_method_id() is getting > > >>> called with NULL and I have done below change to avoid crash >>> > > >>> - assert(m != NULL, "should be called with non-null method"); >>> > > + if (m == > > NULL) { > > >>> + return false; > > >>> + } > > > > This is the only call I can see to is_method_id: > > > > 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) { > > 1871 if (mid == NULL) return NULL; > > 1872 if (!Method::is_method_id(mid)) { > > 1873 return NULL; > > 1874 } > > > > So I don't see how it can be being passed NULL. If it is then you have > > a problem! > > > > Thanks, > > David > > ----- > > > > > > > Regards, > > > Shafi > > > > > >> -----Original Message----- > > >> From: Coleen Phillimore > > >> Sent: Monday, July 25, 2016 5:52 PM > > >> To: hotspot-runtime-dev at openjdk.java.net > > >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: > > >> Crash in > > >> Method::checked_resolve_jmethod_id(_jmethodID*) > > >> > > >> > > >> This looks good. Was this a backport or is it still broken in 9? 
> > >> thanks, > > >> Coleen > > >> > > >> On 7/25/16 7:53 AM, Shafi Ahmad wrote: > > >>> Hi, > > >>> > > >>> Please review the small code change for bug: "JDK-8161144: Fix for > > >>> JDK- > > >> 8147451 failed: Crash in > > >> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev > > >>> > > >>> Summary: > > >>> Method::deallocate_contents() should clear 'this' from list of > > >>> Methods in > > >> JNIMethodBlock, similarly to clear_all_methods() does it, when > > >> class is unloaded. > > >>> After this change I am seeing Method::is_method_id() is getting > > >>> called with > > >> NULL and I have done below change to avoid crash > > >>> > > >>> - assert(m != NULL, "should be called with non-null method"); > > >>> + if (m == NULL) { > > >>> + return false; > > >>> + } > > >>> > > >>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ > > >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 > > >>> > > >>> Test: Run jprt > > >>> > > >>> Regards, > > >>> Shafi > > >> From david.holmes at oracle.com Mon Aug 1 07:04:59 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 1 Aug 2016 17:04:59 +1000 Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*) In-Reply-To: <6d84a70a-ff39-47de-884b-3be87d494e1d@default> References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6d84a70a-ff39-47de-884b-3be87d494e1d@default> Message-ID: <497ba83b-dd0b-5c65-ad4d-60d177fb4e5f@oracle.com> On 1/08/2016 4:48 PM, Shafi Ahmad wrote: > Hi David, > > Thanks for the review. 
> >> -----Original Message----- >> From: David Holmes >> Sent: Monday, August 01, 2016 5:32 AM >> To: Shafi Ahmad; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in >> Method::checked_resolve_jmethod_id(_jmethodID*) >> >> Hi Shafi, >> >> On 30/07/2016 1:10 AM, Shafi Ahmad wrote: >>> Hi All, >>> >>> Could I have 2nd Reviewer's review for this change, please? >> >> I didn't see you respond to Coleen's query re JDK9. If this is not applicable to >> JDK9 please add a 9-na label to the bug report. > > I replied to her mail, this is not reproducible in jdk9. I have updated the bug with 9-na label. > >> >> Looking at the code: >> >> + void clear_method(Method* m) { >> + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { >> + for (int i = 0; I< number_of_methods; i++) { >> + if (b->_methods[i] == m) { >> + b->_methods[i] = NULL; >> + } >> + } >> + } >> + // not found >> + } >> >> Based on the "not found" comment I assume you intended to do a return >> after NULLing out the method? > > Yes, you are right. I assume the array b->_methods will not have duplicate entry. > >> >> Nit: need space after i in i< > > I will send updated webrev. You did not respond to the rest of my comments. 
David ------ > Regards, > Shafi > >> >> --- >> >> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) { >> 1883 loader_data->jmethod_ids()->clear_method(this); >> 1884 } >> >> Not sure why you felt the need to add a new member function for this >> instead of just doing line #1883 directly at line #113 >> >> --- >> >> >>> After this change I am seeing Method::is_method_id() is getting >>> >> called with NULL and I have done below change to avoid crash >>> >>> - >> assert(m != NULL, "should be called with non-null method"); >>> + if (m == >> NULL) { >> >>> + return false; >> >>> + } >> >> This is the only call I can see to is_method_id: >> >> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) { >> 1871 if (mid == NULL) return NULL; >> 1872 if (!Method::is_method_id(mid)) { >> 1873 return NULL; >> 1874 } >> >> So I don't see how it can be being passed NULL. If it is then you have a >> problem! >> >> Thanks, >> David >> ----- >> >> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Coleen Phillimore >>>> Sent: Monday, July 25, 2016 5:52 PM >>>> To: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash >>>> in >>>> Method::checked_resolve_jmethod_id(_jmethodID*) >>>> >>>> >>>> This looks good. Was this a backport or is it still broken in 9? >>>> thanks, >>>> Coleen >>>> >>>> On 7/25/16 7:53 AM, Shafi Ahmad wrote: >>>>> Hi, >>>>> >>>>> Please review the small code change for bug: "JDK-8161144: Fix for >>>>> JDK- >>>> 8147451 failed: Crash in >>>> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev >>>>> >>>>> Summary: >>>>> Method::deallocate_contents() should clear 'this' from list of >>>>> Methods in >>>> JNIMethodBlock, similarly to clear_all_methods() does it, when class >>>> is unloaded. 
>>>>> After this change I am seeing Method::is_method_id() is getting >>>>> called with >>>> NULL and I have done below change to avoid crash >>>>> >>>>> - assert(m != NULL, "should be called with non-null method"); >>>>> + if (m == NULL) { >>>>> + return false; >>>>> + } >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ >>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 >>>>> >>>>> Test: Run jprt >>>>> >>>>> Regards, >>>>> Shafi >>>> From shafi.s.ahmad at oracle.com Mon Aug 1 08:47:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Mon, 1 Aug 2016 01:47:42 -0700 (PDT) Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*) In-Reply-To: References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> Message-ID: <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> Hi David, Sorry for my half mail. > -----Original Message----- > From: David Holmes > Sent: Monday, August 01, 2016 5:32 AM > To: Shafi Ahmad; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net > Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in > Method::checked_resolve_jmethod_id(_jmethodID*) > > Hi Shafi, > > On 30/07/2016 1:10 AM, Shafi Ahmad wrote: > > Hi All, > > > > Could I have 2nd Reviewer's review for this change, please? > > I didn't see you respond to Coleen's query re JDK9. If this is not applicable to > JDK9 please add a 9-na label to the bug report. > > Looking at the code: > > + void clear_method(Method* m) { > + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { > + for (int i = 0; i< number_of_methods; i++) { > + if (b->_methods[i] == m) { > + b->_methods[i] = NULL; > + } > + } > + } > + // not found > + } > > Based on the "not found" comment I assume you intended to do a return > after NULLing out the method? 
>
> Nit: need space after i in i<
>
> ---
>
> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) {
> 1883   loader_data->jmethod_ids()->clear_method(this);
> 1884 }
>
> Not sure why you felt the need to add a new member function for this
> instead of just doing line #1883 directly at line #113

Just to make it consistent with the existing method void Method::clear_jmethod_ids(ClassLoaderData* loader_data).

> ---
>
> >>> After this change I am seeing Method::is_method_id() is getting
> >>> called with NULL and I have done below change to avoid crash
> >>>
> >>> - assert(m != NULL, "should be called with non-null method");
> >>> + if (m == NULL) {
> >>> +   return false;
> >>> + }
>
> This is the only call I can see to is_method_id:

Yes, this is the only call.

>
> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) {
> 1871   if (mid == NULL) return NULL;
> 1872   if (!Method::is_method_id(mid)) {
> 1873     return NULL;
> 1874   }
>
> So I don't see how it can be being passed NULL. If it is then you have a
> problem!

Here the actual parameter 'mid' of is_method_id is not NULL, but is_method_id calls another method, resolve_jmethod_id(mid), which returns NULL, i.e. m becomes NULL in the code below.

1885 bool Method::is_method_id(jmethodID mid) {
1886   Method* m = resolve_jmethod_id(mid);
1887   if (m == NULL) {
1888     return false;
1889   }

Regards,
Shafi

>
> Thanks,
> David
> -----
>
> > Regards,
> > Shafi
> >
> >> -----Original Message-----
> >> From: Coleen Phillimore
> >> Sent: Monday, July 25, 2016 5:52 PM
> >> To: hotspot-runtime-dev at openjdk.java.net
> >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in
> >> Method::checked_resolve_jmethod_id(_jmethodID*)
> >>
> >>
> >> This looks good. Was this a backport or is it still broken in 9?
> >> thanks, > >> Coleen > >> > >> On 7/25/16 7:53 AM, Shafi Ahmad wrote: > >>> Hi, > >>> > >>> Please review the small code change for bug: "JDK-8161144: Fix for > >>> JDK- > >> 8147451 failed: Crash in > >> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev > >>> > >>> Summary: > >>> Method::deallocate_contents() should clear 'this' from list of > >>> Methods in > >> JNIMethodBlock, similarly to clear_all_methods() does it, when class > >> is unloaded. > >>> After this change I am seeing Method::is_method_id() is getting > >>> called with > >> NULL and I have done below change to avoid crash > >>> > >>> - assert(m != NULL, "should be called with non-null method"); > >>> + if (m == NULL) { > >>> + return false; > >>> + } > >>> > >>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ > >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 > >>> > >>> Test: Run jprt > >>> > >>> Regards, > >>> Shafi > >> From david.holmes at oracle.com Mon Aug 1 11:19:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 1 Aug 2016 21:19:24 +1000 Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*) In-Reply-To: <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> Message-ID: <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> Hi Shafi, On 1/08/2016 6:47 PM, Shafi Ahmad wrote: > Hi David, > > Sorry for my half mail. 
> >> -----Original Message----- >> From: David Holmes >> Sent: Monday, August 01, 2016 5:32 AM >> To: Shafi Ahmad; Coleen Phillimore; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in >> Method::checked_resolve_jmethod_id(_jmethodID*) >> >> Hi Shafi, >> >> On 30/07/2016 1:10 AM, Shafi Ahmad wrote: >>> Hi All, >>> >>> Could I have 2nd Reviewer's review for this change, please? >> >> I didn't see you respond to Coleen's query re JDK9. If this is not applicable to >> JDK9 please add a 9-na label to the bug report. >> >> Looking at the code: >> >> + void clear_method(Method* m) { >> + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { >> + for (int i = 0; i< number_of_methods; i++) { >> + if (b->_methods[i] == m) { >> + b->_methods[i] = NULL; >> + } >> + } >> + } >> + // not found >> + } >> >> Based on the "not found" comment I assume you intended to do a return >> after NULLing out the method? >> >> Nit: need space after i in i< >> >> --- >> >> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) { >> 1883 loader_data->jmethod_ids()->clear_method(this); >> 1884 } >> >> Not sure why you felt the need to add a new member function for this >> instead of just doing line #1883 directly at line #113 > > Just to make it consistent with the existing method like void Method::clear_jmethod_ids(ClassLoaderData* loader_data). I would not add to the API unnecessarily as it just makes the API harder to understand. >> --- >> >> >>> After this change I am seeing Method::is_method_id() is getting >>> >> called with NULL and I have done below change to avoid crash >>> >>> - >> assert(m != NULL, "should be called with non-null method"); >>> + if (m == >> NULL) { >> >>> + return false; >> >>> + } >> >> This is the only call I can see to is_method_id: > > Yes, this is the only call. 
> >> >> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) { >> 1871 if (mid == NULL) return NULL; >> 1872 if (!Method::is_method_id(mid)) { >> 1873 return NULL; >> 1874 } >> > > > >> So I don't see how it can be being passed NULL. If it is then you have a >> problem! > > Here actual parameter 'mid" of method is_method_id is not null but this we are calling another method resolve_jmethod_id(mid) which returns NULL i.e m becomes null in below code. > > 1885 bool Method::is_method_id(jmethodID mid) { > 1886 Method* m = resolve_jmethod_id(mid); > 1887 if (m == NULL) { > 1888 return false; > 1889 } Ah I see - sorry. Thanks for clarifying. David > > Regards, > Shafi > >> >> Thanks, >> David >> ----- >> >> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Coleen Phillimore >>>> Sent: Monday, July 25, 2016 5:52 PM >>>> To: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash >>>> in >>>> Method::checked_resolve_jmethod_id(_jmethodID*) >>>> >>>> >>>> This looks good. Was this a backport or is it still broken in 9? >>>> thanks, >>>> Coleen >>>> >>>> On 7/25/16 7:53 AM, Shafi Ahmad wrote: >>>>> Hi, >>>>> >>>>> Please review the small code change for bug: "JDK-8161144: Fix for >>>>> JDK- >>>> 8147451 failed: Crash in >>>> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev >>>>> >>>>> Summary: >>>>> Method::deallocate_contents() should clear 'this' from list of >>>>> Methods in >>>> JNIMethodBlock, similarly to clear_all_methods() does it, when class >>>> is unloaded. 
>>>>> After this change I am seeing Method::is_method_id() is getting >>>>> called with >>>> NULL and I have done below change to avoid crash >>>>> >>>>> - assert(m != NULL, "should be called with non-null method"); >>>>> + if (m == NULL) { >>>>> + return false; >>>>> + } >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ >>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 >>>>> >>>>> Test: Run jprt >>>>> >>>>> Regards, >>>>> Shafi >>>> From coleen.phillimore at oracle.com Mon Aug 1 12:31:14 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 1 Aug 2016 08:31:14 -0400 Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*) In-Reply-To: <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> Message-ID: <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com> On 8/1/16 7:19 AM, David Holmes wrote: > Hi Shafi, > > On 1/08/2016 6:47 PM, Shafi Ahmad wrote: >> Hi David, >> >> Sorry for my half mail. >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Monday, August 01, 2016 5:32 AM >>> To: Shafi Ahmad; Coleen Phillimore; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in >>> Method::checked_resolve_jmethod_id(_jmethodID*) >>> >>> Hi Shafi, >>> >>> On 30/07/2016 1:10 AM, Shafi Ahmad wrote: >>>> Hi All, >>>> >>>> Could I have 2nd Reviewer's review for this change, please? >>> >>> I didn't see you respond to Coleen's query re JDK9. If this is not >>> applicable to >>> JDK9 please add a 9-na label to the bug report. 
>>> >>> Looking at the code: >>> >>> + void clear_method(Method* m) { >>> + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { >>> + for (int i = 0; i< number_of_methods; i++) { >>> + if (b->_methods[i] == m) { >>> + b->_methods[i] = NULL; >>> + } >>> + } >>> + } >>> + // not found >>> + } >>> >>> Based on the "not found" comment I assume you intended to do a return >>> after NULLing out the method? >>> >>> Nit: need space after i in i< >>> >>> --- >>> >>> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) { >>> 1883 loader_data->jmethod_ids()->clear_method(this); >>> 1884 } >>> >>> Not sure why you felt the need to add a new member function for this >>> instead of just doing line #1883 directly at line #113 >> >> Just to make it consistent with the existing method like void >> Method::clear_jmethod_ids(ClassLoaderData* loader_data). > > I would not add to the API unnecessarily as it just makes the API > harder to understand. I think this API is good rather than telling Method::deallocate_contents that there's a jmethod_ids pointer in the class_loader data. If this changes for some reason, it would be good to change it together with the other APIs. Shafi, can you move this function down to below Method::set_on_stack? Then it'll be clearer that it belongs with Method::clear_jmethod_ids(). Thanks, Coleen > >>> --- >>> >>> >>> After this change I am seeing Method::is_method_id() is >>> getting >>> >>> called with NULL and I have done below change to avoid crash >>> >>> - >>> assert(m != NULL, "should be called with non-null method"); >>> + >>> if (m == >>> NULL) { >>> >>> + return false; >>> >>> + } >>> >>> This is the only call I can see to is_method_id: >> >> Yes, this is the only call. >> >>> >>> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid) { >>> 1871 if (mid == NULL) return NULL; >>> 1872 if (!Method::is_method_id(mid)) { >>> 1873 return NULL; >>> 1874 } >>> >> >> >> >>> So I don't see how it can be being passed NULL. 
If it is then you >>> have a >>> problem! >> >> Here actual parameter 'mid" of method is_method_id is not null but >> this we are calling another method resolve_jmethod_id(mid) which >> returns NULL i.e m becomes null in below code. >> >> 1885 bool Method::is_method_id(jmethodID mid) { >> 1886 Method* m = resolve_jmethod_id(mid); >> 1887 if (m == NULL) { >> 1888 return false; >> 1889 } > > Ah I see - sorry. Thanks for clarifying. > > David > >> >> Regards, >> Shafi >> >>> >>> Thanks, >>> David >>> ----- >>> >>> >>>> Regards, >>>> Shafi >>>> >>>>> -----Original Message----- >>>>> From: Coleen Phillimore >>>>> Sent: Monday, July 25, 2016 5:52 PM >>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash >>>>> in >>>>> Method::checked_resolve_jmethod_id(_jmethodID*) >>>>> >>>>> >>>>> This looks good. Was this a backport or is it still broken in 9? >>>>> thanks, >>>>> Coleen >>>>> >>>>> On 7/25/16 7:53 AM, Shafi Ahmad wrote: >>>>>> Hi, >>>>>> >>>>>> Please review the small code change for bug: "JDK-8161144: Fix for >>>>>> JDK- >>>>> 8147451 failed: Crash in >>>>> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev >>>>>> >>>>>> Summary: >>>>>> Method::deallocate_contents() should clear 'this' from list of >>>>>> Methods in >>>>> JNIMethodBlock, similarly to clear_all_methods() does it, when class >>>>> is unloaded. 
>>>>>> After this change I am seeing Method::is_method_id() is getting >>>>>> called with >>>>> NULL and I have done below change to avoid crash >>>>>> >>>>>> - assert(m != NULL, "should be called with non-null method"); >>>>>> + if (m == NULL) { >>>>>> + return false; >>>>>> + } >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/ >>>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 >>>>>> >>>>>> Test: Run jprt >>>>>> >>>>>> Regards, >>>>>> Shafi >>>>> From christoph.langer at sap.com Mon Aug 1 14:41:49 2016 From: christoph.langer at sap.com (Langer, Christoph) Date: Mon, 1 Aug 2016 14:41:49 +0000 Subject: RFR(XS): 8162869: Small fixes for AIX perf memory and attach listener Message-ID: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap> Hi, please review a very small fix in the AIX perf memory and the AIX attach listener: Bug: https://bugs.openjdk.java.net/browse/JDK-8162869 Change: http://cr.openjdk.java.net/~clanger/webrevs/8162869.1/ I also touched the perfMemory_*.cpp files of the other platforms to align indentation and some comments. Because of that I also need a sponsor to push the change. Thanks in advance and best regards, Christoph From dmitry.samersoff at oracle.com Mon Aug 1 15:32:35 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 1 Aug 2016 18:32:35 +0300 Subject: RFR(XS): 8162869: Small fixes for AIX perf memory and attach listener In-Reply-To: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap> References: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap> Message-ID: Christoph, Looks good to me. 
-Dmitry On 2016-08-01 17:41, Langer, Christoph wrote: > Hi, > > please review a very small fix in the AIX perf memory and the AIX attach listener: > Bug: https://bugs.openjdk.java.net/browse/JDK-8162869 > Change: http://cr.openjdk.java.net/~clanger/webrevs/8162869.1/ > > I also touched the perfMemory_*.cpp files of the other platforms to align indentation and some comments. Because of that I also need a sponsor to push the change. > > Thanks in advance and best regards, > Christoph > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.holmes at oracle.com Tue Aug 2 00:36:03 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 2 Aug 2016 10:36:03 +1000 Subject: RFR(XS): 8162869: Small fixes for AIX perf memory and attach listener In-Reply-To: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap> References: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap> Message-ID: Looks good Christoph, I will sponsor this for you. Thanks, David On 2/08/2016 12:41 AM, Langer, Christoph wrote: > Hi, > > please review a very small fix in the AIX perf memory and the AIX attach listener: > Bug: https://bugs.openjdk.java.net/browse/JDK-8162869 > Change: http://cr.openjdk.java.net/~clanger/webrevs/8162869.1/ > > I also touched the perfMemory_*.cpp files of the other platforms to align indentation and some comments. Because of that I also need a sponsor to push the change. 
>
> Thanks in advance and best regards,
> Christoph
>

From lois.foltan at oracle.com Tue Aug 2 00:40:48 2016
From: lois.foltan at oracle.com (Lois Foltan)
Date: Mon, 01 Aug 2016 20:40:48 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com>
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com>
Message-ID: <579FEC0F.1010407@oracle.com>

On 7/17/2016 7:05 PM, harold seigel wrote:
> Hi,
>
> Please review these Hotspot VM only changes to process the seven
> module-specific options that have been renamed to have gnu-like
> names. JDK changes for this bug will be reviewed separately.
>
> Descriptions of these options are here
> . For these six options,
> --module-path, --upgrade-module-path, --add-modules, --limit-modules,
> --add-reads, and --add-exports, the JVM just sets a system property.
> For the --patch-module option, the JVM sets a system property and then
> processes the option in the same way as when it was named -Xpatch.
>
> Additionally, the JVM now checks properties specified on the command
> line. If a property matches one of the properties used by one of the
> above options then the JVM ignores the property. This forces users to
> use the explicit option when wanting to do things like add a module or
> a package export.
>
> The RFR contains two new tests. Also, many existing tests were
> changed to use the new option names.
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930
>
> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/

Hi Harold,

Overall looks good. A couple of comments:

src/share/vm/prims/jvmtiEnv.cpp
- line #3428 - The if statement is incorrect. There are internal properties, like jdk.boot.class.path.append, whose value if non-null should be returned.

src/share/vm/runtime/arguments.cpp
- Arguments::append_to_addmods_property was added before the VM starting to process --add-modules.
So with this fix, it seems like it could be simply changed to:

bool Arguments::append_to_addmods_property(const char* module_name) {
  PropertyList_unique_add(&_system_properties,
                          Arguments::get_property("jdk.module.addmods"),
                          module_name,
                          AppendProperty, UnwriteableProperty, InternalProperty);
}

Please consider making this change since currently it contains a lot of duplicated code that is now unnecessary.

- line #3171, should the comment be "--add-modules=java.sql" instead of "--add-modules java.sql"?

Thanks,
Lois

>
> The changes were tested with the JCK lang and VM tests, the JTreg
> hotspot tests, and the RBT hotspot nightlies.
>
> Thanks, Harold

From david.holmes at oracle.com Tue Aug 2 03:29:40 2016
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Aug 2016 13:29:40 +1000
Subject: (XS) RFR: 7008747: Header files with conditional behaviour can not be precompiled
Message-ID: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-7008747

webrev: http://cr.openjdk.java.net/~dholmes/7008747/webrev/

A trivial clean up to ensure no confusion. The bug might be better described in the reverse sense: files that modify the behaviour of included headers must not use precompiled headers.

In this case the code looks like it is using PCH when in fact it will be disabled on this platform. Better to be 100% clear (like the PPC code involving the same definitions) and not include precompiled.hpp

If any zero folk see this then zero may also want a cleanup here as it defines DONT_USE_REGISTER_DEFINES, but doesn't seem to include and headers that are conditionalized on that value.

Thanks,
David

From shafi.s.ahmad at oracle.com Tue Aug 2 05:00:54 2016
From: shafi.s.ahmad at oracle.com (Shafi Ahmad)
Date: Mon, 1 Aug 2016 22:00:54 -0700 (PDT)
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com>
Message-ID: <2869f61c-32a4-4216-ae71-d1cdf2f2ef3c@default>

Hi Coleen,

Thanks for the review.

> -----Original Message-----
> From: Coleen Phillimore
> Sent: Monday, August 01, 2016 6:01 PM
> To: David Holmes; Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in
> Method::checked_resolve_jmethod_id(_jmethodID*)
>
>
>
> On 8/1/16 7:19 AM, David Holmes wrote:
> > Hi Shafi,
> >
> > On 1/08/2016 6:47 PM, Shafi Ahmad wrote:
> >> Hi David,
> >>
> >> Sorry for my half mail.
> >>
> >>> -----Original Message-----
> >>> From: David Holmes
> >>> Sent: Monday, August 01, 2016 5:32 AM
> >>> To: Shafi Ahmad; Coleen Phillimore;
> >>> hotspot-runtime-dev at openjdk.java.net
> >>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash
> >>> in
> >>> Method::checked_resolve_jmethod_id(_jmethodID*)
> >>>
> >>> Hi Shafi,
> >>>
> >>> On 30/07/2016 1:10 AM, Shafi Ahmad wrote:
> >>>> Hi All,
> >>>>
> >>>> Could I have 2nd Reviewer's review for this change, please?
> >>>
> >>> I didn't see you respond to Coleen's query re JDK9. If this is not
> >>> applicable to
> >>> JDK9 please add a 9-na label to the bug report.
> >>>
> >>> Looking at the code:
> >>>
> >>> + void clear_method(Method* m) {
> >>> +   for (JNIMethodBlock* b = this; b != NULL; b = b->_next) {
> >>> +     for (int i = 0; i< number_of_methods; i++) {
> >>> +       if (b->_methods[i] == m) {
> >>> +         b->_methods[i] = NULL;
> >>> +       }
> >>> +     }
> >>> +   }
> >>> +   // not found
> >>> + }
> >>>
> >>> Based on the "not found" comment I assume you intended to do a
> >>> return after NULLing out the method?
> >>>
> >>> Nit: need space after i in i<
> >>>
> >>> ---
> >>>
> >>> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) {
> >>> 1883   loader_data->jmethod_ids()->clear_method(this);
> >>> 1884 }
> >>>
> >>> Not sure why you felt the need to add a new member function for this
> >>> instead of just doing line #1883 directly at line #113
> >>
> >> Just to make it consistent with the existing method like void
> >> Method::clear_jmethod_ids(ClassLoaderData* loader_data).
> >
> > I would not add to the API unnecessarily as it just makes the API
> > harder to understand.
>
> I think this API is good rather than telling Method::deallocate_contents
> that there's a jmethod_ids pointer in the class_loader data. If this
> changes for some reason, it would be good to change it together with the
> other APIs.
>
> Shafi, can you move this function down to below Method::set_on_stack?
> Then it'll be clearer that it belongs with Method::clear_jmethod_ids().

I will do it and send the updated webrev.

Regards,
Shafi

> Thanks,
> Coleen
> >
> >>> ---
> >>>
> >>> >>> After this change I am seeing Method::is_method_id() is getting
> >>> >>> called with NULL and I have done below change to avoid crash >>>
> >>> >>> - assert(m != NULL, "should be called with non-null method");
> >>> >>> + if (m ==
> >>> NULL) {
> >>> >>> + return false;
> >>> >>> + }
> >>>
> >>> This is the only call I can see to is_method_id:
> >>
> >> Yes, this is the only call.
> >>
> >>>
> >>> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID mid)
> {
> >>> 1871   if (mid == NULL) return NULL;
> >>> 1872   if (!Method::is_method_id(mid)) {
> >>> 1873     return NULL;
> >>> 1874   }
> >>>
> >>
> >>
> >>
> >>> So I don't see how it can be being passed NULL. If it is then you
> >>> have a problem!
> >>
> >> Here actual parameter 'mid" of method is_method_id is not null but
> >> this we are calling another method resolve_jmethod_id(mid) which
> >> returns NULL i.e m becomes null in below code.
> >>
> >> 1885 bool Method::is_method_id(jmethodID mid) {
> >> 1886   Method* m = resolve_jmethod_id(mid);
> >> 1887   if (m == NULL) {
> >> 1888     return false;
> >> 1889   }
> >
> > Ah I see - sorry. Thanks for clarifying.
> >
> > David
> >
> >>
> >> Regards,
> >> Shafi
> >>
> >>>
> >>> Thanks,
> >>> David
> >>> -----
> >>>
> >>>
> >>>> Regards,
> >>>> Shafi
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Coleen Phillimore
> >>>>> Sent: Monday, July 25, 2016 5:52 PM
> >>>>> To: hotspot-runtime-dev at openjdk.java.net
> >>>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed:
> >>>>> Crash in
> >>>>> Method::checked_resolve_jmethod_id(_jmethodID*)
> >>>>>
> >>>>>
> >>>>> This looks good. Was this a backport or is it still broken in 9?
> >>>>> thanks,
> >>>>> Coleen
> >>>>>
> >>>>> On 7/25/16 7:53 AM, Shafi Ahmad wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Please review the small code change for bug: "JDK-8161144: Fix
> >>>>>> for
> >>>>>> JDK-
> >>>>> 8147451 failed: Crash in
> >>>>> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u-dev
> >>>>>>
> >>>>>> Summary:
> >>>>>> Method::deallocate_contents() should clear 'this' from list of
> >>>>>> Methods in
> >>>>> JNIMethodBlock, similarly to clear_all_methods() does it, when
> >>>>> class is unloaded.
> >>>>>> After this change I am seeing Method::is_method_id() is getting
> >>>>>> called with
> >>>>> NULL and I have done below change to avoid crash
> >>>>>>
> >>>>>> - assert(m != NULL, "should be called with non-null method");
> >>>>>> + if (m == NULL) {
> >>>>>> +   return false;
> >>>>>> + }
> >>>>>>
> >>>>>> Webrev: http://cr.openjdk.java.net/~shshahma/8161144/webrev/
> >>>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144
> >>>>>>
> >>>>>> Test: Run jprt
> >>>>>>
> >>>>>> Regards,
> >>>>>> Shafi
> >>>>>
>

From shafi.s.ahmad at oracle.com Tue Aug 2 05:19:00 2016
From: shafi.s.ahmad at oracle.com (Shafi Ahmad)
Date: Mon, 1 Aug 2016 22:19:00 -0700 (PDT)
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com>
Message-ID: <48f53c21-20c4-463a-a445-b92e05cc2acc@default>

Hi,

Please find updated webrev.

http://cr.openjdk.java.net/~shshahma/8161144/webrev.02/

Regards,
Shafi

From christoph.langer at sap.com Tue Aug 2 05:23:08 2016
From: christoph.langer at sap.com (Langer, Christoph)
Date: Tue, 2 Aug 2016 05:23:08 +0000
Subject: RFR(XS): 8162869: Small fixes for AIX perf memory and attach listener
In-Reply-To:
References: <96f980da5ca3459d8865fa9cdda24bf1@DEWDFE13DE11.global.corp.sap>
Message-ID:

Thanks, David and Dmitry.

> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Dienstag, 2. August 2016 02:36
> To: Langer, Christoph ; hotspot-runtime-
> dev at openjdk.java.net
> Cc: ppc-aix-port-dev at openjdk.java.net
> Subject: Re: RFR(XS): 8162869: Small fixes for AIX perf memory and attach
> listener
>
> Looks good Christoph, I will sponsor this for you.
>
> Thanks,
> David
>
> On 2/08/2016 12:41 AM, Langer, Christoph wrote:
> > Hi,
> >
> > please review a very small fix in the AIX perf memory and the AIX attach
> listener:
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8162869
> > Change: http://cr.openjdk.java.net/~clanger/webrevs/8162869.1/
> >
> > I also touched the perfMemory_*.cpp files of the other platforms to align
> indentation and some comments. Because of that I also need a sponsor to push
> the change.
> >
> > Thanks in advance and best regards,
> > Christoph
> >

From david.holmes at oracle.com Tue Aug 2 05:34:17 2016
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Aug 2016 15:34:17 +1000
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <48f53c21-20c4-463a-a445-b92e05cc2acc@default>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com> <48f53c21-20c4-463a-a445-b92e05cc2acc@default>
Message-ID: <6eb19b7f-835b-5284-f3fd-0c7d8a747cbe@oracle.com>

Hi Shafi,

No further comments from me.

Thanks,
David

On 2/08/2016 3:19 PM, Shafi Ahmad wrote:
> Hi,
>
> Please find updated webrev.
>
> http://cr.openjdk.java.net/~shshahma/8161144/webrev.02/
>
> Regards,
> Shafi

From vladimir.kozlov at oracle.com Tue Aug 2 05:37:36 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 1 Aug 2016 22:37:36 -0700
Subject: (XS) RFR: 7008747: Header files with conditional behaviour can not be precompiled
In-Reply-To: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com>
References: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com>
Message-ID: <6b936782-c658-128b-6e03-2a422d7ae763@oracle.com>

Looks good.

Thanks,
Vladimir

On 8/1/16 8:29 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-7008747
>
> webrev: http://cr.openjdk.java.net/~dholmes/7008747/webrev/
>
> A trivial clean up to ensure no confusion. The bug might be better described in the reverse sense: files that modify the
> behaviour of included headers must not use precompiled headers.
>
> In this case the code looks like it is using PCH when in fact it will be disabled on this platform.
Better to be 100%
> clear (like the PPC code involving the same definitions) and not include precompiled.hpp
>
> If any zero folk see this then zero may also want a cleanup here as it defines DONT_USE_REGISTER_DEFINES, but doesn't
> seem to include and headers that are conditionalized on that value.
>
> Thanks,
> David

From david.holmes at oracle.com Tue Aug 2 06:10:03 2016
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Aug 2016 16:10:03 +1000
Subject: (XS) RFR: 7008747: Header files with conditional behaviour can not be precompiled
In-Reply-To: <6b936782-c658-128b-6e03-2a422d7ae763@oracle.com>
References: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com> <6b936782-c658-128b-6e03-2a422d7ae763@oracle.com>
Message-ID: <02aa6dac-ae18-0a05-0454-04601b4414d2@oracle.com>

Thanks Vladimir!

David

On 2/08/2016 3:37 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 8/1/16 8:29 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-7008747
>>
>> webrev: http://cr.openjdk.java.net/~dholmes/7008747/webrev/
>>
>> A trivial clean up to ensure no confusion. The bug might be better
>> described in the reverse sense: files that modify the
>> behaviour of included headers must not use precompiled headers.
>>
>> In this case the code looks like it is using PCH when in fact it will
>> be disabled on this platform. Better to be 100%
>> clear (like the PPC code involving the same definitions) and not
>> include precompiled.hpp
>>
>> If any zero folk see this then zero may also want a cleanup here as it
>> defines DONT_USE_REGISTER_DEFINES, but doesn't
>> seem to include and headers that are conditionalized on that value.
>>
>> Thanks,
>> David

From harold.seigel at oracle.com Tue Aug 2 13:25:01 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Tue, 2 Aug 2016 09:25:01 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To: <579FEC0F.1010407@oracle.com>
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com>
Message-ID:

Hi Lois,

Thanks for the review. Please see comments in-line.

Harold

On 8/1/2016 8:40 PM, Lois Foltan wrote:
>
> On 7/17/2016 7:05 PM, harold seigel wrote:
>> Hi,
>>
>> Please review these Hotspot VM only changes to process the seven
>> module-specific options that have been renamed to have gnu-like
>> names. JDK changes for this bug will be reviewed separately.
>>
>> Descriptions of these options are here
>> . For these six options,
>> --module-path, --upgrade-module-path, --add-modules, --limit-modules,
>> --add-reads, and --add-exports, the JVM just sets a system property.
>> For the --patch-module option, the JVM sets a system property and
>> then processes the option in the same way as when it was named -Xpatch.
>>
>> Additionally, the JVM now checks properties specified on the command
>> line. If a property matches one of the properties used by one of the
>> above options then the JVM ignores the property. This forces users to
>> use the explicit option when wanting to do things like add a module
>> or a package export.
>>
>> The RFR contains two new tests. Also, many existing tests were
>> changed to use the new option names.
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930
>>
>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/
>
> Hi Harold,
>
> Overall looks good. A couple of comments:
>
> src/share/vm/prims/jvmtiEnv.cpp
> - line #3428 - The if statement is incorrect. There are internal
> properties, like jdk.boot.class.path.append, whose value if non-null
> should be returned.

This code will be reworked in the next version of these changes because of multiple issues.

>
> src/share/vm/runtime/arguments.cpp
> - Arguments::append_to_addmods_property was added before the VM
> starting to process --add-modules. So with this fix, it seems like it
> could be simply changed to:
>
> bool Arguments::append_to_addmods_property(const char* module_name) {
>   PropertyList_unique_add(&_system_properties,
>                           Arguments::get_property("jdk.module.addmods"),
>                           module_name,
>                           AppendProperty, UnwriteableProperty, InternalProperty);
> }
>
> Please consider making this change since currently it contains a lot
> of duplicated code that is now unnecessary.

The one difference is that append_to_addmods_property() returns a status but PropertyList_unique_add() does not. I'll look into this a bit further.

>
> - line #3171, should the comment be "--add-modules=java.sql" instead
> of "--add-modules java.sql"?

yes.

The changes suggested by you, Coleen, and Dan will be in the next version of this webrev.

Thanks, Harold

>
> Thanks,
> Lois
>
>>
>> The changes were tested with the JCK lang and VM tests, the JTreg
>> hotspot tests, and the RBT hotspot nightlies.
>>
>> Thanks, Harold
>

From gerald.thornbrugh at oracle.com Tue Aug 2 13:49:03 2016
From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh)
Date: Tue, 02 Aug 2016 07:49:03 -0600
Subject: (XS) RFR: 7008747: Header files with conditional behaviour can not be precompiled
In-Reply-To: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com>
References: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com>
Message-ID: <57A0A4CF.2020504@oracle.com>

Hi David,

Your changes look good.

Jerry

> Bug: https://bugs.openjdk.java.net/browse/JDK-7008747
>
> webrev: http://cr.openjdk.java.net/~dholmes/7008747/webrev/
>
> A trivial clean up to ensure no confusion. The bug might be better
> described in the reverse sense: files that modify the behaviour of
> included headers must not use precompiled headers.
>
> In this case the code looks like it is using PCH when in fact it will
> be disabled on this platform. Better to be 100% clear (like the PPC
> code involving the same definitions) and not include precompiled.hpp
>
> If any zero folk see this then zero may also want a cleanup here as it
> defines DONT_USE_REGISTER_DEFINES, but doesn't seem to include and
> headers that are conditionalized on that value.
>
> Thanks,
> David

From coleen.phillimore at oracle.com Tue Aug 2 14:52:50 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 2 Aug 2016 10:52:50 -0400
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <48f53c21-20c4-463a-a445-b92e05cc2acc@default>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com> <48f53c21-20c4-463a-a445-b92e05cc2acc@default>
Message-ID: <819de2c9-e1e3-84ee-6781-8c3054021319@oracle.com>

I think this looks good. Thank you!

Coleen

On 8/2/16 1:19 AM, Shafi Ahmad wrote:
> Hi,
>
> Please find updated webrev.
>
> http://cr.openjdk.java.net/~shshahma/8161144/webrev.02/
>
> Regards,
> Shafi

From shafi.s.ahmad at oracle.com Tue Aug 2 17:36:01 2016
From: shafi.s.ahmad at oracle.com (Shafi Ahmad)
Date: Tue, 2 Aug 2016 10:36:01 -0700 (PDT)
Subject: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in Method::checked_resolve_jmethod_id(_jmethodID*)
In-Reply-To: <819de2c9-e1e3-84ee-6781-8c3054021319@oracle.com>
References: <4971842d-29db-4116-ac50-e506dcc14dc7@default> <1b64e628-c40a-dcd4-e41f-34b96d5a64b1@oracle.com> <1c8198e3-0080-4f52-aa24-ef44751d7e71@default> <6e11ca09-79af-4aad-bbf4-4a6f558da237@default> <9f4901d8-d752-ed17-67b0-878cc0881b6e@oracle.com> <8e31babf-b712-4baf-e579-e98a4c1d93a1@oracle.com> <48f53c21-20c4-463a-a445-b92e05cc2acc@default> <819de2c9-e1e3-84ee-6781-8c3054021319@oracle.com>
Message-ID: <778adfe1-ebc0-484a-8789-48235bff372e@default>

Thanks Coleen and David for the review.
Regards, Shafi > -----Original Message----- > From: Coleen Phillimore > Sent: Tuesday, August 02, 2016 8:23 PM > To: Shafi Ahmad; David Holmes; hotspot-runtime-dev at openjdk.java.net > Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash in > Method::checked_resolve_jmethod_id(_jmethodID*) > > > I think this looks good. Thank you! > Coleen > > On 8/2/16 1:19 AM, Shafi Ahmad wrote: > > Hi, > > > > Please find updated webrev. > > > > http://cr.openjdk.java.net/~shshahma/8161144/webrev.02/ > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Coleen Phillimore > >> Sent: Monday, August 01, 2016 6:01 PM > >> To: David Holmes; Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net > >> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: Crash > >> in > >> Method::checked_resolve_jmethod_id(_jmethodID*) > >> > >> > >> > >> On 8/1/16 7:19 AM, David Holmes wrote: > >>> Hi Shafi, > >>> > >>> On 1/08/2016 6:47 PM, Shafi Ahmad wrote: > >>>> Hi David, > >>>> > >>>> Sorry for my half mail. > >>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Monday, August 01, 2016 5:32 AM > >>>>> To: Shafi Ahmad; Coleen Phillimore; > >>>>> hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: > >>>>> Crash in > >>>>> Method::checked_resolve_jmethod_id(_jmethodID*) > >>>>> > >>>>> Hi Shafi, > >>>>> > >>>>> On 30/07/2016 1:10 AM, Shafi Ahmad wrote: > >>>>>> Hi All, > >>>>>> > >>>>>> Could I have 2nd Reviewer's review for this change, please? > >>>>> I didn't see you respond to Coleen's query re JDK9. If this is not > >>>>> applicable to > >>>>> JDK9 please add a 9-na label to the bug report. 
> >>>>> > >>>>> Looking at the code: > >>>>> > >>>>> + void clear_method(Method* m) { > >>>>> + for (JNIMethodBlock* b = this; b != NULL; b = b->_next) { > >>>>> + for (int i = 0; i< number_of_methods; i++) { > >>>>> + if (b->_methods[i] == m) { > >>>>> + b->_methods[i] = NULL; > >>>>> + } > >>>>> + } > >>>>> + } > >>>>> + // not found > >>>>> + } > >>>>> > >>>>> Based on the "not found" comment I assume you intended to do a > >>>>> return after NULLing out the method? > >>>>> > >>>>> Nit: need space after i in i< > >>>>> > >>>>> --- > >>>>> > >>>>> 1882 void Method::clear_jmethod_id(ClassLoaderData* loader_data) > { > >>>>> 1883 loader_data->jmethod_ids()->clear_method(this); > >>>>> 1884 } > >>>>> > >>>>> Not sure why you felt the need to add a new member function for > >>>>> this instead of just doing line #1883 directly at line #113 > >>>> Just to make it consistent with the existing method like void > >>>> Method::clear_jmethod_ids(ClassLoaderData* loader_data). > >>> I would not add to the API unnecessarily as it just makes the API > >>> harder to understand. > >> I think this API is good rather than telling Method::deallocate_contents > >> that there's a jmethod_ids pointer in the class_loader data. If this > >> changes for some reason, it would be good to change it together with > >> the other APIs. > >> > >> Shafi, can you move this function down to below Method::set_on_stack? > >> Then it'll be clearer that it belongs with Method::clear_jmethod_ids(). > >> > >> Thanks, > >> Coleen > >>>>> --- > >>>>> > >>>>> >>> After this change I am seeing Method::is_method_id() is > >>>>> getting > >>>>>>>> called with NULL and I have done below change to avoid crash > >>>>>>>> >>> > >>>>>>>> - assert(m != NULL, "should be called with non-null method"); > >>>>>>>> + if (m == > >>>>> NULL) { > >>>>> >>> + return false; > >>>>> >>> + } > >>>>> > >>>>> This is the only call I can see to is_method_id: > >>>> Yes, this is the only call. 
> >>>> > >>>>> 1870 Method* Method::checked_resolve_jmethod_id(jmethodID > mid) > >> { > >>>>> 1871 if (mid == NULL) return NULL; > >>>>> 1872 if (!Method::is_method_id(mid)) { > >>>>> 1873 return NULL; > >>>>> 1874 } > >>>>> > >>>> > >>>> > >>>>> So I don't see how it can be being passed NULL. If it is then you > >>>>> have a problem! > >>>> Here actual parameter 'mid" of method is_method_id is not null > >>>> but this we are calling another method resolve_jmethod_id(mid) > >>>> which returns NULL i.e m becomes null in below code. > >>>> > >>>> 1885 bool Method::is_method_id(jmethodID mid) { > >>>> 1886 Method* m = resolve_jmethod_id(mid); > >>>> 1887 if (m == NULL) { > >>>> 1888 return false; > >>>> 1889 } > >>> Ah I see - sorry. Thanks for clarifying. > >>> > >>> David > >>> > >>>> Regards, > >>>> Shafi > >>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>> > >>>>>> Regards, > >>>>>> Shafi > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Coleen Phillimore > >>>>>>> Sent: Monday, July 25, 2016 5:52 PM > >>>>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: [8u] RFR JDK-8161144: Fix for JDK-8147451 failed: > >>>>>>> Crash in > >>>>>>> Method::checked_resolve_jmethod_id(_jmethodID*) > >>>>>>> > >>>>>>> > >>>>>>> This looks good. Was this a backport or is it still broken in 9? > >>>>>>> thanks, > >>>>>>> Coleen > >>>>>>> > >>>>>>> On 7/25/16 7:53 AM, Shafi Ahmad wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Please review the small code change for bug: "JDK-8161144: Fix > >>>>>>>> for > >>>>>>>> JDK- > >>>>>>> 8147451 failed: Crash in > >>>>>>> Method::checked_resolve_jmethod_id(_jmethodID*)" on jdk8u- > dev > >>>>>>>> Summary: > >>>>>>>> Method::deallocate_contents() should clear 'this' from list of > >>>>>>>> Methods in > >>>>>>> JNIMethodBlock, similarly to clear_all_methods() does it, when > >>>>>>> class is unloaded. 
> >>>>>>>> After this change I am seeing Method::is_method_id() is getting > >>>>>>>> called with > >>>>>>> NULL and I have done below change to avoid crash > >>>>>>>> - assert(m != NULL, "should be called with non-null method"); > >>>>>>>> + if (m == NULL) { > >>>>>>>> + return false; > >>>>>>>> + } > >>>>>>>> > >>>>>>>> Webrev: > http://cr.openjdk.java.net/~shshahma/8161144/webrev/ > >>>>>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8161144 > >>>>>>>> > >>>>>>>> Test: Run jprt > >>>>>>>> > >>>>>>>> Regards, > >>>>>>>> Shafi > From george.triantafillou at oracle.com Tue Aug 2 18:49:55 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 2 Aug 2016 14:49:55 -0400 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip Message-ID: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> Please review this small change to fix test failures in two JDI tests: JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ Thanks to Tim Bell for his assistance. The fix replaces the use of javax.rmi.CORBA.Util with String.format. See this comment for details. Tested locally on Linux with latest version of jtreg. -George From daniel.daugherty at oracle.com Tue Aug 2 19:50:22 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 2 Aug 2016 13:50:22 -0600 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> Message-ID: Adding serviceability-dev at ... alias since these are JDI tests... 
Dan On 8/2/16 12:49 PM, George Triantafillou wrote: > Please review this small change to fix test failures in two JDI tests: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 > Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ > > > Thanks to Tim Bell for his assistance. The fix replaces the use of > javax.rmi.CORBA.Util with String.format. See this comment > > for details. > > Tested locally on Linux with latest version of jtreg. > > -George > From david.holmes at oracle.com Tue Aug 2 20:21:22 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 06:21:22 +1000 Subject: (XS) RFR: 7008747: Header files with conditional behaviour can not be precompiled In-Reply-To: <57A0A4CF.2020504@oracle.com> References: <1cdebdd2-6f4b-9b35-f93f-68cd7e7a3600@oracle.com> <57A0A4CF.2020504@oracle.com> Message-ID: Thanks Jerry. David On 2/08/2016 11:49 PM, Gerald Thornbrugh wrote: > Hi David, > > Your changes look good. > > Jerry >> Bug: https://bugs.openjdk.java.net/browse/JDK-7008747 >> >> webrev: http://cr.openjdk.java.net/~dholmes/7008747/webrev/ >> >> A trivial clean up to ensure no confusion. The bug might be better >> described in the reverse sense: files that modify the behaviour of >> included headers must not use precompiled headers. >> >> In this case the code looks like it is using PCH when in fact it will >> be disabled on this platform. Better to be 100% clear (like the PPC >> code involving the same definitions) and not include precompiled.hpp >> >> If any zero folk see this then zero may also want a cleanup here as it >> defines DONT_USE_REGISTER_DEFINES, but doesn't seem to include and >> headers that are conditionalized on that value. 
>> >> Thanks, >> David > From christian.tornqvist at oracle.com Tue Aug 2 20:31:16 2016 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Tue, 2 Aug 2016 16:31:16 -0400 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> Message-ID: <4bda01d1ecfc$d2662bc0$77328340$@oracle.com> Hi George, Looks good, after your explanation made to me offline that it's not the java.lang.String class that we expect to be loaded @75. Thanks, Christian -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of George Triantafillou Sent: Tuesday, August 2, 2016 2:50 PM To: hotspot-runtime-dev at openjdk.java.net Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip Please review this small change to fix test failures in two JDI tests: JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ Thanks to Tim Bell for his assistance. The fix replaces the use of javax.rmi.CORBA.Util with String.format. See this comment for details. Tested locally on Linux with latest version of jtreg. 
-George From chris.plummer at oracle.com Tue Aug 2 20:31:28 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 2 Aug 2016 13:31:28 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup Message-ID: Hello, Please review the following: webrev: http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ Bugs fixed: JDK-8133749: os::current_frame() is not returning the proper frame on ARM and solaris-x64 https://bugs.openjdk.java.net/browse/JDK-8133749 JDK-8133747: NMT includes an extra stack frame due to assumption NMT is making on tail calls being used https://bugs.openjdk.java.net/browse/JDK-8133747 JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes NativeCallStack::NativeCallStack() frame in backtrace https://bugs.openjdk.java.net/browse/JDK-8133740 The above bugs all result in NMT detail stack traces including extra frames. Certain frames are supposed to be skipped, but sometimes are not. The frames that show up are: NativeCallStack::NativeCallStack os::get_native_stack These are both methods used to generate the stack trace, and therefore should not be included in it. However, under some (most) circumstances, they were. Also, there was no test to make sure that any NMT detail output is generated, or that it is correct. I've added one with this webrev. Of the 27 possible builds (9 platforms * 3 build flavors), only 9 initially passed this new test. They were the product and fastdebug builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug builds for solaris-x64, windows-x86, and windows-x64. All the rest failed. They now all pass with my fixes in place. Here's a summary of the changes: src/os/posix/vm/os_posix.cpp src/os/windows/vm/os_windows.cpp JDK-8133747 fixes: There was some frame skipping logic here which was sort of correct, but was misplaced.
There are no extra frames being added in os::get_native_stack() due to lack of inlining or lack of a tail call, so no need for toSkip++ here. The logic has been moved to NativeCallStack::NativeCallStack, which is where the tail call is (sometimes) made, and also corrected (see nativeCallStack.cpp below). src/share/vm/utilities/nativeCallStack.cpp JDK-8133747 fixes: The frame skipping logic that was moved here assumed that NativeCallStack::NativeCallStack would not appear in the call stack (due to a tail call being used to call os::get_native_stack) except in slow debug builds. However, some platforms also don't use a tail call even when optimized. From what I can tell that is the case for 32-bit platforms and for windows. src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp src/os_cpu/windows_x86/vm/os_windows_x86.cpp src/os_cpu/linux_x86/vm/os_linux_x86.cpp JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip one extra frame. src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp JDK-8133749 fixes: os::current_frame() was not consistent with other platforms and needs to skip one more frame. This means it returns the frame for the caller's caller. So when called by os::get_native_stack(), it returns the frame for whoever called os::get_native_stack(). Although not intuitive, this is what os::get_native_stack() expects. Probably a method rename and/or a behavior change is justified here, but I would prefer to do that with a followup CR if anyone has a good suggestion on what to do. test/runtime/NMT/CheckForProperDetailStackTrace.java This is the new NMT detail test. It checks for frames that shouldn't be present and validates that at least one stack trace is what is expected. I verified that the above test now passes on all supported platforms, and also did a full jprt "-testset hotspot" run. I plan on doing some RBT testing with NMT detail enabled before committing.
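The first check described for the new test above (no stack-walking frames in the NMT detail output) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in CheckForProperDetailStackTrace.java: the helper name `find_unexpected_frames` is hypothetical, and the check is assumed to be a plain substring search over the captured output. The two frame names are the ones listed earlier in this mail.

```cpp
#include <string>
#include <vector>

// Frames that belong to the stack-walking machinery itself and so should
// never appear in an NMT detail stack trace (names taken from the mail above).
static const char* const kFramesToSkip[] = {
    "NativeCallStack::NativeCallStack",
    "os::get_native_stack",
};

// Hypothetical helper (not part of HotSpot or the real test): returns every
// forbidden frame name that occurs anywhere in the captured NMT output.
std::vector<std::string> find_unexpected_frames(const std::string& nmt_output) {
    std::vector<std::string> found;
    for (const char* frame : kFramesToSkip) {
        // A plain substring search is assumed to be sufficient here.
        if (nmt_output.find(frame) != std::string::npos) {
            found.push_back(frame);
        }
    }
    return found;
}
```

An empty result would mean that neither helper frame leaked into the trace; the output to scan would come from the `-XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics` run.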
Regarding the community contributed ports that Oracle does not support, I didn't make any changes there, but it looks like some of these bugs do exist there. Notably: -linux-aarch64: Looks like it suffers from JDK-8133740. The changes made to os_linux_x86.cpp should also be applied here. -linux-ppc: Hard to say for sure since the implementation of os::current_frame is different than others, but it looks to me like it suffers from both JDK-8133749 and JDK-8133740. -aix-ppc: Looks to be the same implementation as linux-ppc, so would need the same changes. These ports may also be suffering from JDK-8133747, but that fix is in shared code (nativeCallStack.cpp). My changes there will need some tweaking for these ports if they don't use a tail call to call os::get_native_stack(). If the maintainers of these ports could send me some NMT detail output, I can advise better on what changes are needed. Then you can implement and test them, and then send them back to me and I'll include them with my changes. What I need is the following command run on product and slowdebug builds. Initially run without any of my changes applied. If needed I may follow up with a request that they be run with the changes applied: bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version thanks, Chris From coleen.phillimore at oracle.com Tue Aug 2 22:34:33 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 2 Aug 2016 18:34:33 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS Message-ID: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> Summary: Add AlwaysRestoreFPU and move the test to jtreg. Contributed-by: myself and christian.tornqvist at oracle.com Christian moved the test, I did some cleanup and added AlwaysRestoreFPU to it. open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8079562 Tested with JPRT.
thanks, Coleen From ioi.lam at oracle.com Tue Aug 2 22:46:58 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 02 Aug 2016 15:46:58 -0700 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> Message-ID: <57A122E2.4020900@oracle.com> Looks good. Thanks for fixing it. - Ioi On 8/2/16 3:34 PM, Coleen Phillimore wrote: > Summary: Add AlwaysRestoreFPU and move the test to jtreg. > Contributed-by: myself and christian.tornqvist at oracle.com > > Christian moved the test, I did some cleanup and added > AlwaysRestoreFPU to it. > > open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8079562 > > Tested with JPRT. > > thanks, > Coleen From coleen.phillimore at oracle.com Wed Aug 3 00:09:49 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 2 Aug 2016 20:09:49 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <57A122E2.4020900@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> <57A122E2.4020900@oracle.com> Message-ID: <09c7ccd8-07ac-5472-1917-3102c5d98bca@oracle.com> Thanks, Ioi! Coleen On 8/2/16 6:46 PM, Ioi Lam wrote: > Looks good. Thanks for fixing it. > > - Ioi > > On 8/2/16 3:34 PM, Coleen Phillimore wrote: >> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >> Contributed-by: myself and christian.tornqvist at oracle.com >> >> Christian moved the test, I did some cleanup and added >> AlwaysRestoreFPU to it. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >> >> Tested with JPRT. 
>> >> thanks, >> Coleen > From david.holmes at oracle.com Wed Aug 3 00:35:04 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 10:35:04 +1000 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> Message-ID: Hi Coleen, On 3/08/2016 8:34 AM, Coleen Phillimore wrote: > Summary: Add AlwaysRestoreFPU and move the test to jtreg. > Contributed-by: myself and christian.tornqvist at oracle.com > > Christian moved the test, I did some cleanup and added AlwaysRestoreFPU > to it. You modified the original C code such that this comment is no longer valid: 33 // Only valid for specific x86 based systems: linux-x86, or else x86 with fpu_control.h Not sure why you did that - we were running this test on more than just Windows, despite the naming. I'm also unclear exactly what changed such that we need to add AlwaysRestoreFPU ?? Can you add @Summary info to the Java test (and/or other commentary) as it is far from clear exactly what this is testing. Thanks, David > open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8079562 > > Tested with JPRT. > > thanks, > Coleen From david.holmes at oracle.com Wed Aug 3 01:13:40 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 11:13:40 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread Message-ID: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ bug: https://bugs.openjdk.java.net/browse/JDK-8159461 The suspend/resume signal (SR_signum) is never sent to a thread once it has started to terminate. 
On one platform (SuSE 12) we have seen what appears to be a "stuck" signal, which is only delivered when the terminating thread restores its original signal mask (as if pthread_sigmask makes the system realize there is a pending signal - we already check the signal was not blocked). At this point in the thread termination we have freed the osthread, so the SR_handler would access deallocated memory. In debug builds we first hit an assertion that the current thread is a JavaThread or the VMThread - that assertion fails, even though it is a JavaThread, because we have already executed the ~JavaThread destructor and inside the ~Thread destructor we are a plain Thread not a JavaThread. The fix was to make a small adjustment to the thread termination process so that we delete the SR_lock before calling os::free_thread(). In the SR_handler() we can then use a NULL check of SR_lock() to indicate the thread has terminated and we return. While only seen on Linux I took the opportunity to apply the fix on all platforms and also cleaned up the code where we were using Thread::current() unsafely in a signal-handling context. Testing: regular tier 1 (JPRT) Kitchensink (in progress) As we can't readily reproduce the problem I tested this by having a terminating thread raise SR_signum directly from within the ~Thread destructor. Thanks, David From coleen.phillimore at oracle.com Wed Aug 3 01:16:42 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 2 Aug 2016 21:16:42 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> Message-ID: On 8/2/16 8:35 PM, David Holmes wrote: > Hi Coleen, > > On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >> Contributed-by: myself and christian.tornqvist at oracle.com >> >> Christian moved the test, I did some cleanup and added AlwaysRestoreFPU >> to it.
> > You modified the original C code such that this comment is no longer > valid: > > 33 // Only valid for specific x86 based systems: linux-x86, or else > x86 with fpu_control.h Thanks, David. I took out the rest of the comments but missed this one. > > Not sure why you did that - we were running this test on more than > just Windows, despite the naming. Actually, the #ifdefs were intended to only run on 32 bit linux as well as windows, but I don't think the ifdefs were correct, so the test always printed that it was skipped on platforms other than windows. The test is only useful on windows. Modifying the fpcw didn't have an effect on linux x86. Seems silly to run the test on platforms where it's not testing anything. > > I'm also unclear exactly what changed such that we need to add > AlwaysRestoreFPU ?? https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed the behavior in 2015. Setting floating point control word is something that we don't promise to work with generated code or the jvm code, which is why this -XX:+AlwaysRestoreFPU flag was added. > > Can you add @Summary info to the Java test (and/or other commentary) > as it is far from clear exactly what this is testing. > Yes, I can do that. How about: * @summary Test that modifying the floating point control word doesn't cause unhandled Windows floating point exceptions Can summary lines be > 1 line or do they have to be one line? Thanks, Coleen > Thanks, > David > >> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >> >> Tested with JPRT.
>> >> thanks, >> Coleen From david.holmes at oracle.com Wed Aug 3 01:32:37 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 11:32:37 +1000 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> Message-ID: <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> Hi Coleen, On 3/08/2016 11:16 AM, Coleen Phillimore wrote: > > > On 8/2/16 8:35 PM, David Holmes wrote: >> Hi Coleen, >> >> On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >>> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >>> Contributed-by: myself and christian.tornqvist at oracle.com >>> >>> Christian moved the test, I did some cleanup and added AlwaysRestoreFPU >>> to it. >> >> You modified the original C code such that this comment is no longer >> valid: >> >> 33 // Only valid for specific x86 based systems: linux-x86, or else >> x86 with fpu_control.h > > Thanks, David. I took out the rest of the comments but missed this one. >> >> Not sure why you did that - we were running this test on more than >> just Windows, despite the naming. > > Actually, the #ifdefs were intended to only run on 32 bit linux as well > as windows, but I don't think the ifdefs were correct, so the test > always printed that it was skipped on platforms other than windows. The > test is only useful on windows. Modifying the fpcw didn't have an > effect on linux x86. Seems silly to run the test on platforms where > it's not testing anything. It's been a while but I've been involved with this test in the past and I thought it was doing similar testing on other x86 platforms. As you say the main point is to verify that messing with the FPU control word doesn't break anything so I'm not sure that is a reason not to have such a test run on linux. >> >> I'm also unclear exactly what changed such that we need to add >> AlwaysRestoreFPU ?? 
> > https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed the > behavior in 2015. I certainly can't make any connection between that bug and handling of the FPU control word. :( > Setting floating point control word is something that we don't promise > to work with generated code or the jvm code, which is why this > -XX:+AlwaysRestoreFPU flag was added. Understood, just puzzled about why this test suddenly requires it. Given the FPU mode used by the VM has not changed, why does the test now fail without this?? >> >> Can you add @Summary info to the Java test (and/or other commentary) >> as it is far from clear exactly what this is testing. >> > > Yes, I can do that. How about: > > * @summary Test that modifying the floating point control word doesn't > cause unhandled Windows floating point exceptions That's fine -thanks. Can you also document with a comment exactly what our modification of the FPU control word is intended to do and how that relates to the normal FPU mode used by the VM. This is something I always have to go looking for :) > > Can summary lines be > 1 line or do they have to be one line? A tag runs until the next tag is encountered so the summary can span multiple lines. Thanks, David > Thanks, > Coleen > > >> Thanks, >> David >> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >>> >>> Tested with JPRT. 
>>> >>> thanks, >>> Coleen > From david.holmes at oracle.com Wed Aug 3 01:49:44 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 11:49:44 +1000 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> Message-ID: <09a47dd8-c396-78fc-301c-23882322436e@oracle.com> Hi George, On 3/08/2016 4:49 AM, George Triantafillou wrote: > Please review this small change to fix test failures in two JDI tests: There's only one test changed in the webrev ?? > JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 > Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ > > > Thanks to Tim Bell for his assistance. The fix replaces the use of > javax.rmi.CORBA.Util with String.format. See this comment > > for details. So the assumption is that String.format will load a bunch of classes not already loaded? I assume the problem with javax.rmi.CORBA was module accessibility? Or does the Util class no longer exist? 31 * @modules java.corba Presumably this can be removed? However this: 56 java.awt.Toolkit tk = java.awt.Toolkit.getDefaultToolkit(); suggests we need @modules java.desktop (which limits this test to running on a full JDK). Thanks, David > Tested locally on Linux with latest version of jtreg.
> > -George > From coleen.phillimore at oracle.com Wed Aug 3 02:14:15 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 2 Aug 2016 22:14:15 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> Message-ID: <0808479f-79f8-04a8-a614-08afa33f23a8@oracle.com> On 8/2/16 9:32 PM, David Holmes wrote: > Hi Coleen, > > On 3/08/2016 11:16 AM, Coleen Phillimore wrote: >> >> >> On 8/2/16 8:35 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >>>> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >>>> Contributed-by: myself and christian.tornqvist at oracle.com >>>> >>>> Christian moved the test, I did some cleanup and added >>>> AlwaysRestoreFPU >>>> to it. >>> >>> You modified the original C code such that this comment is no longer >>> valid: >>> >>> 33 // Only valid for specific x86 based systems: linux-x86, or else >>> x86 with fpu_control.h >> >> Thanks, David. I took out the rest of the comments but missed this one. >>> >>> Not sure why you did that - we were running this test on more than >>> just Windows, despite the naming. >> >> Actually, the #ifdefs were intended to only run on 32 bit linux as well >> as windows, but I don't think the ifdefs were correct, so the test >> always printed that it was skipped on platforms other than windows. The >> test is only useful on windows. Modifying the fpcw didn't have an >> effect on linux x86. Seems silly to run the test on platforms where >> it's not testing anything. > > It's been a while but I've been involved with this test in the past > and I thought it was doing similar testing on other x86 platforms. 
As > you say the main point is to verify that messing with the FPU control > word doesn't break anything so I'm not sure that is a reason not to > have such a test run on linux. The original point of the test is unknown to me. The test was doing something that can cause the vm to crash or get the wrong behavior. The new point of the test is to test the AlwaysRestoreFPU flag. Since we can provoke a crash on windows 32 bit only, this seems like a good restriction for the test. > >>> >>> I'm also unclear exactly what changed such that we need to add >>> AlwaysRestoreFPU ?? >> >> https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed the >> behavior in 2015. > > I certainly can't make any connection between that bug and handling of > the FPU control word. :( > Me neither but the failure I got was from the divide by zero getting a windows internal error, not the C2 generated code. Rereading the bug, Vladimir thought it could have come from the VC++ compiler change and not the fix. >> Setting floating point control word is something that we don't promise >> to work with generated code or the jvm code, which is why this >> -XX:+AlwaysRestoreFPU flag was added. > > Understood, just puzzled about why this test suddenly requires it. > Given the FPU mode used by the VM has not changed, why does the test > now fail without this?? > I don't know why it worked before. There could have been a library change also that gets the internal error. It wasn't an unhandled floating point exception. >>> >>> Can you add @Summary info to the Java test (and/or other commentary) >>> as it is far from clear exactly what this is testing. >>> >> >> Yes, I can do that. How about: >> >> * @summary Test that modifying the floating point control word doesn't >> cause unhandled Windows floating point exceptions > > That's fine -thanks. 
Can you also document with a comment exactly what > our modification of the FPU control word is intended to do and how > that relates to the normal FPU mode used by the VM. This is something > I always have to go looking for :) Okay, how about: * @summary Test that modifying the floating point control word doesn't cause an internal * error when turning off floating point exceptions for divide by zero. See new webrev for comments about the default. open webrev at http://cr.openjdk.java.net/~coleenp/8079562.02/webrev Again, adding the flag is the solution that we give customers. We also test for changing the fpcw with -Xcheck:jni. Thanks, Coleen > >> >> Can summary lines be > 1 line or do they have to be one line? > > A tag runs until the next tag is encountered so the summary can span > multiple lines. > > Thanks, > David > >> Thanks, >> Coleen >> >> >>> Thanks, >>> David >>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >>>> >>>> Tested with JPRT. >>>> >>>> thanks, >>>> Coleen >> From david.holmes at oracle.com Wed Aug 3 02:34:07 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Aug 2016 12:34:07 +1000 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <0808479f-79f8-04a8-a614-08afa33f23a8@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> <0808479f-79f8-04a8-a614-08afa33f23a8@oracle.com> Message-ID: <0f3fc479-1aed-12e9-5029-6ee7d2875419@oracle.com> Hi Coleen, So does this: setCW( (FPU_DEFAULT | FPU_IEM) & ~EM_ZERODIVIDE ); enable or disable hardware exceptions for FP-div-by-zero? To be honest I'm not sure this test serves any real point in its current form any more. 
A general cross-platform test for the AlwaysRestoreFPU flag would be: static native int getFPUCW(); static native void setFPUCW(int cw); test() { int default_cw = getFPUCW(); int new_cw = ~default_cw; setFPUCW(new_cw); assert(getFPUCW() == default_cw, "AlwaysRestoreFPU didn't restore it!"); } but that kind of misses the point about triggering unhandled C++ exceptions. And isn't getting to the bottom of the internal error that was seen. Sorry. Feel free to ignore me. David On 3/08/2016 12:14 PM, Coleen Phillimore wrote: > > > On 8/2/16 9:32 PM, David Holmes wrote: >> Hi Coleen, >> >> On 3/08/2016 11:16 AM, Coleen Phillimore wrote: >>> >>> >>> On 8/2/16 8:35 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >>>>> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >>>>> Contributed-by: myself and christian.tornqvist at oracle.com >>>>> >>>>> Christian moved the test, I did some cleanup and added >>>>> AlwaysRestoreFPU >>>>> to it. >>>> >>>> You modified the original C code such that this comment is no longer >>>> valid: >>>> >>>> 33 // Only valid for specific x86 based systems: linux-x86, or else >>>> x86 with fpu_control.h >>> >>> Thanks, David. I took out the rest of the comments but missed this one. >>>> >>>> Not sure why you did that - we were running this test on more than >>>> just Windows, despite the naming. >>> >>> Actually, the #ifdefs were intended to only run on 32 bit linux as well >>> as windows, but I don't think the ifdefs were correct, so the test >>> always printed that it was skipped on platforms other than windows. The >>> test is only useful on windows. Modifying the fpcw didn't have an >>> effect on linux x86. Seems silly to run the test on platforms where >>> it's not testing anything. >> >> It's been a while but I've been involved with this test in the past >> and I thought it was doing similar testing on other x86 platforms.
As >> you say the main point is to verify that messing with the FPU control >> word doesn't break anything so I'm not sure that is a reason not to >> have such a test run on linux. > > The original point of the test is unknown to me. The test was doing > something that can cause the vm to crash or get the wrong behavior. The > new point of the test is to test the AlwaysRestoreFPU flag. Since we can > provoke a crash on windows 32 bit only, this seems like a good > restriction for the test. >> >>>> >>>> I'm also unclear exactly what changed such that we need to add >>>> AlwaysRestoreFPU ?? >>> >>> https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed the >>> behavior in 2015. >> >> I certainly can't make any connection between that bug and handling of >> the FPU control word. :( >> > > Me neither but the failure I got was from the divide by zero getting a > windows internal error, not the C2 generated code. Rereading the bug, > Vladimir thought it could have come from the VC++ compiler change and > not the fix. > >>> Setting floating point control word is something that we don't promise >>> to work with generated code or the jvm code, which is why this >>> -XX:+AlwaysRestoreFPU flag was added. >> >> Understood, just puzzled about why this test suddenly requires it. >> Given the FPU mode used by the VM has not changed, why does the test >> now fail without this?? >> > > I don't know why it worked before. There could have been a library > change also that gets the internal error. It wasn't an unhandled > floating point exception. > >>>> >>>> Can you add @Summary info to the Java test (and/or other commentary) >>>> as it is far from clear exactly what this is testing. >>>> >>> >>> Yes, I can do that. How about: >>> >>> * @summary Test that modifying the floating point control word doesn't >>> cause unhandled Windows floating point exceptions >> >> That's fine -thanks. 
Can you also document with a comment exactly what >> our modification of the FPU control word is intended to do and how >> that relates to the normal FPU mode used by the VM. This is something >> I always have to go looking for :) > > Okay, how about: > > * @summary Test that modifying the floating point control word doesn't > cause an internal > * error when turning off floating point exceptions for divide > by zero. > > See new webrev for comments about the default. > > open webrev at http://cr.openjdk.java.net/~coleenp/8079562.02/webrev > > > Again, adding the flag is the solution that we give customers. We also > test for changing the fpcw with -Xcheck:jni. > > Thanks, > Coleen > >> >>> >>> Can summary lines be > 1 line or do they have to be one line? >> >> A tag runs until the next tag is encountered so the summary can span >> multiple lines. >> >> Thanks, >> David >> >>> Thanks, >>> Coleen >>> >>> >>>> Thanks, >>>> David >>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >>>>> >>>>> Tested with JPRT. >>>>> >>>>> thanks, >>>>> Coleen >>> > From coleen.phillimore at oracle.com Wed Aug 3 02:48:48 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 2 Aug 2016 22:48:48 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <0f3fc479-1aed-12e9-5029-6ee7d2875419@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> <0808479f-79f8-04a8-a614-08afa33f23a8@oracle.com> <0f3fc479-1aed-12e9-5029-6ee7d2875419@oracle.com> Message-ID: <029b035d-74d8-ac44-fa4b-08952f4ebf24@oracle.com> Hi, David, That would be a better test, I guess. My goal here is to fix this test from failing on windows 32 and salvage something out of the existing test. We could simply delete the test and file an RFE to test fpcw better. 
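For reference, the save/modify/restore check that David proposed can be sketched in portable C++ roughly as follows. This is only an illustration under stated assumptions: it uses the standard <cfenv> rounding mode as a stand-in for floating point control state, it does not touch the x87 exception mask bits or the JVM, and it says nothing about how -XX:+AlwaysRestoreFPU is implemented. The helper names perturb_fp_state and fp_state_restored_after are hypothetical and are not part of the test in the webrev.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical stand-in for native code that changes floating point
// control state and does not put it back (here: the rounding mode).
void perturb_fp_state() {
    std::fesetround(FE_TOWARDZERO);
}

// Hypothetical helper: snapshot the FP environment, run fn, restore the
// snapshot, and report whether fn's change was observed and then undone.
bool fp_state_restored_after(void (*fn)()) {
    assert(fn != nullptr);
    std::fenv_t saved;
    if (std::fegetenv(&saved) != 0) {
        return false;                       // could not snapshot the environment
    }
    const int round_before = std::fegetround();
    fn();                                   // callee is free to change FP state
    const bool changed = std::fegetround() != round_before;
    if (std::fesetenv(&saved) != 0) {
        return false;                       // could not restore the snapshot
    }
    // Success means the change was visible and the restore undid it.
    return changed && std::fegetround() == round_before;
}
```

A caller would invoke fp_state_restored_after(perturb_fp_state) and expect true. As David notes in the message quoted below, a check of this shape would not exercise the unhandled exception path that the original Windows test provoked.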
On 8/2/16 10:34 PM, David Holmes wrote: > Hi Coleen, > > So does this: > > setCW( (FPU_DEFAULT | FPU_IEM) & ~EM_ZERODIVIDE ); > > enable or disable hardware exceptions for FP-div-by-zero? I think it disables hardware exceptions. I guess Christian's going to have to answer this tomorrow. > > To be honest I'm not sure this test serves any real point in its > current form any more. A general cross-platform test for the > AlwaysRestoreFPU flag would be: > > static native int getFPUCW(); > static native void setFPUCW(int cw); > > test() { > int default_cw = getFPUCW(); > int new_cw = ~default_cw; > setFPUCW(new_cw); > assert(getFPUCW() == default_cw, "AlwaysRestoreFPU didn't restore it!"); > } > > but that kind of misses the point about triggering unhandled C++ > exceptions. And isn't getting to the bottom of the internal error that > was seen. > No, if that's required, I'll have to unassign myself from the bug, quarantine the test, and save it for jdk10, because it might take a long time to figure out, if we ever figure it out. I don't think there's a product issue here at least from my reading of the situation. > Sorry. Feel free to ignore me. > We never ignore you, David. Coleen > David > > > On 3/08/2016 12:14 PM, Coleen Phillimore wrote: >> >> >> On 8/2/16 9:32 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 3/08/2016 11:16 AM, Coleen Phillimore wrote: >>>> >>>> >>>> On 8/2/16 8:35 PM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >>>>>> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >>>>>> Contributed-by: myself and christian.tornqvist at oracle.com >>>>>> >>>>>> Christian moved the test, I did some cleanup and added >>>>>> AlwaysRestoreFPU >>>>>> to it. >>>>> >>>>> You modified the original C code such that this comment is no longer >>>>> valid: >>>>> >>>>> 33 // Only valid for specific x86 based systems: linux-x86, or else >>>>> x86 with fpu_control.h >>>> >>>> Thanks, David.
I took out the rest of the comments but missed this >>>> one. >>>>> >>>>> Not sure why you did that - we were running this test on more than >>>>> just Windows, despite the naming. >>>> >>>> Actually, the #ifdefs were intended to only run on 32 bit linux as >>>> well >>>> as windows, but I don't think the ifdefs were correct, so the test >>>> always printed that it was skipped on platforms other than >>>> windows. The >>>> test is only useful on windows. Modifying the fpcw didn't have an >>>> effect on linux x86. Seems silly to run the test on platforms where >>>> it's not testing anything. >>> >>> It's been a while but I've been involved with this test in the past >>> and I thought it was doing similar testing on other x86 platforms. As >>> you say the main point is to verify that messing with the FPU control >>> word doesn't break anything so I'm not sure that is a reason not to >>> have such a test run on linux. >> >> The original point of the test is unknown to me. The test was doing >> something that can cause the vm to crash or get the wrong behavior. The >> new point of the test is to test the AlwaysRestoreFPU flag. Since we can >> provoke a crash on windows 32 bit only, this seems like a good >> restriction for the test. >>> >>>>> >>>>> I'm also unclear exactly what changed such that we need to add >>>>> AlwaysRestoreFPU ?? >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed the >>>> behavior in 2015. >>> >>> I certainly can't make any connection between that bug and handling of >>> the FPU control word. :( >>> >> >> Me neither but the failure I got was from the divide by zero getting a >> windows internal error, not the C2 generated code. Rereading the bug, >> Vladimir thought it could have come from the VC++ compiler change and >> not the fix. >> >>>> Setting floating point control word is something that we don't promise >>>> to work with generated code or the jvm code, which is why this >>>> -XX:+AlwaysRestoreFPU flag was added. 
>>> >>> Understood, just puzzled about why this test suddenly requires it. >>> Given the FPU mode used by the VM has not changed, why does the test >>> now fail without this?? >>> >> >> I don't know why it worked before. There could have been a library >> change also that gets the internal error. It wasn't an unhandled >> floating point exception. >> >>>>> >>>>> Can you add @Summary info to the Java test (and/or other commentary) >>>>> as it is far from clear exactly what this is testing. >>>>> >>>> >>>> Yes, I can do that. How about: >>>> >>>> * @summary Test that modifying the floating point control word >>>> doesn't >>>> cause unhandled Windows floating point exceptions >>> >>> That's fine -thanks. Can you also document with a comment exactly what >>> our modification of the FPU control word is intended to do and how >>> that relates to the normal FPU mode used by the VM. This is something >>> I always have to go looking for :) >> >> Okay, how about: >> >> * @summary Test that modifying the floating point control word doesn't >> cause an internal >> * error when turning off floating point exceptions for divide >> by zero. >> >> See new webrev for comments about the default. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.02/webrev >> >> >> Again, adding the flag is the solution that we give customers. We also >> test for changing the fpcw with -Xcheck:jni. >> >> Thanks, >> Coleen >> >>> >>>> >>>> Can summary lines be > 1 line or do they have to be one line? >>> >>> A tag runs until the next tag is encountered so the summary can span >>> multiple lines. >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> Coleen >>>> >>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >>>>>> >>>>>> Tested with JPRT. 
>>>>>> >>>>>> thanks, >>>>>> Coleen >>>> >> From harold.seigel at oracle.com Wed Aug 3 12:15:58 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 3 Aug 2016 08:15:58 -0400 Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class Message-ID: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> Hi, Please review this fix for bug 8058575. The fix prevents a class created using Unsafe.defineAnonymousClass() from being in a different package than its host class. Being in different packages would create access problems if the packages were in different modules. With this fix, if the anonymous class is in a different package then the JVM will throw IllegalArgumentException. If the anonymous class is in the unnamed package then the JVM will move the anonymous class into its host class's package. JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575 Open webrevs: http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/ http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/ The fix was tested with the JCK Lang and VM tests, the hotspot, java/lang, java/util and other JTreg tests, the NSK quick tests, and with the RBT runtime nightly tests. Thanks, Harold From george.triantafillou at oracle.com Wed Aug 3 13:24:51 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 3 Aug 2016 09:24:51 -0400 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: <4bda01d1ecfc$d2662bc0$77328340$@oracle.com> References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> <4bda01d1ecfc$d2662bc0$77328340$@oracle.com> Message-ID: Hi Christian, Thanks for the review. Yes, java.util.Formatter is loaded after ClassesByName2Targ, which satisfies the test requirements. -George On 8/2/2016 4:31 PM, Christian Tornqvist wrote: > Hi George, > > Looks good, after your explanation made to me offline that it's not the java.lang.String class that we expect to be loaded @75.
> > Thanks, > Christian > > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of George Triantafillou > Sent: Tuesday, August 2, 2016 2:50 PM > To: hotspot-runtime-dev at openjdk.java.net > Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip > > Please review this small change to fix test failures in two JDI tests: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 > Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ > > > Thanks to Tim Bell for his assistance. The fix replaces the use of javax.rmi.CORBA.Util with String.format. See this comment > for details. > > Tested locally on Linux with latest version of jtreg. > > -George > > From george.triantafillou at oracle.com Wed Aug 3 13:31:44 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 3 Aug 2016 09:31:44 -0400 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> Message-ID: <9cb46939-4ce9-dac1-0068-3e1b34c78c07@oracle.com> Thanks Dan! -George On 8/2/2016 3:50 PM, Daniel D. Daugherty wrote: > Adding serviceability-dev at ... alias since these are JDI tests... > > Dan > > > On 8/2/16 12:49 PM, George Triantafillou wrote: >> Please review this small change to fix test failures in two JDI tests: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 >> Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ >> >> >> Thanks to Tim Bell for his assistance. The fix replaces the use of >> javax.rmi.CORBA.Util with String.format. See this comment >> >> for details. >> >> Tested locally on Linux with latest version of jtreg. 
>> >> -George >> > From frederic.parain at oracle.com Wed Aug 3 14:07:16 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Wed, 3 Aug 2016 10:07:16 -0400 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> Message-ID: <4136b535-7618-6a42-7656-f8fcebafdb60@oracle.com> David, Interesting twist about JavaThreads returning to their plain Thread nature before dying. Fix looks good to me. Fred On 08/02/2016 09:13 PM, David Holmes wrote: > webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8159461 > > The suspend/resume signal (SR_signum) is never sent to a thread once it > has started to terminate. On one platform (SuSE 12) we have seen what > appears to be a "stuck" signal, which is only delivered when the > terminating thread restores its original signal mask (as if > pthread_sigmask makes the system realize there is a pending signal - we > already check the signal was not blocked). At this point in the thread > termination we have freed the osthread, so the SR_handler would > access deallocated memory. In debug builds we first hit an assertion > that the current thread is a JavaThread or the VMThread - that assertion > fails, even though it is a JavaThread, because we have already executed > the ~JavaThread destructor and inside the ~Thread destructor we are a > plain Thread not a JavaThread. > > The fix was to make a small adjustment to the thread termination process > so that we delete the SR_lock before calling os::free_thread(). In the > SR_handler() we can then use a NULL check of SR_lock() to indicate the > thread has terminated and we return.
> > While only seen on Linux I took the opportunity to apply the fix on all > platforms and also cleaned up the code where we were using > Thread::current() unsafely in a signal-handling context. > > Testing: regular tier 1 (JPRT) > Kitchensink (in progress) > > As we can't readily reproduce the problem I tested this by having a > terminating thread raise SR_signum directly from within the ~Thread > destructor. > > Thanks, > David From george.triantafillou at oracle.com Wed Aug 3 14:24:51 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 3 Aug 2016 10:24:51 -0400 Subject: RFR 8160833: ClassesByName2Test.java and RedefineCrossEvent.java failing with jtreg tip In-Reply-To: <09a47dd8-c396-78fc-301c-23882322436e@oracle.com> References: <533e8d45-9e5b-9222-e8f6-a56280bd5d7e@oracle.com> <09a47dd8-c396-78fc-301c-23882322436e@oracle.com> Message-ID: Hi David, Thanks for your review. On 8/2/2016 9:49 PM, David Holmes wrote: > Hi George, > > On 3/08/2016 4:49 AM, George Triantafillou wrote: >> Please review this small change to fix test failures in two JDI tests: > > There's only one test changed in the webrev ?? RedefineCrossEvent.java simply runs a set of tests, including ClassesByName2Test.java, with @run driver. So fixing ClassesByName2Test.java also fixed the failure in RedefineCrossEvent.java. > >> JBS: https://bugs.openjdk.java.net/browse/JDK-8160833 >> Open webrev: http://cr.openjdk.java.net/~gtriantafill/8160833/webrev/ >> >> >> Thanks to Tim Bell for his assistance. The fix replaces the use of >> javax.rmi.CORBA.Util with String.format. See this comment >> >> >> for details. > > So the assumption is that String.format will load a bunch of classes > not already loaded? Yes, and the test requires that those classes be loaded after ClassesByName2Targ. > I assume the problem with javax.rmi.CORBA was module accessibility? Or > does the Util class no longer exist? Yes, the problem was module accessibility. 
I tried to unravel a simple way to add module access, but was unsuccessful. > > 31 * @modules java.corba > > Presumably this can be removed? Yes, good catch. java.corba is no longer required and I've removed it. New webrev is here: http://cr.openjdk.java.net/~gtriantafill/8160833.01/webrev/ > > However this: > > 56 java.awt.Toolkit tk = java.awt.Toolkit.getDefaultToolkit(); > > suggests we need @modules java.desktop (which limits this test to > running on a full JDK). I don't know, since this was in the initial revision in the repo. -George > > Thanks, > David > >> Tested locally on Linux with latest version of jtreg. >> >> -George >> From lois.foltan at oracle.com Wed Aug 3 16:22:16 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 03 Aug 2016 12:22:16 -0400 Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class In-Reply-To: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> References: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> Message-ID: <57A21A38.7020006@oracle.com> Hi Harold, Looks good. Have some comments: src/share/vm/classfile/classFileParser.cpp: - lots of comments refer to the "null package", can you change that to the "unnamed package"? There really isn't a concept of a null package. - line #5423 & 5439, why do you use _class_name->base instead of _class_name->as_C_string()? - line #5439 - I recall that using jbyte should be avoided, can you change to char? - line #5439 - why do you use UTF8::strrchr instead of strrchr like line #5410? - line #5451 - extra line - line #5455 - can you change the call to InstanceKlass::is_same_class_package to use the class loader associated with "this" or the anonymous class instead of assuming that host_klass and _class_name have the same class loader? One of the checks that InstanceKlass::is_same_class_package does is to make sure the class loaders are the same and kick out if they are not.
So by passing in the same host_klass class_loader you are bypassing that check. Testing: Is it possible to have nested anonymous classes? If yes, can you add a test case for this, with the host_klass in one package and each nested anonymous class using a different combination of unnamed and named packages? Thanks, Lois On 8/3/2016 8:15 AM, harold seigel wrote: > Hi, > > Please review this fix for bug 8058575. The fix prevents a class > created using Unsafe.defineAnonymousClass() from being in a different > package than its host class. Being in different packages would create > access problems if the packages were in different modules. > > With this fix, if the anonymous class is in a different package then > the JVM will throw IllegalArgumentException. If the anonymous class > is in the unnamed package then the JVM will move the anonymous class > into its host class's package. > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575 > > Open webrevs: > > http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/ > > http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/ > > The fix was tested with the JCK Lang and VM tests, the hotspot, > java/lang, java/util and other JTreg tests, the NSK quick tests, and > with the RBT runtime nightly tests. > > Thanks, Harold > From david.holmes at oracle.com Wed Aug 3 20:58:00 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Aug 2016 06:58:00 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <4136b535-7618-6a42-7656-f8fcebafdb60@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <4136b535-7618-6a42-7656-f8fcebafdb60@oracle.com> Message-ID: <15f95901-eac9-4d20-1e76-94220a8b8934@oracle.com> On 4/08/2016 12:07 AM, Frederic Parain wrote: > David, > > Interesting twist about JavaThreads returning to their plain > Thread nature before dying.
Yes that was quite baffling for a while :) > Fix looks good to me. Thanks for the review Fred! David > Fred > > On 08/02/2016 09:13 PM, David Holmes wrote: >> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >> >> The suspend/resume signal (SR_signum) is never sent to a thread once it >> has started to terminate. On one platform (SuSE 12) we have seen what >> appears to be a "stuck" signal, which is only delivered when the >> terminating thread restores its original signal mask (as if >> pthread_sigmask makes the system realize there is a pending signal - we >> already check the signal was not blocked). At this point in the thread >> termination we have freed the osthread, so the SR_handler would >> access deallocated memory. In debug builds we first hit an assertion >> that the current thread is a JavaThread or the VMThread - that assertion >> fails, even though it is a JavaThread, because we have already executed >> the ~JavaThread destructor and inside the ~Thread destructor we are a >> plain Thread not a JavaThread. >> >> The fix was to make a small adjustment to the thread termination process >> so that we delete the SR_lock before calling os::free_thread(). In the >> SR_handler() we can then use a NULL check of SR_lock() to indicate the >> thread has terminated and we return. >> >> While only seen on Linux I took the opportunity to apply the fix on all >> platforms and also cleaned up the code where we were using >> Thread::current() unsafely in a signal-handling context. >> >> Testing: regular tier 1 (JPRT) >> Kitchensink (in progress) >> >> As we can't readily reproduce the problem I tested this by having a >> terminating thread raise SR_signum directly from within the ~Thread >> destructor.
>> >> Thanks, >> David From harold.seigel at oracle.com Thu Aug 4 12:46:21 2016 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 4 Aug 2016 08:46:21 -0400 Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers In-Reply-To: References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> Message-ID: <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> Hi, Please review this update for this fix. This webrev only shows the changes since the last webrev. These changes include: 1. Fix for JDK-8162415 - the JVM now prints the following message when ignoring a property and PrintWarnings is enabled: warning: Ignoring system property options whose names start with '-Djdk.module'. They are reserved for internal use. 2. Fix for JDK-8162412 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests 4. Incorporates review comments from Alan, Coleen, Dan, and Lois 5. Fixes JTReg tests that failed due to the new option syntax. Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ Thanks, Harold On 8/2/2016 9:25 AM, harold seigel wrote: > > Hi Lois, > > Thanks for the review. Please see comments in-line. > > Harold > > > On 8/1/2016 8:40 PM, Lois Foltan wrote: >> >> On 7/17/2016 7:05 PM, harold seigel wrote: >>> Hi, >>> >>> Please review these Hotspot VM only changes to process the seven >>> module-specific options that have been renamed to have gnu-like >>> names. JDK changes for this bug will be reviewed separately. >>> >>> Descriptions of these options are here >>> . For these six options, >>> --module-path, --upgrade-module-path, --add-modules, >>> --limit-modules, --add-reads, and --add-exports, the JVM just sets a >>> system property. For the --patch-module option, the JVM sets a >>> system property and then processes the option in the same way as >>> when it was named -Xpatch. >>> >>> Additionally, the JVM now checks properties specified on the command >>> line.
If a property matches one of the properties used by one of >>> the above options then the JVM ignores the property. This forces >>> users to use the explicit option when wanting to do things like add >>> a module or a package export. >>> >>> The RFR contains two new tests. Also, many existing tests were >>> changed to use the new option names. >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>> >>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >> >> Hi Harold, >> >> Overall looks good. A couple of comments: >> >> src/share/vm/prims/jvmtiEnv.cpp >> - line #3428 - The if statement is incorrect. There are internal >> properties, like jdk.boot.class.path.append, whose value if non-null >> should be returned. > This code will be reworked in the next version of these changes > because of multiple issues. >> >> src/share/vm/runtime/arguments.cpp >> - Arguments::append_to_addmods_property was added before the VM >> starting to process --add-modules. So with this fix, it seems like >> it could be simply changed to: >> >> bool Arguments::append_to_addmods_property(const char* module_name) { >> PropertyList_unique_add(&_system_properties, >> Arguments::get_property("jdk.module.addmods"), >> module_name, >> AppendProperty, UnwriteableProperty, InternalProperty); >> } >> >> Please consider making this change since currently it contains a lot >> of duplicated code that is now unnecessary. > The one difference is that append_to_addmods_property() returns a > status but PropertyList_unique_add() does not. I'll look into this a > bit further. >> >> - line #3171, should the comment be "--add-modules=java.sql" instead >> of "--add-modules java.sql"? > yes. > > The changes suggested by you, Coleen, and Dan will be in the next > version of this webrev. > > Thanks, Harold >> >> Thanks, >> Lois >> >> >> >> >> >> >> >> >> >> >> >> >> >>> >>> The changes were tested with the JCK lang and VM tests, the JTreg >>> hotspot tests, and the RBT hotspot nightlies. 
>>> >>> Thanks, Harold >> > From gerald.thornbrugh at oracle.com Thu Aug 4 14:21:33 2016 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Thu, 04 Aug 2016 08:21:33 -0600 Subject: RFR 8162999 Build give extraneous find warnings Message-ID: <57A34F6D.9070808@oracle.com> Hi Everyone, I would like to have the following change reviewed: Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 Webrev: http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ It seems that my putback of JDK-8144278 created a merge issue where a previously removed line from JDK-8132919 was placed back into JtregNative.gmk. This fix removes the line again. The presence of this line generates "No such file or directory" error messages during builds because the "hotspot/test/compiler/native" directory no longer exists. Before this change the "No such file or directory" error messages were seen in the build logs and after the change the error messages were not placed into the log. Please let me know if you have any questions or concerns. Thanks, Jerry From coleen.phillimore at oracle.com Thu Aug 4 15:36:12 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 4 Aug 2016 11:36:12 -0400 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: <57A34F6D.9070808@oracle.com> References: <57A34F6D.9070808@oracle.com> Message-ID: Jerry, This change looks good, thank you for fixing it. I can sponsor it for you with one more reviewer. Coleen On 8/4/16 10:21 AM, Gerald Thornbrugh wrote: > Hi Everyone, > > I would like to have the following change reviewed: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 > > Webrev: > http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ > > > It seems that my putback of JDK-8144278 created a merge issue where a > previously > removed line from JDK-8132919 was placed back into JtregNative.gmk. > This fix > removes the line again.
The presence of this line generates "No such > file or directory" > error messages during builds because the > "hotspot/test/compiler/native" directory no > longer exists. > > Before this change the "No such file or directory" error messages were > seen in the > build logs and after the change the error messages were not placed > into the log. > > Please let me know if you have any questions or concerns. > > Thanks, > > Jerry From gerald.thornbrugh at oracle.com Thu Aug 4 15:37:25 2016 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Thu, 04 Aug 2016 09:37:25 -0600 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: References: <57A34F6D.9070808@oracle.com> Message-ID: <57A36135.2040404@oracle.com> Hi Coleen, Thanks for the review. Jerry > > Jerry, This change looks good, thank you for fixing it. > > I can sponsor it for you with one more reviewer. > > Coleen > > On 8/4/16 10:21 AM, Gerald Thornbrugh wrote: >> Hi Everyone, >> >> I would like to have the following change reviewed: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 >> >> Webrev: >> http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ >> >> >> It seems that my putback of JDK-8144278 created a merge issue where a >> previous >> removed line from JDK-8132919 was placed back into JtregNative.gmk. >> This fix >> removes the line again. The presence of this line generates "No such >> file or directory" >> error messages during builds because the >> "hotspot/test/compiler/native" directory no >> longer exists. >> >> Before this change the "No such file or directory" error messages >> were seen in the >> build logs and after the change the error messages were not placed >> into the log. >> >> Please let me know if you have any questions or concerns.
>> >> Thanks, >> >> Jerry > From frederic.parain at oracle.com Thu Aug 4 15:40:58 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Thu, 4 Aug 2016 11:40:58 -0400 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: References: <57A34F6D.9070808@oracle.com> Message-ID: <20a02e26-9926-8fa3-f761-3e6fe64d553d@oracle.com> Reviewed. Fred On 08/04/2016 11:36 AM, Coleen Phillimore wrote: > > Jerry, This change looks good, thank you for fixing it. > > I can sponsor it for you with one more reviewer. > > Coleen > > On 8/4/16 10:21 AM, Gerald Thornbrugh wrote: >> Hi Everyone, >> >> I would like to have the following change reviewed: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 >> >> Webrev: >> http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ >> >> >> It seems that my putback of JDK-8144278 created a merge issue where a >> previous >> removed line from JDK-8132919 was placed back into JtregNative.gmk. >> This fix >> removes the line again. The presents of this line generates "No such >> file or directory" >> errors messages during builds because the >> "hotspot/test/compiler/native" directory no >> longer exists. >> >> Before this change the "No such file or directory" error messages were >> seen in the >> build logs and after the change the error messages where not placed >> into the log. >> >> Please let me know if you have any questions or concerns. >> >> Thanks, >> >> Jerry > From gerald.thornbrugh at oracle.com Thu Aug 4 15:46:25 2016 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Thu, 04 Aug 2016 09:46:25 -0600 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: <20a02e26-9926-8fa3-f761-3e6fe64d553d@oracle.com> References: <57A34F6D.9070808@oracle.com> <20a02e26-9926-8fa3-f761-3e6fe64d553d@oracle.com> Message-ID: <57A36351.909@oracle.com> Hi Fred, Thanks. Jerry > Reviewed. 
> > Fred > > On 08/04/2016 11:36 AM, Coleen Phillimore wrote: >> >> Jerry, This change looks good, thank you for fixing it. >> >> I can sponsor it for you with one more reviewer. >> >> Coleen >> >> On 8/4/16 10:21 AM, Gerald Thornbrugh wrote: >>> Hi Everyone, >>> >>> I would like to have the following change reviewed: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ >>> >>> >>> It seems that my putback of JDK-8144278 created a merge issue where a >>> previous >>> removed line from JDK-8132919 was placed back into JtregNative.gmk. >>> This fix >>> removes the line again. The presents of this line generates "No such >>> file or directory" >>> errors messages during builds because the >>> "hotspot/test/compiler/native" directory no >>> longer exists. >>> >>> Before this change the "No such file or directory" error messages were >>> seen in the >>> build logs and after the change the error messages where not placed >>> into the log. >>> >>> Please let me know if you have any questions or concerns. >>> >>> Thanks, >>> >>> Jerry >> From volker.simonis at gmail.com Thu Aug 4 15:48:13 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 4 Aug 2016 17:48:13 +0200 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> Message-ID: Hi David, thanks for doing this change on all platforms. The fix looks good. Maybe you can just extend the following comment with something like: // Note that the SR_lock plays no role in this suspend/resume protocol. // It is only used in SR_handler as a thread termination indicator if NULL. 
Regards, Volker On Wed, Aug 3, 2016 at 3:13 AM, David Holmes wrote: > webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8159461 > > The suspend/resume signal (SR_signum) is never sent to a thread once it > has started to terminate. On one platform (SuSE 12) we have seen what > appears to be a "stuck" signal, which is only delivered when the > terminating thread restores its original signal mask (as if pthread_sigmask > makes the system realize there is a pending signal - we already check the > signal was not blocked). At this point in the thread termination we have > freed the osthread, so the SR_handler would access deallocated memory. > In debug builds we first hit an assertion that the current thread is a > JavaThread or the VMThread - that assertion fails, even though it is a > JavaThread, because we have already executed the ~JavaThread destructor and > inside the ~Thread destructor we are a plain Thread not a JavaThread. > > The fix was to make a small adjustment to the thread termination process > so that we delete the SR_lock before calling os::free_thread(). In the > SR_handler() we can then use a NULL check of SR_lock() to indicate the > thread has terminated and we return. > > While only seen on Linux I took the opportunity to apply the fix on all > platforms and also cleaned up the code where we were using > Thread::current() unsafely in a signal-handling context. > > Testing: regular tier 1 (JPRT) > Kitchensink (in progress) > > As we can't readily reproduce the problem I tested this by having a > terminating thread raise SR_signum directly from within the ~Thread > destructor.
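The termination ordering and NULL check described above can be sketched roughly as follows. This is a toy model, not the HotSpot code: the Thread, Monitor and OSThread types here are hypothetical stand-ins, terminate() and sr_handler() are illustrative names, and nothing in it deals with real signal delivery or async-signal safety.

```cpp
#include <cassert>

// Hypothetical stand-ins for the HotSpot types; not the real declarations.
struct Monitor {};
struct OSThread { int sr_requests = 0; };

class Thread {
 public:
  Thread() : _sr_lock(new Monitor()), _osthread(new OSThread()) {}
  ~Thread() { terminate(); }

  Monitor*  SR_lock()  const { return _sr_lock; }
  OSThread* osthread() const { return _osthread; }

  // Ordering from the fix described in the thread: drop SR_lock first,
  // then free the osthread. Safe to call more than once.
  void terminate() {
    delete _sr_lock;
    _sr_lock = nullptr;
    delete _osthread;   // stands in for os::free_thread()
    _osthread = nullptr;
  }

 private:
  Monitor*  _sr_lock;
  OSThread* _osthread;
};

// Returns true if the suspend/resume request was processed, false if the
// thread had already started terminating (SR_lock is NULL).
static bool sr_handler(Thread* t) {
  assert(t != nullptr);
  if (t->SR_lock() == nullptr) {
    return false;  // late ("stuck") signal: the osthread is gone, do not touch it
  }
  t->osthread()->sr_requests++;
  return true;
}
```

The point of the ordering is that a handler running after termination sees the NULL SR_lock before it ever dereferences the freed osthread.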
> > Thanks, > David > From harold.seigel at oracle.com Thu Aug 4 15:53:01 2016 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 4 Aug 2016 11:53:01 -0400 Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class In-Reply-To: <57A21A38.7020006@oracle.com> References: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> <57A21A38.7020006@oracle.com> Message-ID: <41c4e009-09a3-d553-0b3e-f247a879867c@oracle.com> Hi Lois, Thanks for the review. Please see comments inline. On 8/3/2016 12:22 PM, Lois Foltan wrote: > Hi Harold, > > Looks good. Have some comments: > > src/share/vm/classfile/classFileParser.cpp: > - lots of comments refer to the "null package", can you change that to > the "unnamed package". There really isn't a concept of a null package. done > - line #5423 & 5439, why do you use _class_name->base instead of > _class_name->as_C_string()? The name of the anonymous class can contain non-ASCII characters so as_C_string() may not work properly. > - line #5439 - I recall that using jbyte should be avoided, can you > change to char? I used jbyte* because that is the type returned by UTF8::strrchr. I can cast the result to char*, if you prefer. > - line #5439 - why do you use UTF8::strrchr instead of strrchr like > line #5410? I used UTF8::strrchr because it takes a length argument. > - line #5451 - extra line > - line #5455 - can you change the call to > InstanceKlass::is_same_class_package to use the associated with "this" > or the anonymous class instead of assuming that host_klass and > _class_name have the same class loader? One of the checks that > InstanceKlass::is_same_class_package does is to make sure the class > loaders are the same and kick out if they are not. So by passing in > the same host_klass class_loader you are bypassing that check. The anonymous class is always loaded by the same class loader as its host. 
Method Unsafe_DefineAnonymousClass_impl() passes the host class's loader to SystemDictionary::parse_stream() which uses the host loader to load the anonymous class. So the extra class loader check is not needed. Also, the anonymous class is not yet loaded at this point, so what class loader would be passed to that is_same_class_package() method? > > Testing: > Is it possible to have nested anonymous classes? If yes, can you add > a test case for this where the host_klass is in one package and > then at each nested anonymous class there are differing combinations > of unnamed package, named package. Yes, it is possible to have nested anonymous classes. An anonymous class that does string concatenation would be an example. But I'm not sure such a test would have much value for this issue. > > Thanks, > Lois > Thanks, Harold > On 8/3/2016 8:15 AM, harold seigel wrote: >> Hi, >> >> Please review this fix for bug 8058575. The fix prevents a class >> created using Unsafe.defineAnonymousClass() from being in a different >> package than its host class. Being in different packages would >> create access problems if the packages were in different modules. >> >> With this fix, if the anonymous class is in a different package then >> the JVM will throw IllegalArgumentException. If the anonymous class >> is in the unnamed package then the JVM will move the anonymous class >> into its host class's package. >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575 >> >> Open webrevs: >> >> http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/ >> >> http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/ >> >> The fix was tested with the JCK Lang and VM tests, the hotspot, and >> java/lang, java/util and other JTreg tests, the NSK quick tests, and >> with the RBT runtime nightly tests.
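The package rule stated in the RFR above can be illustrated with a small sketch. It is not the classFileParser.cpp code: it works on std::string rather than Symbol, the helper names package_of and fix_anonymous_class_name are hypothetical, and std::invalid_argument merely stands in for the IllegalArgumentException the JVM would throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Package part of an internal-form class name ("p/q/Host" -> "p/q").
// A name without '/' is in the unnamed package, represented here as "".
static std::string package_of(const std::string& class_name) {
  const std::string::size_type slash = class_name.rfind('/');
  return slash == std::string::npos ? std::string()
                                    : class_name.substr(0, slash);
}

// Hypothetical helper: returns the name the anonymous class ends up with.
static std::string fix_anonymous_class_name(const std::string& anon_name,
                                            const std::string& host_name) {
  assert(!anon_name.empty() && !host_name.empty());
  const std::string anon_pkg = package_of(anon_name);
  const std::string host_pkg = package_of(host_name);
  if (anon_pkg.empty()) {
    // Unnamed package: move the anonymous class into the host's package.
    return host_pkg.empty() ? anon_name : host_pkg + "/" + anon_name;
  }
  if (anon_pkg != host_pkg) {
    // Stand-in for throwing IllegalArgumentException in the JVM.
    throw std::invalid_argument(
        "host class and anonymous class are in different packages");
  }
  return anon_name;  // already in the host's package
}
```

In this model an anonymous class named "Anon" with host "p/q/Host" becomes "p/q/Anon", while "x/Anon" with the same host is rejected.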
>> >> Thanks, Harold >> > From chris.plummer at oracle.com Thu Aug 4 21:53:27 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 4 Aug 2016 14:53:27 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: Ping! On 8/2/16 1:31 PM, Chris Plummer wrote: > Hello, > > Please review the following: > > webrev: > http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ > > Bugs fixed: > > JDK-8133749: os::current_frame() is not returning the proper frame on > ARM and solaris-x64 > https://bugs.openjdk.java.net/browse/JDK-8133749 > > JDK-8133747: NMT includes an extra stack frame due to assumption NMT > is making on tail calls being used > https://bugs.openjdk.java.net/browse/JDK-8133747 > > JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds > includes NativeCallStack::NativeCallStack() frame in backtrace > https://bugs.openjdk.java.net/browse/JDK-8133740 > > The above bugs all result in the NMT detail stack traces including > extra frames in the stack traces. Certain frames are supposed to be > skipped, but sometimes are not. The frames that show up are: > > NativeCallStack::NativeCallStack > os::get_native_stack > > These are both methods used to generate the stack trace, and therefore > should not be included in it. However, under some (most) circumstances, > they were. > > Also, there was no test to make sure that any NMT detail output is > generated, or that it is correct. I've added one with this webrev. Of > the 27 possible builds (9 platforms * 3 build flavors), only 9 of the > 27 initially passed this new test. They were the product and fastdebug > builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug > builds for solaris-x64, windows-x86, and windows-x64. All the rest > failed. They now all pass with my fixes in place.
> > Here's a summary of the changes: > > src/os/posix/vm/os_posix.cpp > src/os/windows/vm/os_windows.cpp > > JDK-8133747 fixes: There was some frame skipping logic here which was > sort of correct, but was misplaced. There are no extra frames being > added in os::get_native_stack() due to lack of inlining or lack of a > tail call, so no need for toSkip++ here. The logic has been moved to > NativeCallStack::NativeCallStack, which is where the tail call is > (sometimes) made, and also corrected (see nativeCallStack.cpp below). > > src/share/vm/utilities/nativeCallStack.cpp > > JDK-8133747 fixes: The frame skipping logic that was moved here > assumed that NativeCallStack::NativeCallStack would not appear in the > call stack (due to a tail call being used to call os::get_native_stack) > except in slow debug builds. However, some platforms also don't use a > tail call even when optimized. From what I can tell that is the case > for 32-bit platforms and for windows. > > src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp > src/os_cpu/windows_x86/vm/os_windows_x86.cpp > src/os_cpu/linux_x86/vm/os_linux_x86.cpp > > JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to > skip one extra frame > > src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp > > JDK-8133749 fixes: os::current_frame() was not consistent with other > platforms and needs to skip one more frame. This means it returns the > frame for the caller's caller. So when called by > os::get_native_stack(), it returns the frame for whoever called > os::get_native_stack(). Although not intuitive, this is what > os::get_native_stack() expects. Probably a method rename and/or a > behavior change is justified here, but I would prefer to do that with > a followup CR if anyone has a good suggestion on what to do. > > test/runtime/NMT/CheckForProperDetailStackTrace.java > > This is the new NMT detail test. It checks for frames that shouldn't > be present and validates at least one stack trace is what is expected.
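As a rough illustration of the JDK-8133747 part of the summary above, the sketch below models a call stack as a list of frame names and shows why one extra frame must be skipped when NativeCallStack::NativeCallStack does not tail-call os::get_native_stack. It is a toy model under stated assumptions: native_stack and the ctor_tail_calls flag are hypothetical, and the real code walks machine frames rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: 'frames' lists frame names innermost first, starting at whoever
// called os::get_native_stack. With a tail call, the NativeCallStack
// constructor's frame has been replaced and is absent from the list; without
// one, it is the first entry and has to be hidden from the NMT output.
static std::vector<std::string> native_stack(
    const std::vector<std::string>& frames, int to_skip, bool ctor_tail_calls) {
  if (!ctor_tail_calls) {
    to_skip++;  // NativeCallStack::NativeCallStack is a real frame: skip it
  }
  assert(to_skip >= 0);
  std::vector<std::string> out;
  for (std::size_t i = static_cast<std::size_t>(to_skip); i < frames.size(); i++) {
    out.push_back(frames[i]);
  }
  return out;
}
```

In this model both build shapes yield the same visible trace, which is the consistency the new NMT test checks for.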
> > I verified that the above test now passes on all supported platforms, > and also did a full jprt "-testset hotspot" run. I plan on doing some > RBT testing with NMT detail enabled before committing. > > Regarding the community contributed ports that Oracle does not > support, I didn't make any changes there, but it looks like some of > these bugs do exist. Notably: > > -linux-aarch64: Looks like it suffers from JDK-8133740. The changes > done to the > os_linux_x86.cpp should also be applied here. > -linux-ppc: Hard to say for sure since the implementation of > os::current_frame is > different than others, but it looks to me like it suffers from both > JDK-8133749 > and JDK-8133740. > -aix-ppc: Looks to be the same implementation as linux-ppc, so would > need the > same changes. > > These ports may also be suffering from JDK-8133747, but that fix is in > shared code (nativeCallStack.cpp). My changes there will need some > tweaking for these ports if they don't use a tail call to call > os::get_native_stack(). > > If the maintainers of these ports could send me some NMT detail > output, I can advise better on what changes are needed. Then you can > implement and test them, and then send them back to me and I'll > include them with my changes. What I need is the following command run > on product and slowdebug builds. Initially run without any of my > changes applied.
If needed I may follow up with a request that they be > run with the changes applied: > > bin/java -XX:+UnlockDiagnosticVMOptions > -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version > > thanks, > > Chris > From dean.long at oracle.com Thu Aug 4 22:28:55 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 4 Aug 2016 15:28:55 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod Message-ID: https://bugs.openjdk.java.net/browse/JDK-8161598 http://cr.openjdk.java.net/~dlong/8161598/webrev/ Sorry, this issue is Confidential. The problem is similar to 8029441, where we suspend a thread and use pd_get_top_frame_for_profiling() to get the top frame for stack walking. The problem is "last Java frame" anchor frames on x86. In lots of places we do not store last_Java_pc. This is OK in the synchronous stack walk case done by the current thread. But in the asynchronous case, there are small windows where it's not always safe to get PC from sp[-1]. The solution is not to treat x86 anchor frames as "always walkable". Instead, we follow the example of sparc and make them walkable by filling in last_Java_pc when it's safe. I went for the minimal fix, resetting clear_pc to true in reset_last_Java_frame() but not changing the API and all the callers. I can fix this if reviewers feel strongly about it. dl From david.holmes at oracle.com Fri Aug 5 02:28:47 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Aug 2016 12:28:47 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> Message-ID: <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> Hi Volker, Thanks for looking at this. On 5/08/2016 1:48 AM, Volker Simonis wrote: > Hi David, > > thanks for doing this change on all platforms. > The fix looks good.
Maybe you can just extend the following comment with > something like: > > // Note that the SR_lock plays no role in this suspend/resume protocol. > // It is only used in SR_handler as a thread termination indicator if > NULL. Darn this code is confusing - too many "SR"'s :( I have added // Note that the SR_lock plays no role in this suspend/resume protocol, // but is checked for NULL in SR_handler as a thread termination indicator. Updated webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ This also reminded me to follow up on why the Solaris SR_handler is different and I found it is not actually installed as a direct signal handler, but is called from the real signal handler if dealing with a JavaThread or the VMThread. Consequently the Solaris version of the SR_handler can not encounter this specific bug and so I have reverted the changes to os_solaris.cpp Thanks, David > Regards, > Volker > > On Wed, Aug 3, 2016 at 3:13 AM, David Holmes > wrote: > > webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ > > > bug: https://bugs.openjdk.java.net/browse/JDK-8159461 > > > The suspend/resume signal (SR_signum) is never sent to a thread once > it has started to terminate. On one platform (SuSE 12) we have seen > what appears to be a "stuck" signal, which is only delivered when > the terminating thread restores its original signal mask (as if > pthread_sigmask makes the system realize there is a pending signal - > we already check the signal was not blocked). At this point in the > thread termination we have freed the osthread, so the the SR_handler > would access deallocated memory. In debug builds we first hit an > assertion that the current thread is a JavaThread or the VMThread - > that assertion fails, even though it is a JavaThread, because we > have already executed the ~JavaThread destructor and inside the > ~Thread destructor we are a plain Thread not a JavaThread. 
> > The fix was to make a small adjustment to the thread termination > process so that we delete the SR_lock before calling > os::free_thread(). In the SR_handler() we can then use a NULL check > of SR_lock() to indicate the thread has terminated and we return. > > While only seen on Linux I took the opportunity to apply the fix on > all platforms and also cleaned up the code where we were using > Thread::current() unsafely in a signal-handling context. > > Testing: regular tier 1 (JPRT) > Kitchensink (in progress) > > As we can't readily reproduce the problem I tested this by having a > terminating thread raise SR_signum directly from within the ~Thread > destructor. > > Thanks, > David > > From david.holmes at oracle.com Fri Aug 5 04:30:41 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Aug 2016 14:30:41 +1000 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: <57A34F6D.9070808@oracle.com> References: <57A34F6D.9070808@oracle.com> Message-ID: On 5/08/2016 12:21 AM, Gerald Thornbrugh wrote: > Hi Everyone, > > I would like to have the following change reviewed: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 > > Webrev: http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ > > > It seems that my putback of JDK-8144278 created a merge issue where a > previous > removed line from JDK-8132919 was placed back into JtregNative.gmk. This Thanks for fixing this but I blame hg all the way here! There is no sign of any mismerge - the resulting file simply does not match the changesets that were pushed. David ----- > fix > removes the line again. The presents of this line generates "No such > file or directory" > errors messages during builds because the "hotspot/test/compiler/native" > directory no > longer exists. > > Before this change the "No such file or directory" error messages were > seen in the > build logs and after the change the error messages where not placed into > the log. 
> > Please let me know if you have any questions or concerns. > > Thanks, > > Jerry From david.holmes at oracle.com Fri Aug 5 04:47:54 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Aug 2016 14:47:54 +1000 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: Hi Chris, On 5/08/2016 7:53 AM, Chris Plummer wrote: > Ping! I took another look at this and my earlier comments from JDK-8133749. I hate to see the functionality "fixed" yet still have a completely confusing and mis-named API. I'm still far from convinced that returning the callers caller wasn't an "error" that was done due to the lack of inlining and the appearance of an unexpected stackframe. You've now made things consistent - but os::current_frame() is completely mis-leading in name. And I'm still concerned that correctness here depends on C compiler inlining choices, with no way to verify at build time that they were indeed inlined or not! Don't we have ALWAYSINLINE to mark things like _get_previous_fp ? For that matter shouldn't _get_previous_fp be a macro so inlining plays no role ? Sorry but this code seems to simply limp from one broken state to another due to its fragility. 
Thanks, David ----- > On 8/2/16 1:31 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >> is making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >> includes NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 >> >> The above bugs all result in the NMT detail stack traces including >> extra frames in the stack traces. Certain frames are suppose to be >> skipped, but sometimes are not. The frames that show up are: >> >> NativeCallStack::NativeCallStack >> os::get_native_stack >> >> These are both methods used to generate the stack trace, and therefore >> should not be included it. However, under some (most) circumstances, >> they were. >> >> Also, there was no test to make sure that any NMT detail output is >> generated, or that it is correct. I've added one with this webrev. Of >> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >> 27 initially passed this new test. They were the product and fastdebug >> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >> builds for solaris-x64, windows-x86, and windows-x64. All the rest >> failed. They now all pass with my fixes in place. >> >> Here's a summary of the changes: >> >> src/os/posix/vm/os_posix.cpp >> src/os/windows/vm/os_windows.cpp >> >> JDK-8133747 fixes: There was some frame skipping logic here which was >> sort of correct, but was misplace. 
There are no extra frames being >> added in os::get_native_stack() due to lack of inlining or lack of a >> tail call, so no need for toSkip++ here. The logic has been moved to >> NativeCallStack::NativeCallStack, which is where the tail call is >> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >> >> src/share/vm/utilities/nativeCallStack.cpp >> >> JDK-8133747 fixes: The frame skipping logic that was moved here >> assumed that NativeCallStack::NativeCallStack would not appear in the >> call stack (due to a tail call be using to call os::get_native_stack) >> except in slow debug builds. However, some platforms also don't use a >> tail call even when optimized. From what I can tell that is the case >> for 32-bit platforms and for windows. >> >> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >> >> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >> skip one extra frame >> >> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >> >> JDK-8133749 fixes: os:current_frame() was not consistent with other >> platforms and needs to skip one more frame. This means it returns the >> frame for the caller's caller. So when called by >> os:get_native_stack(), it returns the frame for whoever called >> os::get_native_stack(). Although not intuitive, this is what >> os:get_native_stack() expects. Probably a method rename and/or a >> behavior change is justified here, but I would prefer to do that with >> a followup CR if anyone has a good suggestion on what to do. >> >> test/runtime/NMT/CheckForProperDetailStackTrace.java >> >> This is the new NTM detail test. It checks for frames that shouldn't >> be present and validates at least one stack trace is what is expected. >> >> I verified that the above test now passes on all supported platforms, >> and also did a full jprt "-testset hotpot" run. 
I plan on doing some >> RBT testing with NMT detail enabled before committing. >> >> Regarding the community contributed ports that Oracle does not >> support, I didn't make any changes there, but it looks like some of >> these bugs do exist. Notably: >> >> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >> done to the >> os_linux_x86.cp should also be applied here. >> -linux-ppc: Hard to say for sure since the implementation of >> os::current_frame is >> different than others, but it looks to me like it suffers from both >> JDK-8133749 >> and JDK-8133740. >> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >> need the >> same changes. >> >> These ports may also be suffering from JDK-8133747, but that fix is in >> shared code (nativeCallStack.cpp). My changes there will need some >> tweaking for these ports they don't use a tail call to call >> os::get_native_stack(). >> >> If the maintainers of these ports could send me some NMT detail >> output, I can advise better on what changes are needed. Then you can >> implement and test them, and then send them back to me and I'll >> include them with my changes. What I need is the following command run >> on product and slowdebug builds. Initially run without any of my >> changes applied. If needed I may followup with a request that they be >> run with the changes applied: >> >> bin/java -XX:+UnlockDiagnosticVMOptions >> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >> >> thanks, >> >> Chris >> > From chris.plummer at oracle.com Fri Aug 5 07:05:55 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 5 Aug 2016 00:05:55 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> Hi David, If fixing os::current_frame() to have a better name and also make it go up one less frame makes these changes more palatable, I'm willing to make that change. 
I would prefer to do it with a follow up CR (it would probably have to be an RFE), but will do it with these changes if necessary. I still pull hairs over the proper name for this method, even if it is modified to return the frame of whoever called it. Usually the meaning conveyed by a method's name does not change based on whether you choose the caller's or callee's point of view, but in this case it does, and I'm not sure which point of view makes more sense. If we choose the caller's point of view, then the proper name remains os::current_frame(). If we choose the callee's point of view, then it should be os::callers_frame(). Maybe there's a name that is agnostic and means the same thing from both viewpoints. I just haven't thought of one yet. With respect to ALWAYSINLINE, it does not work for solaris and windows slowdebug builds. Note the special case in the test I wrote to allow for AllocateHeap() in the stack trace in this case, even though it shouldn't be there because it uses ALWAYSINLINE. I could have made changes in the source to get rid of it from the stack trace, but I didn't feel the source code disruption was worth it for a slowdebug build, especially since there are only a few allocation call sites where it is a problem. I could use ALWAYSINLINE for the cases where it will work to inline _get_previous_fp, but I don't really see that as being any more reliable than what is there now. As for making _get_previous_fp() a macro, that's made more complicated because it has #ifdefs already. I could move its implementation directly into os::current_frame(). That would fix the inlining problem. I think it could also use some cleanup with the #ifdefs. For example, for linux-x86 do we have to worry about the SPARC_WORKS and __clang__ cases? And yes, even with my changes the code is no less fragile, and no less misdirected in its approach to getting a consistent allocation back trace.
As I see it, there are 3 options: (1) Do nothing, and leave it both broken and fragile. (2) Do the cleanup I've done to at least correct the known stack trace issues. (3) Find another solution that doesn't suffer from these fragility issues. Note that (3) does not preclude doing (2) first, and (2) seems a better alternative than leaving it in its broken state (1). That's why I have pursued these changes even though I know things will still be fragile. thanks, Chris On 8/4/16 9:47 PM, David Holmes wrote: > Hi Chris, > > On 5/08/2016 7:53 AM, Chris Plummer wrote: >> Ping! > > I took another look at this and my earlier comments from JDK-8133749. > I hate to see the functionality "fixed" yet still have a completely > confusing and mis-named API. I'm still far from convinced that > returning the callers caller wasn't an "error" that was done due to > the lack of inlining and the appearance of an unexpected stackframe. > You've now made things consistent - but os::current_frame() is > completely mis-leading in name. And I'm still concerned that > correctness here depends on C compiler inlining choices, with no way > to verify at build time that they were indeed inlined or not! Don't we > have ALWAYSINLINE to mark things like _get_previous_fp ? For that > matter shouldn't _get_previous_fp be a macro so inlining plays no role ? > > Sorry but this code seems to simply limp from one broken state to > another due to its fragility. 
> > Thanks, > David > ----- > >> On 8/2/16 1:31 PM, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> webrev: >>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>> >>> >>> >>> Bugs fixed: >>> >>> JDK-8133749: os::current_frame() is not returning the proper frame on >>> ARM and solaris-x64 >>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>> >>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >>> is making on tail calls being used >>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>> >>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>> includes NativeCallStack::NativeCallStack() frame in backtrace >>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>> >>> The above bugs all result in the NMT detail stack traces including >>> extra frames in the stack traces. Certain frames are suppose to be >>> skipped, but sometimes are not. The frames that show up are: >>> >>> NativeCallStack::NativeCallStack >>> os::get_native_stack >>> >>> These are both methods used to generate the stack trace, and therefore >>> should not be included it. However, under some (most) circumstances, >>> they were. >>> >>> Also, there was no test to make sure that any NMT detail output is >>> generated, or that it is correct. I've added one with this webrev. Of >>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >>> 27 initially passed this new test. They were the product and fastdebug >>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>> failed. They now all pass with my fixes in place. >>> >>> Here's a summary of the changes: >>> >>> src/os/posix/vm/os_posix.cpp >>> src/os/windows/vm/os_windows.cpp >>> >>> JDK-8133747 fixes: There was some frame skipping logic here which was >>> sort of correct, but was misplace. 
There are no extra frames being >>> added in os::get_native_stack() due to lack of inlining or lack of a >>> tail call, so no need for toSkip++ here. The logic has been moved to >>> NativeCallStack::NativeCallStack, which is where the tail call is >>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>> >>> src/share/vm/utilities/nativeCallStack.cpp >>> >>> JDK-8133747 fixes: The frame skipping logic that was moved here >>> assumed that NativeCallStack::NativeCallStack would not appear in the >>> call stack (due to a tail call be using to call os::get_native_stack) >>> except in slow debug builds. However, some platforms also don't use a >>> tail call even when optimized. From what I can tell that is the case >>> for 32-bit platforms and for windows. >>> >>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>> >>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>> skip one extra frame >>> >>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>> >>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>> platforms and needs to skip one more frame. This means it returns the >>> frame for the caller's caller. So when called by >>> os:get_native_stack(), it returns the frame for whoever called >>> os::get_native_stack(). Although not intuitive, this is what >>> os:get_native_stack() expects. Probably a method rename and/or a >>> behavior change is justified here, but I would prefer to do that with >>> a followup CR if anyone has a good suggestion on what to do. >>> >>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>> >>> This is the new NTM detail test. It checks for frames that shouldn't >>> be present and validates at least one stack trace is what is expected. >>> >>> I verified that the above test now passes on all supported platforms, >>> and also did a full jprt "-testset hotpot" run. 
I plan on doing some >>> RBT testing with NMT detail enabled before committing. >>> >>> Regarding the community contributed ports that Oracle does not >>> support, I didn't make any changes there, but it looks like some of >>> these bugs do exist. Notably: >>> >>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>> done to the >>> os_linux_x86.cp should also be applied here. >>> -linux-ppc: Hard to say for sure since the implementation of >>> os::current_frame is >>> different than others, but it looks to me like it suffers from both >>> JDK-8133749 >>> and JDK-8133740. >>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>> need the >>> same changes. >>> >>> These ports may also be suffering from JDK-8133747, but that fix is in >>> shared code (nativeCallStack.cpp). My changes there will need some >>> tweaking for these ports they don't use a tail call to call >>> os::get_native_stack(). >>> >>> If the maintainers of these ports could send me some NMT detail >>> output, I can advise better on what changes are needed. Then you can >>> implement and test them, and then send them back to me and I'll >>> include them with my changes. What I need is the following command run >>> on product and slowdebug builds. Initially run without any of my >>> changes applied. 
If needed I may followup with a request that they be >>> run with the changes applied: >>> >>> bin/java -XX:+UnlockDiagnosticVMOptions >>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>> >>> thanks, >>> >>> Chris >>> >> From dmitry.samersoff at oracle.com Fri Aug 5 09:25:09 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 5 Aug 2016 12:25:09 +0300 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file Message-ID: Everybody, Please review the fix: http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ Problem: Tests fail intermittently because they can't attach to the child process. These attach failures are hard to debug because the attach framework doesn't provide enough diagnostic information. Solution: a) Increase the attach timeout. b) Slightly change the attach loop to save a bit of CPU power. c) Add some logging to the attach listener. It's just a first step in this direction. A complete cleanup of the attach code (removing LinuxThreads support and converting all printing to UL) is not a goal of this fix - I'll file a separate CR for it. -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From zgu at redhat.com Fri Aug 5 12:47:06 2016 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 5 Aug 2016 08:47:06 -0400 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <04de0d46-ebc0-5a66-31b3-2bf6b91c3524@redhat.com> Hi Chris, The changes look good in general. However, I am not sure that, by skipping the two stack capturing frames (NativeCallStack::NativeCallStack and os::get_native_stack), you are not skipping actual callers in some places. If I recall correctly, I saw inconsistent inlining behaviors in the same binaries. For example, you don't see these two frames in all NMT call stacks.
I also wonder if the changes can survive compiler upgrades. Thanks, -Zhengyu On 08/04/2016 05:53 PM, Chris Plummer wrote: > Ping! > > On 8/2/16 1:31 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >> is making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >> includes NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 >> >> The above bugs all result in the NMT detail stack traces including >> extra frames in the stack traces. Certain frames are suppose to be >> skipped, but sometimes are not. The frames that show up are: >> >> NativeCallStack::NativeCallStack >> os::get_native_stack >> >> These are both methods used to generate the stack trace, and >> therefore should not be included it. However, under some (most) >> circumstances, they were. >> >> Also, there was no test to make sure that any NMT detail output is >> generated, or that it is correct. I've added one with this webrev. Of >> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >> 27 initially passed this new test. They were the product and >> fastdebug builds for solaris-sparc, bsd-x64, and linux-x64; and the >> slowdebug builds for solaris-x64, windows-x86, and windows-x64. All >> the rest failed. They now all pass with my fixes in place. >> >> Here's a summary of the changes: >> >> src/os/posix/vm/os_posix.cpp >> src/os/windows/vm/os_windows.cpp >> >> JDK-8133747 fixes: There was some frame skipping logic here which was >> sort of correct, but was misplace. 
There are no extra frames being >> added in os::get_native_stack() due to lack of inlining or lack of a >> tail call, so no need for toSkip++ here. The logic has been moved to >> NativeCallStack::NativeCallStack, which is where the tail call is >> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >> >> src/share/vm/utilities/nativeCallStack.cpp >> >> JDK-8133747 fixes: The frame skipping logic that was moved here >> assumed that NativeCallStack::NativeCallStack would not appear in the >> call stack (due to a tail call be using to call os::get_native_stack) >> except in slow debug builds. However, some platforms also don't use a >> tail call even when optimized. From what I can tell that is the case >> for 32-bit platforms and for windows. >> >> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >> >> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >> skip one extra frame >> >> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >> >> JDK-8133749 fixes: os:current_frame() was not consistent with other >> platforms and needs to skip one more frame. This means it returns the >> frame for the caller's caller. So when called by >> os:get_native_stack(), it returns the frame for whoever called >> os::get_native_stack(). Although not intuitive, this is what >> os:get_native_stack() expects. Probably a method rename and/or a >> behavior change is justified here, but I would prefer to do that with >> a followup CR if anyone has a good suggestion on what to do. >> >> test/runtime/NMT/CheckForProperDetailStackTrace.java >> >> This is the new NTM detail test. It checks for frames that shouldn't >> be present and validates at least one stack trace is what is expected. >> >> I verified that the above test now passes on all supported platforms, >> and also did a full jprt "-testset hotpot" run. 
I plan on doing some >> RBT testing with NMT detail enabled before committing. >> >> Regarding the community contributed ports that Oracle does not >> support, I didn't make any changes there, but it looks like some of >> these bugs do exist. Notably: >> >> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >> done to the >> os_linux_x86.cp should also be applied here. >> -linux-ppc: Hard to say for sure since the implementation of >> os::current_frame is >> different than others, but it looks to me like it suffers from both >> JDK-8133749 >> and JDK-8133740. >> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >> need the >> same changes. >> >> These ports may also be suffering from JDK-8133747, but that fix is >> in shared code (nativeCallStack.cpp). My changes there will need some >> tweaking for these ports they don't use a tail call to call >> os::get_native_stack(). >> >> If the maintainers of these ports could send me some NMT detail >> output, I can advise better on what changes are needed. Then you can >> implement and test them, and then send them back to me and I'll >> include them with my changes. What I need is the following command >> run on product and slowdebug builds. Initially run without any of my >> changes applied. If needed I may followup with a request that they be >> run with the changes applied: >> >> bin/java -XX:+UnlockDiagnosticVMOptions >> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >> >> thanks, >> >> Chris >> > From chris.plummer at oracle.com Fri Aug 5 18:10:47 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 5 Aug 2016 11:10:47 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <04de0d46-ebc0-5a66-31b3-2bf6b91c3524@redhat.com> References: <04de0d46-ebc0-5a66-31b3-2bf6b91c3524@redhat.com> Message-ID: Hi Zhengyu, I don't think skipping too many frames is a risk here. 
I only skip NativeCallStack::NativeCallStack from within NativeCallStack::NativeCallStack, and only do so for platforms where I have verified that it does not use a tail call to call os::get_native_stack. os::get_native_stack() needs to always skip itself, as does _get_previous_fp() when it is not being inlined. I did do a lot of manual verification of NMT detail output, comparing the original output with the new output. I did this on all platforms and for all 3 build flavors. The results looked to be correct. There is a risk of needing to do further adjustments when there are compiler upgrades. The test I wrote should capture this need. In any case, I believe "fixed but fragile" is better than "broken and fragile". thanks, Chris On 8/5/16 5:47 AM, Zhengyu Gu wrote: > Hi Chris, > > The changes look good in general. > > However, I am not sure that, by skipping the two stack capturing > frames (NativeCallStack::NativeCallStack and os::get_native_stack), > you are not skipping actual callers in some places. If I recall > correctly, I saw inconsistent inlining behaviors in the same binaries. > For example, you don't see these two frames in all NMT call stacks. I > also wonder if the changes can survive compiler upgrades. > > Thanks, > > -Zhengyu > > > On 08/04/2016 05:53 PM, Chris Plummer wrote: >> Ping! 
>> >> On 8/2/16 1:31 PM, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> webrev: >>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>> >>> Bugs fixed: >>> >>> JDK-8133749: os::current_frame() is not returning the proper frame >>> on ARM and solaris-x64 >>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>> >>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >>> is making on tail calls being used >>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>> >>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>> includes NativeCallStack::NativeCallStack() frame in backtrace >>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>> >>> The above bugs all result in the NMT detail stack traces including >>> extra frames in the stack traces. Certain frames are suppose to be >>> skipped, but sometimes are not. The frames that show up are: >>> >>> NativeCallStack::NativeCallStack >>> os::get_native_stack >>> >>> These are both methods used to generate the stack trace, and >>> therefore should not be included it. However, under some (most) >>> circumstances, they were. >>> >>> Also, there was no test to make sure that any NMT detail output is >>> generated, or that it is correct. I've added one with this webrev. >>> Of the 27 possible builds (9 platforms * 3 build flavors), only 9 of >>> the 27 initially passed this new test. They were the product and >>> fastdebug builds for solaris-sparc, bsd-x64, and linux-x64; and the >>> slowdebug builds for solaris-x64, windows-x86, and windows-x64. All >>> the rest failed. They now all pass with my fixes in place. >>> >>> Here's a summary of the changes: >>> >>> src/os/posix/vm/os_posix.cpp >>> src/os/windows/vm/os_windows.cpp >>> >>> JDK-8133747 fixes: There was some frame skipping logic here which >>> was sort of correct, but was misplace. 
There are no extra frames >>> being added in os::get_native_stack() due to lack of inlining or >>> lack of a tail call, so no need for toSkip++ here. The logic has >>> been moved to NativeCallStack::NativeCallStack, which is where the >>> tail call is (sometimes) made, and also corrected (see >>> nativeCallStack.cpp below). >>> >>> src/share/vm/utilities/nativeCallStack.cpp >>> >>> JDK-8133747 fixes: The frame skipping logic that was moved here >>> assumed that NativeCallStack::NativeCallStack would not appear in >>> the call stack (due to a tail call be using to call >>> os::get_native_stack) except in slow debug builds. However, some >>> platforms also don't use a tail call even when optimized. From what >>> I can tell that is the case for 32-bit platforms and for windows. >>> >>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>> >>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>> skip one extra frame >>> >>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>> >>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>> platforms and needs to skip one more frame. This means it returns >>> the frame for the caller's caller. So when called by >>> os:get_native_stack(), it returns the frame for whoever called >>> os::get_native_stack(). Although not intuitive, this is what >>> os:get_native_stack() expects. Probably a method rename and/or a >>> behavior change is justified here, but I would prefer to do that >>> with a followup CR if anyone has a good suggestion on what to do. >>> >>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>> >>> This is the new NTM detail test. It checks for frames that shouldn't >>> be present and validates at least one stack trace is what is expected. >>> >>> I verified that the above test now passes on all supported >>> platforms, and also did a full jprt "-testset hotpot" run. 
I plan on >>> doing some RBT testing with NMT detail enabled before committing. >>> >>> Regarding the community contributed ports that Oracle does not >>> support, I didn't make any changes there, but it looks like some of >>> these bugs do exist. Notably: >>> >>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>> done to the >>> os_linux_x86.cp should also be applied here. >>> -linux-ppc: Hard to say for sure since the implementation of >>> os::current_frame is >>> different than others, but it looks to me like it suffers from both >>> JDK-8133749 >>> and JDK-8133740. >>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>> need the >>> same changes. >>> >>> These ports may also be suffering from JDK-8133747, but that fix is >>> in shared code (nativeCallStack.cpp). My changes there will need >>> some tweaking for these ports they don't use a tail call to call >>> os::get_native_stack(). >>> >>> If the maintainers of these ports could send me some NMT detail >>> output, I can advise better on what changes are needed. Then you can >>> implement and test them, and then send them back to me and I'll >>> include them with my changes. What I need is the following command >>> run on product and slowdebug builds. Initially run without any of my >>> changes applied. 
If needed I may followup with a request that they >>> be run with the changes applied: >>> >>> bin/java -XX:+UnlockDiagnosticVMOptions >>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>> >>> thanks, >>> >>> Chris >>> >> > From claes.redestad at oracle.com Fri Aug 5 19:36:49 2016 From: claes.redestad at oracle.com (Claes Redestad) Date: Fri, 5 Aug 2016 12:36:49 -0700 Subject: RFR: 8161588: MemberName::resolveOrNull cause and hide NoSuchMethodErrors In-Reply-To: <578E7CAE.6070705@oracle.com> References: <578E7CAE.6070705@oracle.com> Message-ID: <57A4EAD1.1020208@oracle.com> Withdrawing this as clearing exceptions actually doesn't seem to get rid of overhead(!), making this change moot. The overhead for my use case is small but annoying. Sorry for the noise. /Claes On 07/19/2016 12:17 PM, Claes Redestad wrote: > Hi, > > please review this bug fix to ensure MemberName::resolveOrNull doesn't > throw exceptions when speculatively looking up members that aren't > there. > > HS: http://cr.openjdk.java.net/~redestad/8161588/hs.01 > JDK: http://cr.openjdk.java.net/~redestad/8161588/jdk.01 > > This avoids throwing NoSuchMethodError etc just to be ignored, avoiding > a performance penalty when looking things up speculatively (which is key > to possible upcoming work to generate more JLI code with jlink). > > There's a pre-existing issue not dealt with by this fix in that the > exceptions thrown in MHN_resolve_Mem are never observed, instead the > exceptions thrown from various LinkResolver methods are observed. We > could clear all pending exceptions in resolve_MemberName, but this > breaks tests that are very particular about which exception is thrown > when and where, thus I opted to add the clear_pending boolean to > allow clearing the exception conditionally instead, keeping behavior > identical for MemberName::resolveOrFail > > Thanks! 
> > /Claes From coleen.phillimore at oracle.com Fri Aug 5 20:47:53 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 5 Aug 2016 16:47:53 -0400 Subject: RFR: 8161588: MemberName::resolveOrNull cause and hide NoSuchMethodErrors In-Reply-To: <57A4EAD1.1020208@oracle.com> References: <578E7CAE.6070705@oracle.com> <57A4EAD1.1020208@oracle.com> Message-ID: <058fff21-8877-bf31-7db4-b33893346eb1@oracle.com> Claes, I did look at this code and it looked a bit strange, but the existing code looks strange too. I'm not sure why it breaks the normal rules of TRAPS/CHECKS. Coleen On 8/5/16 3:36 PM, Claes Redestad wrote: > Withdrawing this as clearing exceptions actually doesn't seem to get > rid of > overhead(!), making this change moot. The overhead for my use case > is small but annoying. > > Sorry for the noise. > > /Claes > > On 07/19/2016 12:17 PM, Claes Redestad wrote: >> Hi, >> >> please review this bug fix to ensure MemberName::resolveOrNull doesn't >> throw exceptions when speculatively looking up members that aren't >> there. >> >> HS: http://cr.openjdk.java.net/~redestad/8161588/hs.01 >> JDK: http://cr.openjdk.java.net/~redestad/8161588/jdk.01 >> >> This avoids throwing NoSuchMethodError etc just to be ignored, avoiding >> a performance penalty when looking things up speculatively (which is key >> to possible upcoming work to generate more JLI code with jlink). >> >> There's a pre-existing issue not dealt with by this fix in that the >> exceptions thrown in MHN_resolve_Mem are never observed, instead the >> exceptions thrown from various LinkResolver methods are observed. We >> could clear all pending exceptions in resolve_MemberName, but this >> breaks tests that are very particular about which exception is thrown >> when and where, thus I opted to add the clear_pending boolean to >> allow clearing the exception conditionally instead, keeping behavior >> identical for MemberName::resolveOrFail >> >> Thanks! 
>> >> /Claes > From claes.redestad at oracle.com Fri Aug 5 21:54:09 2016 From: claes.redestad at oracle.com (Claes Redestad) Date: Fri, 05 Aug 2016 14:54:09 -0700 Subject: RFR: 8161588: MemberName::resolveOrNull cause and hide NoSuchMethodErrors In-Reply-To: <058fff21-8877-bf31-7db4-b33893346eb1@oracle.com> References: <578E7CAE.6070705@oracle.com> <57A4EAD1.1020208@oracle.com> <058fff21-8877-bf31-7db4-b33893346eb1@oracle.com> Message-ID: Right, there are some oddities in this code which I haven't fully grasped, such as throwing new errors when there's already some pending... Still, having a way to speculatively resolve members without the exception overhead is on my wish list. I misread the current macros and mechanics, thinking that clearing the pending exception would avoid the upcalls (intuitively/naively in my mind the exception creation was deferred to the Java/native boundary...). /Claes Coleen Phillimore wrote: (5 August 2016 13:47:53 GMT-07:00) > >Claes, I did look at this code and it looked a bit strange, but the >existing code looks strange too. I'm not sure why it breaks the normal > >rules of TRAPS/CHECKS. >Coleen > > >On 8/5/16 3:36 PM, Claes Redestad wrote: >> Withdrawing this as clearing exceptions actually doesn't seem to get >> rid of >> overhead(!), making this change moot. The overhead for my use case >> is small but annoying. >> >> Sorry for the noise. >> >> /Claes >> >> On 07/19/2016 12:17 PM, Claes Redestad wrote: >>> Hi, >>> >>> please review this bug fix to ensure MemberName::resolveOrNull >doesn't >>> throw exceptions when speculatively looking up members that aren't >>> there. >>> >>> HS: http://cr.openjdk.java.net/~redestad/8161588/hs.01 >>> JDK: http://cr.openjdk.java.net/~redestad/8161588/jdk.01 >>> >>> This avoids throwing NoSuchMethodError etc just to be ignored, >avoiding >>> a performance penalty when looking things up speculatively (which is >key >>> to possible upcoming work to generate more JLI code with jlink).
>>> >>> There's a pre-existing issue not dealt with by this fix in that the >>> exceptions thrown in MHN_resolve_Mem are never observed, instead the >>> exceptions thrown from various LinkResolver methods are observed. We >>> could clear all pending exceptions in resolve_MemberName, but this >>> breaks tests that are very particular about which exception is >thrown >>> when and where, thus I opted to add the clear_pending boolean to >>> allow clearing the exception conditionally instead, keeping behavior >>> identical for MemberName::resolveOrFail >>> >>> Thanks! >>> >>> /Claes >> -- Sent from my Android device with K-9 Mail. Please excuse my brevity. From david.holmes at oracle.com Sun Aug 7 23:26:20 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Aug 2016 09:26:20 +1000 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> Message-ID: <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> Hi Chris, I don't have any good suggestions for this. So go with (2) and let's work on (3). Thanks, David On 5/08/2016 5:05 PM, Chris Plummer wrote: > Hi David, > > If fixing os::current_frame() to have a better name and also make it go > up one less frame makes these changes more palatable, I'm willing to > make that change. I would prefer to do it with a follow up CR (it would > probably have to be an RFE), but will do it with these changes if > necessary. I still pull hairs over the proper name for this method, even > if it is modified to return the frame of whoever called it. Usually the > meaning conveyed by a method's name does not change based on whether you > choose the caller's or callee's point of view, but in this case it does, > and I'm not sure which point of view makes more sense. If we choose the > caller's point of view, then the proper name remains > os::current_frame().
If we choose the callee's point of view, then it > should be os::callers_frame(). Maybe there's a name that is agnostic and > means the same thing from both view points. I just haven't thought of > one yet. > > With respect to ALWAYSINLINE, it does not work for solaris and windows > slowdebug builds. Note the special case in the test I wrote to allow for > AllocateHeap() in the stack trace in this case, even though it shouldn't > be there because it uses ALWAYSINLINE. I could have made changes in the > source to get rid of it from the stack trace, but I didn't feel the > source code disruption was worth it for a slowdebug build, especially > since there are only a allocation call sites where it is a problem. I > could use ALWAYSINLINE for the cases where it will work to inline > _get_previous_fp, but I don't really see that as being any more reliable > than what is there now. > > As for making _get_previous_fp() a macro, that's made more complicated > because it has #ifdefs already. I could move its implementation directly > into os::current_frame(). That would fix the inlining problem. I think > it could also use some cleanup with the #ifdefs. For example, for > linux-x86 do we have to worry about the SPARC_WORKS and __clang__ cases? > > And yes, even with my changes the code is no less fragile, and no less > misdirected in its approach to getting a consistent allocation back > trace. As I see it, there are 3 options: > > (1) Do nothing, and leave it both broken and fragile. > (2) Do the cleanup I've done to at least correct the known stack trace > issues. > (3) Find another solution that doesn't suffer from these fragility issues. > > Note that (3) does not preclude doing (2) first, and (2) seems a better > alternative than leaving it in its broken state (1). That's why I have > pursued these changes even though I know things will still be fragile. 
> > thanks, > > Chris > > On 8/4/16 9:47 PM, David Holmes wrote: >> Hi Chris, >> >> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>> Ping! >> >> I took another look at this and my earlier comments from JDK-8133749. >> I hate to see the functionality "fixed" yet still have a completely >> confusing and mis-named API. I'm still far from convinced that >> returning the callers caller wasn't an "error" that was done due to >> the lack of inlining and the appearance of an unexpected stackframe. >> You've now made things consistent - but os::current_frame() is >> completely mis-leading in name. And I'm still concerned that >> correctness here depends on C compiler inlining choices, with no way >> to verify at build time that they were indeed inlined or not! Don't we >> have ALWAYSINLINE to mark things like _get_previous_fp ? For that >> matter shouldn't _get_previous_fp be a macro so inlining plays no role ? >> >> Sorry but this code seems to simply limp from one broken state to >> another due to its fragility. >> >> Thanks, >> David >> ----- >> >>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following: >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>> >>>> >>>> >>>> Bugs fixed: >>>> >>>> JDK-8133749: os::current_frame() is not returning the proper frame on >>>> ARM and solaris-x64 >>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>> >>>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >>>> is making on tail calls being used >>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>> >>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>> >>>> The above bugs all result in the NMT detail stack traces including >>>> extra frames in the stack traces. 
Certain frames are supposed to be >>>> skipped, but sometimes are not. The frames that show up are: >>>> >>>> NativeCallStack::NativeCallStack >>>> os::get_native_stack >>>> >>>> These are both methods used to generate the stack trace, and therefore >>>> should not be included in it. However, under some (most) circumstances, >>>> they were. >>>> >>>> Also, there was no test to make sure that any NMT detail output is >>>> generated, or that it is correct. I've added one with this webrev. Of >>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >>>> 27 initially passed this new test. They were the product and fastdebug >>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>>> failed. They now all pass with my fixes in place. >>>> >>>> Here's a summary of the changes: >>>> >>>> src/os/posix/vm/os_posix.cpp >>>> src/os/windows/vm/os_windows.cpp >>>> >>>> JDK-8133747 fixes: There was some frame skipping logic here which was >>>> sort of correct, but was misplaced. There are no extra frames being >>>> added in os::get_native_stack() due to lack of inlining or lack of a >>>> tail call, so no need for toSkip++ here. The logic has been moved to >>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>>> >>>> src/share/vm/utilities/nativeCallStack.cpp >>>> >>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>> assumed that NativeCallStack::NativeCallStack would not appear in the >>>> call stack (due to a tail call being used to call os::get_native_stack) >>>> except in slow debug builds. However, some platforms also don't use a >>>> tail call even when optimized. From what I can tell that is the case >>>> for 32-bit platforms and for windows.
>>>> >>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>> >>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>>> skip one extra frame. >>>> >>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>> >>>> JDK-8133749 fixes: os::current_frame() was not consistent with other >>>> platforms and needs to skip one more frame. This means it returns the >>>> frame for the caller's caller. So when called by >>>> os::get_native_stack(), it returns the frame for whoever called >>>> os::get_native_stack(). Although not intuitive, this is what >>>> os::get_native_stack() expects. Probably a method rename and/or a >>>> behavior change is justified here, but I would prefer to do that with >>>> a followup CR if anyone has a good suggestion on what to do. >>>> >>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>> >>>> This is the new NMT detail test. It checks for frames that shouldn't >>>> be present and validates at least one stack trace is what is expected. >>>> >>>> I verified that the above test now passes on all supported platforms, >>>> and also did a full jprt "-testset hotspot" run. I plan on doing some >>>> RBT testing with NMT detail enabled before committing. >>>> >>>> Regarding the community contributed ports that Oracle does not >>>> support, I didn't make any changes there, but it looks like some of >>>> these bugs do exist. Notably: >>>> >>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>>> done to the >>>> os_linux_x86.cpp should also be applied here. >>>> -linux-ppc: Hard to say for sure since the implementation of >>>> os::current_frame is >>>> different than others, but it looks to me like it suffers from both >>>> JDK-8133749 >>>> and JDK-8133740. >>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>>> need the >>>> same changes.
>>>> >>>> These ports may also be suffering from JDK-8133747, but that fix is in >>>> shared code (nativeCallStack.cpp). My changes there will need some >>>> tweaking for these ports they don't use a tail call to call >>>> os::get_native_stack(). >>>> >>>> If the maintainers of these ports could send me some NMT detail >>>> output, I can advise better on what changes are needed. Then you can >>>> implement and test them, and then send them back to me and I'll >>>> include them with my changes. What I need is the following command run >>>> on product and slowdebug builds. Initially run without any of my >>>> changes applied. If needed I may followup with a request that they be >>>> run with the changes applied: >>>> >>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>> >>>> thanks, >>>> >>>> Chris >>>> >>> > From david.holmes at oracle.com Sun Aug 7 23:40:46 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Aug 2016 09:40:46 +1000 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: References: Message-ID: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> Hi Dmitry, On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: > Everybody, > > Please review the fix: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ > > Problem: > Tests fail intermittently because it can't attach to child process, > these attach failures is hard to debug because attach framework > doesn't provide enough diagnostic information. > > Solution: > > a) Increase attach timeout > b) Slightly change attach loop to save a bit of CPU power. > c) Add some logging to attach listener. > > It's just a first step in this direction. Complete cleanup of attach > code (remove LinuxThreads support and convert all printing to UL) is not > a goal of this fix - I'll file a separate CR for it. 
I still think you need more logging now to aid in debugging these cases. In particular we want to be able to verify that the path of the attach file is what we expect in all cases ie whether we find the .attach_pid file in cwd or whether we are looking in temp directory, and whether we ultimately succeed or fail. Plus whatever you do now should be done consistently for all platforms. Thanks, David > -Dmitry > From volker.simonis at gmail.com Mon Aug 8 13:35:41 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 8 Aug 2016 15:35:41 +0200 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> Message-ID: Hi David, looks good now. Thanks, Volker On Fri, Aug 5, 2016 at 4:28 AM, David Holmes wrote: > Hi Volker, > > Thanks for looking at this. > > On 5/08/2016 1:48 AM, Volker Simonis wrote: >> >> Hi David, >> >> thanks for doing this change on all platforms. >> The fix looks good. Maybe you can just extend the following comment with >> something like: >> >> // Note that the SR_lock plays no role in this suspend/resume protocol. >> // It is only used in SR_handler as a thread termination indicator if >> NULL. > > > Darn this code is confusing - too many "SR"'s :( I have added > > // Note that the SR_lock plays no role in this suspend/resume protocol, > // but is checked for NULL in SR_handler as a thread termination indicator. > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ > > This also reminded me to follow up on why the Solaris SR_handler is > different and I found it is not actually installed as a direct signal > handler, but is called from the real signal handler if dealing with a > JavaThread or the VMThread. 
Consequently the Solaris version of the > SR_handler can not encounter this specific bug and so I have reverted the > changes to os_solaris.cpp > > Thanks, > David > > >> Regards, >> Volker >> >> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes > > wrote: >> >> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >> >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >> >> >> The suspend/resume signal (SR_signum) is never sent to a thread once >> it has started to terminate. On one platform (SuSE 12) we have seen >> what appears to be a "stuck" signal, which is only delivered when >> the terminating thread restores its original signal mask (as if >> pthread_sigmask makes the system realize there is a pending signal - >> we already check the signal was not blocked). At this point in the >> thread termination we have freed the osthread, so the the SR_handler >> would access deallocated memory. In debug builds we first hit an >> assertion that the current thread is a JavaThread or the VMThread - >> that assertion fails, even though it is a JavaThread, because we >> have already executed the ~JavaThread destructor and inside the >> ~Thread destructor we are a plain Thread not a JavaThread. >> >> The fix was to make a small adjustment to the thread termination >> process so that we delete the SR_lock before calling >> os::free_thread(). In the SR_handler() we can then use a NULL check >> of SR_lock() to indicate the thread has terminated and we return. >> >> While only seen on Linux I took the opportunity to apply the fix on >> all platforms and also cleaned up the code where we were using >> Thread::current() unsafely in a signal-handling context. >> >> Testing: regular tier 1 (JPRT) >> Kitchensink (in progress) >> >> As we can't readily reproduce the problem I tested this by having a >> terminating thread raise SR_signum directly from within the ~Thread >> destructor. 
>> >> Thanks, >> David >> >> > From frederic.parain at oracle.com Mon Aug 8 14:55:32 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Mon, 8 Aug 2016 10:55:32 -0400 Subject: RFR(S): JDK-8146697 : VM crashes in test Test7005594 Message-ID: Greetings, Please review this small fix for JDK-8146697 https://bugs.openjdk.java.net/browse/JDK-8146697 Summary: The JVM sometimes tries to re-enable the Reserved Stack Area while it is currently not disabled, leading to the following assertion failure: share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state != stack_guard_enabled) failed: already enabled This problem occurred while running different tests, including tests where stack overflows are unlikely. It is rare and very hard to reproduce. At the beginning of the investigation, I was able to reproduce it three times out of 1,000+ runs of a metaspace stress test (the fact that it is a metaspace test doesn't matter). But once I instrumented the JVM, the bug didn't show up again, even after 30,000+ runs. So, I've investigated it with the limited material I had. The failures always occurred on x86/32-bit platforms. Given that some failures occurred on tests where stack overflows are unlikely (no recursive calls, small call stack), and that all failures occurred in interpreted Java code, my guess is that the issue is in the test performed on interpreted method exit to determine if the Reserved Stack Area should be enabled or not. The test on method exit compares the SP of the caller frame to an activation SP address stored in the JavaThread object when the Reserved Stack Area has been disabled. Without a reproducible test case, I've not been able to find what the issue was between the two values (de-opt, OSR, other?). So, I've slightly changed the test to make it more robust against the situation causing the assertion failure. Now the test first checks the status of the guard pages, and if no guard pages have been disabled, the method exits normally.
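The reworked exit check described above can be sketched as a two-stage test. This is only an illustration under stated assumptions, not the code in the webrev: ExitingThread, GuardState and should_reenable_reserved_area are hypothetical names, and the direction of the SP comparison assumes a stack that grows toward lower addresses.

```cpp
#include <cstdint>

// Hypothetical stand-in for the thread's guard-page state.
enum class GuardState { Enabled, ReservedDisabled };

// Hypothetical stand-in for the fields of JavaThread that the check reads.
struct ExitingThread {
    GuardState guard_state = GuardState::Enabled;
    // SP recorded when the Reserved Stack Area was disabled (assumed field).
    std::uintptr_t reserved_stack_activation = 0;
};

// Returns true when the Reserved Stack Area should be re-enabled on exit
// from an interpreted method.
bool should_reenable_reserved_area(const ExitingThread& t,
                                   std::uintptr_t caller_sp) {
    // Stage 1: if no guard pages have been disabled there is nothing to
    // re-enable, so the method exits normally. This is the only test paid
    // for in the common case.
    if (t.guard_state == GuardState::Enabled) {
        return false;
    }
    // Stage 2: the previous test, caller SP vs. activation SP. With a
    // downward-growing stack (assumption), a caller SP above the recorded
    // activation means the activating frame is being left.
    return caller_sp > t.reserved_stack_activation;
}
```

With the first stage in place, a stale or mismatched activation SP can no longer trigger a re-enable while the guard pages are still enabled, which is the situation the assertion was catching.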
This means there's always only one test on interpreted method exit if Reserved Stack Area has not been used, so no difference on performances for most cases. If this first test detects that guard pages have been disabled, then the previous test (caller SP vs activation SP) is performed, to determine if this is the place where the Reserved Stack Area should be re-enabled or not. Even if the root cause of the bug is still unknown, the fix should make the code more robust and prevent unnecessary re-enabling of the Reserved Stack Area. Webrev: http://cr.openjdk.java.net/~fparain/8146697/webrev.00/ Thank you, Fred From daniel.daugherty at oracle.com Mon Aug 8 16:07:46 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 8 Aug 2016 10:07:46 -0600 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> Message-ID: On 8/4/16 8:28 PM, David Holmes wrote: > Hi Volker, > > Thanks for looking at this. > > On 5/08/2016 1:48 AM, Volker Simonis wrote: >> Hi David, >> >> thanks for doing this change on all platforms. >> The fix looks good. Maybe you can just extend the following comment with >> something like: >> >> // Note that the SR_lock plays no role in this suspend/resume >> protocol. >> // It is only used in SR_handler as a thread termination indicator if >> NULL. > > Darn this code is confusing - too many "SR"'s :( I have added > > // Note that the SR_lock plays no role in this suspend/resume protocol, > // but is checked for NULL in SR_handler as a thread termination > indicator. > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ src/share/vm/runtime/thread.cpp L380: _SR_lock = NULL; I was expecting the _SR_lock to be freed and NULL'ed earlier based on the discussion in the bug report. 
Since the crashing assert() happens in a race between the JavaThread destructor the NULL'ing of the _SR_lock field, I was expecting the _SR_lock field to be dealt with as early as possible in the Thread destructor (or even earlier; see my last comment). src/os/linux/vm/os_linux.cpp L4010: // mask is changed as part of thread termination. Check the current thread grammar?: "Check the current" -> "Check that the current" L4015: if (thread->SR_lock() == NULL) L4016: return; style nit: multi-line if-statements require '{' and '}' Please add the braces or make this a single line if-statement. I would prefer the braces. :-) Isn't there still a window between the completion of the JavaThread destructor and where the Thread destructor sets _SR_lock = NULL? L4020: OSThread* osthread = thread->osthread(); Not your bug. This code assumes that osthread != NULL. Maybe it needs to be more robust. src/os/aix/vm/os_aix.cpp L2731: if (thread->SR_lock() == NULL) L2732: return; Same style nit. Same race. L2736: OSThread* osthread = thread->osthread(); Same robustness comment. src/os/bsd/vm/os_bsd.cpp L2759: if (thread->SR_lock() == NULL) L2760: return; Same style nit. Same race. L2764: OSThread* osthread = thread->osthread(); Same robustness comment. It has been a very long time since I've dealt with races in the suspend/resume code so I'm probably very rusty with this code. If the _SR_lock is only used by the JavaThread suspend/resume protocol, then we could consider free'ing and NULL'ing the field in the JavaThread destructor (as the last piece of work). That should eliminate the race that was being observed by the SR_handler() in this bug. It will open a very small race where is_Java_thread() can return true, the _SR_lock field is !NULL, but the _SR_lock has been deleted. 
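A minimal sketch of the handler-side guard discussed above, with the braces asked for in the style nit and the extra osthread check from the robustness comment. The types and the function name are hypothetical stand-ins, not HotSpot's declarations; the real SR_handler is installed as a signal handler and has a different signature.

```cpp
// Hypothetical stand-ins for HotSpot's Monitor, OSThread and Thread.
struct MonitorSketch {};

struct OSThreadSketch {
    int handled_signals = 0;  // counts signals the sketch handler acted on
};

struct SuspendableThread {
    MonitorSketch* sr_lock = nullptr;    // NULL once the thread is terminating
    OSThreadSketch* osthread = nullptr;  // may already have been freed and cleared
};

// Returns true if the signal was processed, false if it was ignored because
// the thread looks terminated.
bool sr_handler_sketch(SuspendableThread* thread) {
    // A NULL SR_lock is treated as the "thread has terminated" indicator.
    if (thread == nullptr || thread->sr_lock == nullptr) {
        return false;
    }
    // Robustness check: do not assume osthread is non-NULL.
    OSThreadSketch* osthread = thread->osthread;
    if (osthread == nullptr) {
        return false;
    }
    osthread->handled_signals++;
    return true;
}
```

The sketch only shows the checks themselves. It does not close the window between the end of the JavaThread destructor and the point where the field is set to NULL; that depends on where the lock is freed.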
Dan > > This also reminded me to follow up on why the Solaris SR_handler is > different and I found it is not actually installed as a direct signal > handler, but is called from the real signal handler if dealing with a > JavaThread or the VMThread. Consequently the Solaris version of the > SR_handler can not encounter this specific bug and so I have reverted > the changes to os_solaris.cpp > > Thanks, > David > > >> Regards, >> Volker >> >> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes > > wrote: >> >> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >> >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >> >> >> The suspend/resume signal (SR_signum) is never sent to a thread once >> it has started to terminate. On one platform (SuSE 12) we have seen >> what appears to be a "stuck" signal, which is only delivered when >> the terminating thread restores its original signal mask (as if >> pthread_sigmask makes the system realize there is a pending signal - >> we already check the signal was not blocked). At this point in the >> thread termination we have freed the osthread, so the the SR_handler >> would access deallocated memory. In debug builds we first hit an >> assertion that the current thread is a JavaThread or the VMThread - >> that assertion fails, even though it is a JavaThread, because we >> have already executed the ~JavaThread destructor and inside the >> ~Thread destructor we are a plain Thread not a JavaThread. >> >> The fix was to make a small adjustment to the thread termination >> process so that we delete the SR_lock before calling >> os::free_thread(). In the SR_handler() we can then use a NULL check >> of SR_lock() to indicate the thread has terminated and we return. >> >> While only seen on Linux I took the opportunity to apply the fix on >> all platforms and also cleaned up the code where we were using >> Thread::current() unsafely in a signal-handling context. 
>> >> Testing: regular tier 1 (JPRT) >> Kitchensink (in progress) >> >> As we can't readily reproduce the problem I tested this by having a >> terminating thread raise SR_signum directly from within the ~Thread >> destructor. >> >> Thanks, >> David >> >> From harold.seigel at oracle.com Mon Aug 8 17:25:34 2016 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 8 Aug 2016 13:25:34 -0400 Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers In-Reply-To: <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> Message-ID: <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> Hi, Please review the latest version of this change. It is similar to the below change except it changes uses of -mp to -p. http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/ Thanks! Harold On 8/4/2016 8:46 AM, harold seigel wrote: > > Hi, > > Please review this update for this fix. This webrev only shows the > changes since the last webrev. These changes include: > > 1. Fix forJDK-8162415 > - the JVM now > prints the following message when ignoring a property and > PrintWarnings is enabled: > warning: Ignoring system property options whose names start with > '-Djdk.module'. They are reserved for internal use. > > 2. Fix for JDK-8162412 > > 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests > > 4. Incorporates review comments from Alan, Coleen, Dan, and Lois > 5. Fixes JTReg tests that failed due to the new option syntax. > > Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ > > Thanks, Harold > > On 8/2/2016 9:25 AM, harold seigel wrote: >> >> Hi Lois, >> >> Thanks for the review. Please see comments in-line. 
>> >> Harold >> >> >> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>> >>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>> Hi, >>>> >>>> Please review these Hotspot VM only changes to process the seven >>>> module-specific options that have been renamed to have gnu-like >>>> names. JDK changes for this bug will be reviewed separately. >>>> >>>> Descriptions of these options are here >>>> . For these six options, >>>> --module-path, --upgrade-module-path, --add-modules, >>>> --limit-modules, --add-reads, and --add-exports, the JVM just sets >>>> a system property. For the --patch-module option, the JVM sets a >>>> system property and then processes the option in the same way as >>>> when it was named -Xpatch. >>>> >>>> Additionally, the JVM now checks properties specified on the >>>> command line. If a property matches one of the properties used by >>>> one of the above options then the JVM ignores the property. This >>>> forces users to use the explicit option when wanting to do things >>>> like add a module or a package export. >>>> >>>> The RFR contains two new tests. Also, many existing tests were >>>> changed to use the new option names. >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>> >>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>> >>> Hi Harold, >>> >>> Overall looks good. A couple of comments: >>> >>> src/share/vm/prims/jvmtiEnv.cpp >>> - line #3428 - The if statement is incorrect. There are internal >>> properties, like jdk.boot.class.path.append, whose value if non-null >>> should be returned. >> This code will be reworked in the next version of these changes >> because of multiple issues. >>> >>> src/share/vm/runtime/arguments.cpp >>> - Arguments::append_to_addmods_property was added before the VM >>> starting to process --add-modules. 
So with this fix, it seems like >>> it could be simply changed to: >>> >>> bool Arguments::append_to_addmods_property(const char* >>> module_name) { >>> PropertyList_unique_add(&_system_properties, >>> Arguments::get_property("jdk.module.addmods"), >>> module_name, >>> AppendProperty, UnwriteableProperty, InternalProperty); >>> } >>> >>> Please consider making this change since currently it contains a lot >>> of duplicated code that is now unnecessary. >> The one difference is that append_to_addmods_property() returns a >> status but PropertyList_unique_add() does not. I'll look into this a >> bit further. >>> >>> - line #3171, should the comment be "--add-modules=java.sql" instead >>> of "--add-modules java.sql"? >> yes. >> >> The changes suggested by you, Coleen, and Dan will be in the next >> version of this webrev. >> >> Thanks, Harold >>> >>> Thanks, >>> Lois >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>> >>>> The changes were tested with the JCK lang and VM tests, the JTreg >>>> hotspot tests, and the RBT hotspot nightlies. >>>> >>>> Thanks, Harold >>> >> > From daniel.daugherty at oracle.com Mon Aug 8 17:57:54 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 8 Aug 2016 11:57:54 -0600 Subject: RFR 8162999 Build give extraneous find warnings In-Reply-To: References: <57A34F6D.9070808@oracle.com> Message-ID: <2e91d509-462a-3ba3-97bd-01fabce52bf0@oracle.com> On 8/4/16 10:30 PM, David Holmes wrote: > On 5/08/2016 12:21 AM, Gerald Thornbrugh wrote: >> Hi Everyone, >> >> I would like to have the following change reviewed: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8162999 >> >> Webrev: http://cr.openjdk.java.net/~gthornbr/8162999/hotspot-webrev.01/ >> >> >> It seems that my putback of JDK-8144278 created a merge issue where a >> previous >> removed line from JDK-8132919 was placed back into JtregNative.gmk. This > > Thanks for fixing this but I blame hg all the way here! 
There is no > sign of any mismerge - the resulting file simply does not match the > changesets that were pushed. This sounds similar to another bug from the past: 8154121 Remove test mistakenly added during a merge Jesper and I did some analysis on how/where hg went wrong, but we never finished chasing it to ground. Sigh... it's bad when you can't trust your tools... Dan > > David > ----- > > >> fix >> removes the line again. The presence of this line generates "No such >> file or directory" >> error messages during builds because the "hotspot/test/compiler/native" >> directory no >> longer exists. >> >> Before this change the "No such file or directory" error messages were >> seen in the >> build logs and after the change the error messages were not placed into >> the log. >> >> Please let me know if you have any questions or concerns. >> >> Thanks, >> >> Jerry From chris.plummer at oracle.com Mon Aug 8 20:22:41 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 8 Aug 2016 13:22:41 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> Message-ID: Hi David, Did you want me to implement any of the additional cleanup work I mentioned: manually inline _get_previous_fp, change os::current_frame() to walk back one less frame, possibly rename os::current_frame()? thanks, Chris On 8/7/16 4:26 PM, David Holmes wrote: > Hi Chris, > > I don't have any good suggestions for this. So go with (2) and let's > work on (3). > > Thanks, > David > > On 5/08/2016 5:05 PM, Chris Plummer wrote: >> Hi David, >> >> If fixing os::current_frame() to have a better name and also make it go >> up one less frame makes these changes more palatable, I'm willing to >> make that change.
I would prefer to do it with a follow up CR (it would >> probably have to be an RFE), but will do it with these changes if >> necessary. I still pull hairs over the proper name for this method, even >> if it is modified to return the frame of whoever called it. Usually the >> meaning conveyed by a method's name does not change based on whether you >> choose the caller's or callee's point of view, but in this case it does, >> and I'm not sure which point of view makes more sense. If we choose the >> caller's point of view, then the proper name remains >> os::current_frame(). If we choose the callee's point of view, then it >> should be os::callers_frame(). Maybe there's a name that is agnostic and >> means the same thing from both view points. I just haven't thought of >> one yet. >> >> With respect to ALWAYSINLINE, it does not work for solaris and windows >> slowdebug builds. Note the special case in the test I wrote to allow for >> AllocateHeap() in the stack trace in this case, even though it shouldn't >> be there because it uses ALWAYSINLINE. I could have made changes in the >> source to get rid of it from the stack trace, but I didn't feel the >> source code disruption was worth it for a slowdebug build, especially >> since there are only a allocation call sites where it is a problem. I >> could use ALWAYSINLINE for the cases where it will work to inline >> _get_previous_fp, but I don't really see that as being any more reliable >> than what is there now. >> >> As for making _get_previous_fp() a macro, that's made more complicated >> because it has #ifdefs already. I could move its implementation directly >> into os::current_frame(). That would fix the inlining problem. I think >> it could also use some cleanup with the #ifdefs. For example, for >> linux-x86 do we have to worry about the SPARC_WORKS and __clang__ cases? 
>> >> And yes, even with my changes the code is no less fragile, and no less >> misdirected in its approach to getting a consistent allocation back >> trace. As I see it, there are 3 options: >> >> (1) Do nothing, and leave it both broken and fragile. >> (2) Do the cleanup I've done to at least correct the known stack trace >> issues. >> (3) Find another solution that doesn't suffer from these fragility >> issues. >> >> Note that (3) does not preclude doing (2) first, and (2) seems a better >> alternative than leaving it in its broken state (1). That's why I have >> pursued these changes even though I know things will still be fragile. >> >> thanks, >> >> Chris >> >> On 8/4/16 9:47 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>>> Ping! >>> >>> I took another look at this and my earlier comments from JDK-8133749. >>> I hate to see the functionality "fixed" yet still have a completely >>> confusing and mis-named API. I'm still far from convinced that >>> returning the callers caller wasn't an "error" that was done due to >>> the lack of inlining and the appearance of an unexpected stackframe. >>> You've now made things consistent - but os::current_frame() is >>> completely mis-leading in name. And I'm still concerned that >>> correctness here depends on C compiler inlining choices, with no way >>> to verify at build time that they were indeed inlined or not! Don't we >>> have ALWAYSINLINE to mark things like _get_previous_fp ? For that >>> matter shouldn't _get_previous_fp be a macro so inlining plays no >>> role ? >>> >>> Sorry but this code seems to simply limp from one broken state to >>> another due to its fragility. 
>>> >>> Thanks, >>> David >>> ----- >>> >>>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>>> Hello, >>>>> >>>>> Please review the following: >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>>> >>>>> >>>>> >>>>> >>>>> Bugs fixed: >>>>> >>>>> JDK-8133749: os::current_frame() is not returning the proper frame on >>>>> ARM and solaris-x64 >>>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>>> >>>>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >>>>> is making on tail calls being used >>>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>>> >>>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>>> >>>>> The above bugs all result in the NMT detail stack traces including >>>>> extra frames in the stack traces. Certain frames are suppose to be >>>>> skipped, but sometimes are not. The frames that show up are: >>>>> >>>>> NativeCallStack::NativeCallStack >>>>> os::get_native_stack >>>>> >>>>> These are both methods used to generate the stack trace, and >>>>> therefore >>>>> should not be included it. However, under some (most) circumstances, >>>>> they were. >>>>> >>>>> Also, there was no test to make sure that any NMT detail output is >>>>> generated, or that it is correct. I've added one with this webrev. Of >>>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >>>>> 27 initially passed this new test. They were the product and >>>>> fastdebug >>>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>>>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>>>> failed. They now all pass with my fixes in place. 
>>>>> >>>>> Here's a summary of the changes: >>>>> >>>>> src/os/posix/vm/os_posix.cpp >>>>> src/os/windows/vm/os_windows.cpp >>>>> >>>>> JDK-8133747 fixes: There was some frame skipping logic here which was >>>>> sort of correct, but was misplace. There are no extra frames being >>>>> added in os::get_native_stack() due to lack of inlining or lack of a >>>>> tail call, so no need for toSkip++ here. The logic has been moved to >>>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>>>> >>>>> src/share/vm/utilities/nativeCallStack.cpp >>>>> >>>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>>> assumed that NativeCallStack::NativeCallStack would not appear in the >>>>> call stack (due to a tail call be using to call os::get_native_stack) >>>>> except in slow debug builds. However, some platforms also don't use a >>>>> tail call even when optimized. From what I can tell that is the case >>>>> for 32-bit platforms and for windows. >>>>> >>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>>> >>>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>>>> skip one extra frame >>>>> >>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>>> >>>>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>>>> platforms and needs to skip one more frame. This means it returns the >>>>> frame for the caller's caller. So when called by >>>>> os:get_native_stack(), it returns the frame for whoever called >>>>> os::get_native_stack(). Although not intuitive, this is what >>>>> os:get_native_stack() expects. Probably a method rename and/or a >>>>> behavior change is justified here, but I would prefer to do that with >>>>> a followup CR if anyone has a good suggestion on what to do. 
>>>>> >>>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>>> >>>>> This is the new NTM detail test. It checks for frames that shouldn't >>>>> be present and validates at least one stack trace is what is >>>>> expected. >>>>> >>>>> I verified that the above test now passes on all supported platforms, >>>>> and also did a full jprt "-testset hotpot" run. I plan on doing some >>>>> RBT testing with NMT detail enabled before committing. >>>>> >>>>> Regarding the community contributed ports that Oracle does not >>>>> support, I didn't make any changes there, but it looks like some of >>>>> these bugs do exist. Notably: >>>>> >>>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>>>> done to the >>>>> os_linux_x86.cp should also be applied here. >>>>> -linux-ppc: Hard to say for sure since the implementation of >>>>> os::current_frame is >>>>> different than others, but it looks to me like it suffers from both >>>>> JDK-8133749 >>>>> and JDK-8133740. >>>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>>>> need the >>>>> same changes. >>>>> >>>>> These ports may also be suffering from JDK-8133747, but that fix >>>>> is in >>>>> shared code (nativeCallStack.cpp). My changes there will need some >>>>> tweaking for these ports they don't use a tail call to call >>>>> os::get_native_stack(). >>>>> >>>>> If the maintainers of these ports could send me some NMT detail >>>>> output, I can advise better on what changes are needed. Then you can >>>>> implement and test them, and then send them back to me and I'll >>>>> include them with my changes. What I need is the following command >>>>> run >>>>> on product and slowdebug builds. Initially run without any of my >>>>> changes applied. 
If needed I may followup with a request that they be >>>>> run with the changes applied: >>>>> >>>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>> >> From coleen.phillimore at oracle.com Mon Aug 8 21:04:41 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 8 Aug 2016 17:04:41 -0400 Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers In-Reply-To: <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> Message-ID: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/src/share/vm/prims/jvmtiEnv.cpp.udiff.html *!_char **tmp_value = *property_ptr+readable_count++_;* This looks really painful to me. Can you add some parentheses? Or put readable_count++ on the next statement. Is this an incremental webrev? Can you send a full one? thanks, Coleen On 8/8/16 1:25 PM, harold seigel wrote: > Hi, > > Please review the latest version of this change. It is similar to the > below change except it changes uses of -mp to -p. > > http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/ > > Thanks! Harold > > > On 8/4/2016 8:46 AM, harold seigel wrote: >> >> Hi, >> >> Please review this update for this fix. This webrev only shows the >> changes since the last webrev. These changes include: >> >> 1. Fix forJDK-8162415 >> - the JVM now >> prints the following message when ignoring a property and >> PrintWarnings is enabled: >> warning: Ignoring system property options whose names start with >> '-Djdk.module'. They are reserved for internal use. >> >> 2. Fix for JDK-8162412 >> >> 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests >> >> 4. Incorporates review comments from Alan, Coleen, Dan, and Lois >> 5. 
Fixes JTReg tests that failed due to the new option syntax. >> >> Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ >> >> Thanks, Harold >> >> On 8/2/2016 9:25 AM, harold seigel wrote: >>> >>> Hi Lois, >>> >>> Thanks for the review. Please see comments in-line. >>> >>> Harold >>> >>> >>> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>>> >>>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review these Hotspot VM only changes to process the seven >>>>> module-specific options that have been renamed to have gnu-like >>>>> names. JDK changes for this bug will be reviewed separately. >>>>> >>>>> Descriptions of these options are here >>>>> . For these six options, >>>>> --module-path, --upgrade-module-path, --add-modules, >>>>> --limit-modules, --add-reads, and --add-exports, the JVM just sets >>>>> a system property. For the --patch-module option, the JVM sets a >>>>> system property and then processes the option in the same way as >>>>> when it was named -Xpatch. >>>>> >>>>> Additionally, the JVM now checks properties specified on the >>>>> command line. If a property matches one of the properties used by >>>>> one of the above options then the JVM ignores the property. This >>>>> forces users to use the explicit option when wanting to do things >>>>> like add a module or a package export. >>>>> >>>>> The RFR contains two new tests. Also, many existing tests were >>>>> changed to use the new option names. >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>>> >>>> Hi Harold, >>>> >>>> Overall looks good. A couple of comments: >>>> >>>> src/share/vm/prims/jvmtiEnv.cpp >>>> - line #3428 - The if statement is incorrect. There are internal >>>> properties, like jdk.boot.class.path.append, whose value if >>>> non-null should be returned. >>> This code will be reworked in the next version of these changes >>> because of multiple issues. 
>>>> >>>> src/share/vm/runtime/arguments.cpp >>>> - Arguments::append_to_addmods_property was added before the VM >>>> starting to process --add-modules. So with this fix, it seems like >>>> it could be simply changed to: >>>> >>>> bool Arguments::append_to_addmods_property(const char* >>>> module_name) { >>>> PropertyList_unique_add(&_system_properties, >>>> Arguments::get_property("jdk.module.addmods"), >>>> module_name, >>>> AppendProperty, UnwriteableProperty, InternalProperty); >>>> } >>>> >>>> Please consider making this change since currently it contains a >>>> lot of duplicated code that is now unnecessary. >>> The one difference is that append_to_addmods_property() returns a >>> status but PropertyList_unique_add() does not. I'll look into this >>> a bit further. >>>> >>>> - line #3171, should the comment be "--add-modules=java.sql" >>>> instead of "--add-modules java.sql"? >>> yes. >>> >>> The changes suggested by you, Coleen, and Dan will be in the next >>> version of this webrev. >>> >>> Thanks, Harold >>>> >>>> Thanks, >>>> Lois >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>> >>>>> The changes were tested with the JCK lang and VM tests, the JTreg >>>>> hotspot tests, and the RBT hotspot nightlies. >>>>> >>>>> Thanks, Harold >>>> >>> >> > From david.holmes at oracle.com Mon Aug 8 23:57:35 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 Aug 2016 09:57:35 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> Message-ID: <047f996d-51b2-8993-6439-a32e6d5c7908@oracle.com> Hi Dan, Thanks for the review. On 9/08/2016 2:07 AM, Daniel D. Daugherty wrote: > On 8/4/16 8:28 PM, David Holmes wrote: >> Hi Volker, >> >> Thanks for looking at this. 
>> >> On 5/08/2016 1:48 AM, Volker Simonis wrote: >>> Hi David, >>> >>> thanks for doing this change on all platforms. >>> The fix looks good. Maybe you can just extend the following comment with >>> something like: >>> >>> // Note that the SR_lock plays no role in this suspend/resume >>> protocol. >>> // It is only used in SR_handler as a thread termination indicator if >>> NULL. >> >> Darn this code is confusing - too many "SR"'s :( I have added >> >> // Note that the SR_lock plays no role in this suspend/resume protocol, >> // but is checked for NULL in SR_handler as a thread termination >> indicator. >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ > > src/share/vm/runtime/thread.cpp > L380: _SR_lock = NULL; > I was expecting the _SR_lock to be freed and NULL'ed earlier > based on the discussion in the bug report. Since the crashing > assert() happens in a race between the JavaThread destructor > the NULL'ing of the _SR_lock field, I was expecting the _SR_lock > field to be dealt with as early as possible in the Thread > destructor (or even earlier; see my last comment). I will respond after that comment. > src/os/linux/vm/os_linux.cpp > L4010: // mask is changed as part of thread termination. Check the > current thread > grammar?: "Check the current" -> "Check that the current" Will change. > L4015: if (thread->SR_lock() == NULL) > L4016: return; > style nit: multi-line if-statements require '{' and '}' > Please add the braces or make this a single line if-statement. > I would prefer the braces. :-) Will fix. > Isn't there still a window between the completion of the > JavaThread destructor and where the Thread destructor sets > _SR_lock = NULL? See below. > L4020: OSThread* osthread = thread->osthread(); > Not your bug. This code assumes that osthread != NULL. > Maybe it needs to be more robust. Depends what kind of impossibilities we want to guard against. 
:) There should be no possible way a signal can be sent to a thread that doesn't even have a osThread as it means we never successfully started/attached the thread. > src/os/aix/vm/os_aix.cpp > L2731: if (thread->SR_lock() == NULL) > L2732: return; > Same style nit. > > Same race. > > L2736: OSThread* osthread = thread->osthread(); > Same robustness comment. > > src/os/bsd/vm/os_bsd.cpp > L2759: if (thread->SR_lock() == NULL) > L2760: return; > Same style nit. > > Same race. > > L2764: OSThread* osthread = thread->osthread(); > Same robustness comment. > > It has been a very long time since I've dealt with races in the > suspend/resume code so I'm probably very rusty with this code. > If the _SR_lock is only used by the JavaThread suspend/resume > protocol, then we could consider free'ing and NULL'ing the field > in the JavaThread destructor (as the last piece of work). > > That should eliminate the race that was being observed by the > SR_handler() in this bug. It will open a very small race where > is_Java_thread() can return true, the _SR_lock field is !NULL, > but the _SR_lock has been deleted. Given that it should have been impossible to get into the SR_handler in the first place from this code I was trying to minimize the disruption to the existing logic. Moving the delete/NULLing to just before the call to os::free_thread() fixes the crashes that had been observed. I was not trying to make the entire destruction sequence safe wrt. the SR_handler. My major concern with deleting the SR_lock much earlier is the potential race condition that I have previously outlined in: https://bugs.openjdk.java.net/browse/JDK-8152849 where there is no protection against a target thread terminating. The sooner it terminates and deletes the SR_lock the more likely we may attempt to lock a deleted lock! 
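To make the ordering argument concrete, here is a minimal sketch of the idea, not the actual HotSpot code. ToyThread, ToyMonitor, ToyOSThread and toy_sr_handler are hypothetical stand-ins for Thread, the SR_lock Monitor, OSThread and SR_handler. The sketch is single-threaded and models neither asynchronous signal delivery nor memory ordering. It only shows that, once the lock is deleted and NULLed before the OS-level data is freed, a late handler invocation can use the NULL check to bail out before touching freed memory.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's Monitor and OSThread.
struct ToyMonitor {};
struct ToyOSThread { int sr_state = 0; };

class ToyThread {
 public:
  ToyThread() : _sr_lock(new ToyMonitor()), _osthread(new ToyOSThread()) {}
  ~ToyThread() { terminate(); }

  // Termination order discussed in this thread: delete and NULL the SR lock
  // first, and only then release the OS-level thread data.
  void terminate() {
    delete _sr_lock;
    _sr_lock = nullptr;   // from here on the handler treats us as terminated
    delete _osthread;     // stand-in for os::free_thread()
    _osthread = nullptr;
  }

  ToyMonitor* sr_lock() const { return _sr_lock; }
  ToyOSThread* osthread() const { return _osthread; }

 private:
  ToyMonitor* _sr_lock;
  ToyOSThread* _osthread;
};

// Returns true if the "signal" was processed, false if it was ignored
// because the target thread has already started to terminate.
bool toy_sr_handler(ToyThread* thread) {
  if (thread->sr_lock() == nullptr) {
    return false;         // terminated: do not touch the osthread
  }
  ToyOSThread* osthread = thread->osthread();
  assert(osthread != nullptr);  // holds while the lock is still non-null
  osthread->sr_state++;   // stand-in for the real suspend/resume work
  return true;
}
```

Calling toy_sr_handler() on a live ToyThread processes the request. Calling it after terminate() returns false without dereferencing the freed osthread, which is the behaviour the NULL check is meant to give.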
Thanks, David > Dan > > >> >> This also reminded me to follow up on why the Solaris SR_handler is >> different and I found it is not actually installed as a direct signal >> handler, but is called from the real signal handler if dealing with a >> JavaThread or the VMThread. Consequently the Solaris version of the >> SR_handler can not encounter this specific bug and so I have reverted >> the changes to os_solaris.cpp >> >> Thanks, >> David >> >> >>> Regards, >>> Volker >>> >>> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes >> > wrote: >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >>> >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >>> >>> >>> The suspend/resume signal (SR_signum) is never sent to a thread once >>> it has started to terminate. On one platform (SuSE 12) we have seen >>> what appears to be a "stuck" signal, which is only delivered when >>> the terminating thread restores its original signal mask (as if >>> pthread_sigmask makes the system realize there is a pending signal - >>> we already check the signal was not blocked). At this point in the >>> thread termination we have freed the osthread, so the the SR_handler >>> would access deallocated memory. In debug builds we first hit an >>> assertion that the current thread is a JavaThread or the VMThread - >>> that assertion fails, even though it is a JavaThread, because we >>> have already executed the ~JavaThread destructor and inside the >>> ~Thread destructor we are a plain Thread not a JavaThread. >>> >>> The fix was to make a small adjustment to the thread termination >>> process so that we delete the SR_lock before calling >>> os::free_thread(). In the SR_handler() we can then use a NULL check >>> of SR_lock() to indicate the thread has terminated and we return. >>> >>> While only seen on Linux I took the opportunity to apply the fix on >>> all platforms and also cleaned up the code where we were using >>> Thread::current() unsafely in a signal-handling context. 
>>> >>> Testing: regular tier 1 (JPRT) >>> Kitchensink (in progress) >>> >>> As we can't readily reproduce the problem I tested this by having a >>> terminating thread raise SR_signum directly from within the ~Thread >>> destructor. >>> >>> Thanks, >>> David >>> >>> > From david.holmes at oracle.com Tue Aug 9 00:33:08 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 Aug 2016 10:33:08 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> Message-ID: <026edff0-4e36-793c-4711-00a605435914@oracle.com> Thanks Volker! David On 8/08/2016 11:35 PM, Volker Simonis wrote: > Hi David, > > looks good now. > > Thanks, > Volker > > > On Fri, Aug 5, 2016 at 4:28 AM, David Holmes wrote: >> Hi Volker, >> >> Thanks for looking at this. >> >> On 5/08/2016 1:48 AM, Volker Simonis wrote: >>> >>> Hi David, >>> >>> thanks for doing this change on all platforms. >>> The fix looks good. Maybe you can just extend the following comment with >>> something like: >>> >>> // Note that the SR_lock plays no role in this suspend/resume protocol. >>> // It is only used in SR_handler as a thread termination indicator if >>> NULL. >> >> >> Darn this code is confusing - too many "SR"'s :( I have added >> >> // Note that the SR_lock plays no role in this suspend/resume protocol, >> // but is checked for NULL in SR_handler as a thread termination indicator. >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ >> >> This also reminded me to follow up on why the Solaris SR_handler is >> different and I found it is not actually installed as a direct signal >> handler, but is called from the real signal handler if dealing with a >> JavaThread or the VMThread. 
Consequently the Solaris version of the >> SR_handler can not encounter this specific bug and so I have reverted the >> changes to os_solaris.cpp >> >> Thanks, >> David >> >> >>> Regards, >>> Volker >>> >>> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes >> > wrote: >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >>> >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >>> >>> >>> The suspend/resume signal (SR_signum) is never sent to a thread once >>> it has started to terminate. On one platform (SuSE 12) we have seen >>> what appears to be a "stuck" signal, which is only delivered when >>> the terminating thread restores its original signal mask (as if >>> pthread_sigmask makes the system realize there is a pending signal - >>> we already check the signal was not blocked). At this point in the >>> thread termination we have freed the osthread, so the the SR_handler >>> would access deallocated memory. In debug builds we first hit an >>> assertion that the current thread is a JavaThread or the VMThread - >>> that assertion fails, even though it is a JavaThread, because we >>> have already executed the ~JavaThread destructor and inside the >>> ~Thread destructor we are a plain Thread not a JavaThread. >>> >>> The fix was to make a small adjustment to the thread termination >>> process so that we delete the SR_lock before calling >>> os::free_thread(). In the SR_handler() we can then use a NULL check >>> of SR_lock() to indicate the thread has terminated and we return. >>> >>> While only seen on Linux I took the opportunity to apply the fix on >>> all platforms and also cleaned up the code where we were using >>> Thread::current() unsafely in a signal-handling context. >>> >>> Testing: regular tier 1 (JPRT) >>> Kitchensink (in progress) >>> >>> As we can't readily reproduce the problem I tested this by having a >>> terminating thread raise SR_signum directly from within the ~Thread >>> destructor. 
>>> >>> Thanks, >>> David >>> >>> >> From david.holmes at oracle.com Tue Aug 9 00:52:28 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 Aug 2016 10:52:28 +1000 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> Message-ID: <2b3b19a4-d6d9-99c9-87a8-3f42ce97adb6@oracle.com> On 9/08/2016 6:22 AM, Chris Plummer wrote: > Hi David, > > Did you want me to implement any of the additional cleanup work I > mentioned: manually inline _get_previous_fp, change os::current_frame() > to walk back one less frame, possibly rename os::current_frame()? Up to you. I'm not insisting on anything, but the less reliance we have on uncheckable (at build time) compiler behaviour, the better. Thanks, David > thanks, > > Chris > > On 8/7/16 4:26 PM, David Holmes wrote: >> Hi Chris, >> >> I don't have any good suggestions for this. So go with (2) and lets >> work on (3). >> >> Thanks, >> David >> >> On 5/08/2016 5:05 PM, Chris Plummer wrote: >>> Hi David, >>> >>> If fixing os::current_frame() to have a better name and also make it go >>> up one less frame makes these changes more palatable, I'm willing to >>> make that change. I would prefer to do it with a follow up CR (it would >>> probably have to be an RFE), but will do it with these changes if >>> necessary. I still pull hairs over the proper name for this method, even >>> if it is modified to return the frame of whoever called it. Usually the >>> meaning conveyed by a method's name does not change based on whether you >>> choose the caller's or callee's point of view, but in this case it does, >>> and I'm not sure which point of view makes more sense. If we choose the >>> caller's point of view, then the proper name remains >>> os::current_frame(). If we choose the callee's point of view, then it >>> should be os::callers_frame(). 
Maybe there's a name that is agnostic and >>> means the same thing from both view points. I just haven't thought of >>> one yet. >>> >>> With respect to ALWAYSINLINE, it does not work for solaris and windows >>> slowdebug builds. Note the special case in the test I wrote to allow for >>> AllocateHeap() in the stack trace in this case, even though it shouldn't >>> be there because it uses ALWAYSINLINE. I could have made changes in the >>> source to get rid of it from the stack trace, but I didn't feel the >>> source code disruption was worth it for a slowdebug build, especially >>> since there are only a allocation call sites where it is a problem. I >>> could use ALWAYSINLINE for the cases where it will work to inline >>> _get_previous_fp, but I don't really see that as being any more reliable >>> than what is there now. >>> >>> As for making _get_previous_fp() a macro, that's made more complicated >>> because it has #ifdefs already. I could move its implementation directly >>> into os::current_frame(). That would fix the inlining problem. I think >>> it could also use some cleanup with the #ifdefs. For example, for >>> linux-x86 do we have to worry about the SPARC_WORKS and __clang__ cases? >>> >>> And yes, even with my changes the code is no less fragile, and no less >>> misdirected in its approach to getting a consistent allocation back >>> trace. As I see it, there are 3 options: >>> >>> (1) Do nothing, and leave it both broken and fragile. >>> (2) Do the cleanup I've done to at least correct the known stack trace >>> issues. >>> (3) Find another solution that doesn't suffer from these fragility >>> issues. >>> >>> Note that (3) does not preclude doing (2) first, and (2) seems a better >>> alternative than leaving it in its broken state (1). That's why I have >>> pursued these changes even though I know things will still be fragile. 
 >>> >>> thanks, >>> >>> Chris >>> >>> On 8/4/16 9:47 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>>>> Ping! >>>> >>>> I took another look at this and my earlier comments from JDK-8133749. >>>> I hate to see the functionality "fixed" yet still have a completely >>>> confusing and misnamed API. I'm still far from convinced that >>>> returning the caller's caller wasn't an "error" that was done due to >>>> the lack of inlining and the appearance of an unexpected stack frame. >>>> You've now made things consistent - but os::current_frame() is >>>> completely misleading in name. And I'm still concerned that >>>> correctness here depends on C compiler inlining choices, with no way >>>> to verify at build time that they were indeed inlined or not! Don't we >>>> have ALWAYSINLINE to mark things like _get_previous_fp? For that >>>> matter shouldn't _get_previous_fp be a macro so inlining plays no >>>> role? >>>> >>>> Sorry but this code seems to simply limp from one broken state to >>>> another due to its fragility.
>>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>>>> Hello, >>>>>> >>>>>> Please review the following: >>>>>> >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Bugs fixed: >>>>>> >>>>>> JDK-8133749: os::current_frame() is not returning the proper frame on >>>>>> ARM and solaris-x64 >>>>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>>>> >>>>>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT >>>>>> is making on tail calls being used >>>>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>>>> >>>>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>>>> >>>>>> The above bugs all result in the NMT detail stack traces including >>>>>> extra frames in the stack traces. Certain frames are suppose to be >>>>>> skipped, but sometimes are not. The frames that show up are: >>>>>> >>>>>> NativeCallStack::NativeCallStack >>>>>> os::get_native_stack >>>>>> >>>>>> These are both methods used to generate the stack trace, and >>>>>> therefore >>>>>> should not be included it. However, under some (most) circumstances, >>>>>> they were. >>>>>> >>>>>> Also, there was no test to make sure that any NMT detail output is >>>>>> generated, or that it is correct. I've added one with this webrev. Of >>>>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the >>>>>> 27 initially passed this new test. They were the product and >>>>>> fastdebug >>>>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>>>>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>>>>> failed. They now all pass with my fixes in place. 
>>>>>> >>>>>> Here's a summary of the changes: >>>>>> >>>>>> src/os/posix/vm/os_posix.cpp >>>>>> src/os/windows/vm/os_windows.cpp >>>>>> >>>>>> JDK-8133747 fixes: There was some frame skipping logic here which was >>>>>> sort of correct, but was misplace. There are no extra frames being >>>>>> added in os::get_native_stack() due to lack of inlining or lack of a >>>>>> tail call, so no need for toSkip++ here. The logic has been moved to >>>>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>>>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>>>>> >>>>>> src/share/vm/utilities/nativeCallStack.cpp >>>>>> >>>>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>>>> assumed that NativeCallStack::NativeCallStack would not appear in the >>>>>> call stack (due to a tail call be using to call os::get_native_stack) >>>>>> except in slow debug builds. However, some platforms also don't use a >>>>>> tail call even when optimized. From what I can tell that is the case >>>>>> for 32-bit platforms and for windows. >>>>>> >>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>>>> >>>>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>>>>> skip one extra frame >>>>>> >>>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>>>> >>>>>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>>>>> platforms and needs to skip one more frame. This means it returns the >>>>>> frame for the caller's caller. So when called by >>>>>> os:get_native_stack(), it returns the frame for whoever called >>>>>> os::get_native_stack(). Although not intuitive, this is what >>>>>> os:get_native_stack() expects. Probably a method rename and/or a >>>>>> behavior change is justified here, but I would prefer to do that with >>>>>> a followup CR if anyone has a good suggestion on what to do. 
>>>>>> >>>>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>>>> >>>>>> This is the new NTM detail test. It checks for frames that shouldn't >>>>>> be present and validates at least one stack trace is what is >>>>>> expected. >>>>>> >>>>>> I verified that the above test now passes on all supported platforms, >>>>>> and also did a full jprt "-testset hotpot" run. I plan on doing some >>>>>> RBT testing with NMT detail enabled before committing. >>>>>> >>>>>> Regarding the community contributed ports that Oracle does not >>>>>> support, I didn't make any changes there, but it looks like some of >>>>>> these bugs do exist. Notably: >>>>>> >>>>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>>>>> done to the >>>>>> os_linux_x86.cp should also be applied here. >>>>>> -linux-ppc: Hard to say for sure since the implementation of >>>>>> os::current_frame is >>>>>> different than others, but it looks to me like it suffers from both >>>>>> JDK-8133749 >>>>>> and JDK-8133740. >>>>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>>>>> need the >>>>>> same changes. >>>>>> >>>>>> These ports may also be suffering from JDK-8133747, but that fix >>>>>> is in >>>>>> shared code (nativeCallStack.cpp). My changes there will need some >>>>>> tweaking for these ports they don't use a tail call to call >>>>>> os::get_native_stack(). >>>>>> >>>>>> If the maintainers of these ports could send me some NMT detail >>>>>> output, I can advise better on what changes are needed. Then you can >>>>>> implement and test them, and then send them back to me and I'll >>>>>> include them with my changes. What I need is the following command >>>>>> run >>>>>> on product and slowdebug builds. Initially run without any of my >>>>>> changes applied. 
If needed I may followup with a request that they be >>>>>> run with the changes applied: >>>>>> >>>>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>> >>> > From mandy.chung at oracle.com Tue Aug 9 03:39:48 2016 From: mandy.chung at oracle.com (Mandy Chung) Date: Mon, 8 Aug 2016 20:39:48 -0700 Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers In-Reply-To: References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> Message-ID: This is the full hotspot webrev containing all of Harold's incremental patches: http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/ I will be pushing the hotspot change for Harold together with the CLI work for jdk, langtools, and other repos once the code review is completed. FYI. Changes for jdk, langtools and other repos are posted [1]. Mandy [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html > On Aug 8, 2016, at 2:04 PM, Coleen Phillimore wrote: > > > http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/src/share/vm/prims/jvmtiEnv.cpp.udiff.html > > *!_char **tmp_value = *property_ptr+readable_count++_;* > > > This looks really painful to me. Can you add some parentheses? Or put readable_count++ on the next statement. > > Is this an incremental webrev? Can you send a full one? > thanks, > Coleen > > > On 8/8/16 1:25 PM, harold seigel wrote: >> Hi, >> >> Please review the latest version of this change. It is similar to the below change except it changes uses of -mp to -p. >> >> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/ >> >> Thanks! Harold >> >> >> On 8/4/2016 8:46 AM, harold seigel wrote: >>> >>> Hi, >>> >>> Please review this update for this fix. 
This webrev only shows the changes since the last webrev. These changes include: >>> >>> 1. Fix forJDK-8162415 >>> - the JVM now >>> prints the following message when ignoring a property and >>> PrintWarnings is enabled: >>> warning: Ignoring system property options whose names start with >>> '-Djdk.module'. They are reserved for internal use. >>> >>> 2. Fix for JDK-8162412 >>> >>> 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests >>> >>> 4. Incorporates review comments from Alan, Coleen, Dan, and Lois >>> 5. Fixes JTReg tests that failed due to the new option syntax. >>> >>> Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ >>> >>> Thanks, Harold >>> >>> On 8/2/2016 9:25 AM, harold seigel wrote: >>>> >>>> Hi Lois, >>>> >>>> Thanks for the review. Please see comments in-line. >>>> >>>> Harold >>>> >>>> >>>> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>>>> >>>>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review these Hotspot VM only changes to process the seven module-specific options that have been renamed to have gnu-like names. JDK changes for this bug will be reviewed separately. >>>>>> >>>>>> Descriptions of these options are here . For these six options, --module-path, --upgrade-module-path, --add-modules, --limit-modules, --add-reads, and --add-exports, the JVM just sets a system property. For the --patch-module option, the JVM sets a system property and then processes the option in the same way as when it was named -Xpatch. >>>>>> >>>>>> Additionally, the JVM now checks properties specified on the command line. If a property matches one of the properties used by one of the above options then the JVM ignores the property. This forces users to use the explicit option when wanting to do things like add a module or a package export. >>>>>> >>>>>> The RFR contains two new tests. Also, many existing tests were changed to use the new option names. 
>>>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>>>> >>>>> Hi Harold, >>>>> >>>>> Overall looks good. A couple of comments: >>>>> >>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>> - line #3428 - The if statement is incorrect. There are internal properties, like jdk.boot.class.path.append, whose value if non-null should be returned. >>>> This code will be reworked in the next version of these changes because of multiple issues. >>>>> >>>>> src/share/vm/runtime/arguments.cpp >>>>> - Arguments::append_to_addmods_property was added before the VM starting to process --add-modules. So with this fix, it seems like it could be simply changed to: >>>>> >>>>> bool Arguments::append_to_addmods_property(const char* >>>>> module_name) { >>>>> PropertyList_unique_add(&_system_properties, >>>>> Arguments::get_property("jdk.module.addmods"), >>>>> module_name, >>>>> AppendProperty, UnwriteableProperty, InternalProperty); >>>>> } >>>>> >>>>> Please consider making this change since currently it contains a lot of duplicated code that is now unnecessary. >>>> The one difference is that append_to_addmods_property() returns a status but PropertyList_unique_add() does not. I'll look into this a bit further. >>>>> >>>>> - line #3171, should the comment be "--add-modules=java.sql" instead of "--add-modules java.sql"? >>>> yes. >>>> >>>> The changes suggested by you, Coleen, and Dan will be in the next version of this webrev. >>>> >>>> Thanks, Harold >>>>> >>>>> Thanks, >>>>> Lois >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> The changes were tested with the JCK lang and VM tests, the JTreg hotspot tests, and the RBT hotspot nightlies. 
>>>>>>
>>>>>> Thanks, Harold
>>>>>
>>>>
>>>
>>
>

From harold.seigel at oracle.com Tue Aug 9 12:57:20 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Tue, 9 Aug 2016 08:57:20 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To:
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com>
Message-ID: <94b7bdf6-a90d-bb41-da8f-e5203b8af426@oracle.com>

Hi Coleen,

I can move the readable_count++ to another line.

Harold

On 8/8/2016 5:04 PM, Coleen Phillimore wrote:
>
> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/src/share/vm/prims/jvmtiEnv.cpp.udiff.html
>
>
> ! char **tmp_value = *property_ptr+readable_count++;
>
>
> This looks really painful to me. Can you add some parentheses? Or put
> readable_count++ on the next statement.
>
> Is this an incremental webrev? Can you send a full one?
> thanks,
> Coleen
>
>
> On 8/8/16 1:25 PM, harold seigel wrote:
>> Hi,
>>
>> Please review the latest version of this change. It is similar to
>> the below change except it changes uses of -mp to -p.
>>
>> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/
>>
>> Thanks! Harold
>>
>>
>> On 8/4/2016 8:46 AM, harold seigel wrote:
>>>
>>> Hi,
>>>
>>> Please review this update for this fix. This webrev only shows the
>>> changes since the last webrev. These changes include:
>>>
>>> 1. Fix for JDK-8162415
>>> - the JVM now
>>> prints the following message when ignoring a property and
>>> PrintWarnings is enabled:
>>> warning: Ignoring system property options whose names start with
>>> '-Djdk.module'. They are reserved for internal use.
>>>
>>> 2. Fix for JDK-8162412
>>>
>>> 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests
>>>
>>> 4. Incorporates review comments from Alan, Coleen, Dan, and Lois
>>> 5. Fixes JTReg tests that failed due to the new option syntax.
>>> >>> Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ >>> >>> Thanks, Harold >>> >>> On 8/2/2016 9:25 AM, harold seigel wrote: >>>> >>>> Hi Lois, >>>> >>>> Thanks for the review. Please see comments in-line. >>>> >>>> Harold >>>> >>>> >>>> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>>>> >>>>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review these Hotspot VM only changes to process the seven >>>>>> module-specific options that have been renamed to have gnu-like >>>>>> names. JDK changes for this bug will be reviewed separately. >>>>>> >>>>>> Descriptions of these options are here >>>>>> . For these six options, >>>>>> --module-path, --upgrade-module-path, --add-modules, >>>>>> --limit-modules, --add-reads, and --add-exports, the JVM just >>>>>> sets a system property. For the --patch-module option, the JVM >>>>>> sets a system property and then processes the option in the same >>>>>> way as when it was named -Xpatch. >>>>>> >>>>>> Additionally, the JVM now checks properties specified on the >>>>>> command line. If a property matches one of the properties used >>>>>> by one of the above options then the JVM ignores the property. >>>>>> This forces users to use the explicit option when wanting to do >>>>>> things like add a module or a package export. >>>>>> >>>>>> The RFR contains two new tests. Also, many existing tests were >>>>>> changed to use the new option names. >>>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>>>> >>>>> Hi Harold, >>>>> >>>>> Overall looks good. A couple of comments: >>>>> >>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>> - line #3428 - The if statement is incorrect. There are internal >>>>> properties, like jdk.boot.class.path.append, whose value if >>>>> non-null should be returned. >>>> This code will be reworked in the next version of these changes >>>> because of multiple issues. 
>>>>> >>>>> src/share/vm/runtime/arguments.cpp >>>>> - Arguments::append_to_addmods_property was added before the VM >>>>> starting to process --add-modules. So with this fix, it seems >>>>> like it could be simply changed to: >>>>> >>>>> bool Arguments::append_to_addmods_property(const char* >>>>> module_name) { >>>>> PropertyList_unique_add(&_system_properties, >>>>> Arguments::get_property("jdk.module.addmods"), >>>>> module_name, >>>>> AppendProperty, UnwriteableProperty, InternalProperty); >>>>> } >>>>> >>>>> Please consider making this change since currently it contains a >>>>> lot of duplicated code that is now unnecessary. >>>> The one difference is that append_to_addmods_property() returns a >>>> status but PropertyList_unique_add() does not. I'll look into this >>>> a bit further. >>>>> >>>>> - line #3171, should the comment be "--add-modules=java.sql" >>>>> instead of "--add-modules java.sql"? >>>> yes. >>>> >>>> The changes suggested by you, Coleen, and Dan will be in the next >>>> version of this webrev. >>>> >>>> Thanks, Harold >>>>> >>>>> Thanks, >>>>> Lois >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> The changes were tested with the JCK lang and VM tests, the JTreg >>>>>> hotspot tests, and the RBT hotspot nightlies. >>>>>> >>>>>> Thanks, Harold >>>>> >>>> >>> >> > From dean.long at oracle.com Tue Aug 9 17:39:02 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 9 Aug 2016 10:39:02 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: Message-ID: Ping. dl On 8/4/16 3:28 PM, dean.long at oracle.com wrote: > https://bugs.openjdk.java.net/browse/JDK-8161598 > > http://cr.openjdk.java.net/~dlong/8161598/webrev/ > > Sorry, this issue is Confidential. 
The problem is similar to 8029441,
> where we suspend a thread and use pd_get_top_frame_for_profiling() to
> get the top frame for stack walking. The problem is "last Java frame"
> anchor frames on x86. In lots of places we do not store last_Java_pc.
> This is OK in the synchronous stack walk case done by the current
> thread. But in the asynchronous case, there are small windows where
> it's not always safe to get PC from sp[-1].
>
> The solution is not to treat x86 anchor frames as "always walkable".
> Instead, we follow the example of sparc and make them walkable by
> filling in last_Java_pc when it's safe.
>
> I went for the minimal fix, resetting clear_pc to true in
> reset_last_Java_frame() but not changing the API and all the callers.
> I can fix this if reviewers feel strongly about it.
>
> dl
>

From coleen.phillimore at oracle.com Tue Aug 9 18:24:48 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 9 Aug 2016 14:24:48 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To:
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com>
Message-ID: <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>

Harold's changes look fine, although exactly what should be done for https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a follow-on issue.

Coleen

On 8/8/16 11:39 PM, Mandy Chung wrote:
> This is the full hotspot webrev containing all of Harold's incremental patches:
> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/
>
> I will be pushing the hotspot change for Harold together with the CLI work for jdk, langtools, and other repos once the code review is completed.
>
> FYI. Changes for jdk, langtools and other repos are posted [1].
> Mandy
>
> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html

From harold.seigel at oracle.com Tue Aug 9 18:45:17 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Tue, 9 Aug 2016 14:45:17 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To: <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
Message-ID:

Thanks Coleen!

Harold

On 8/9/2016 2:24 PM, Coleen Phillimore wrote:
>
> Harold's changes look fine, although exactly what should be done for
> https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a
> follow-on issue.
>
> Coleen
>
>
> On 8/8/16 11:39 PM, Mandy Chung wrote:
>> This is the full hotspot webrev containing all of Harold's
>> incremental patches:
>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/
>>
>> I will be pushing the hotspot change for Harold together with the CLI
>> work for jdk, langtools, and other repos once the code review is
>> completed.
>>
>> FYI. Changes for jdk, langtools and other repos are posted [1].
>> Mandy
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html

From mandy.chung at oracle.com Tue Aug 9 18:45:23 2016
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 9 Aug 2016 11:45:23 -0700
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To: <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
Message-ID:

As you commented in JDK-8162412, you suggest ignoring -Djdk.module.$NAME=? if jdk.module.$NAME.$N is set by the VM as an implementation detail.

What is your concern if -Djdk.module.addreads=value and -Djdk.module.addreads.0=value1 are both specified, and jdk.module.addreads shows up in the system properties but jdk.module.addreads.0 does not?

I don't see an issue if jdk.module.addreads shows up in the system properties when the user sets it through -D explicitly, since it's not used by the module system. That said, I don't have any objection to ignoring any property with that prefix if it would make the implementation cleaner (basically, the VM reserves that prefix for implementation details).

Mandy

> On Aug 9, 2016, at 11:24 AM, Coleen Phillimore wrote:
>
>
> Harold's changes look fine, although exactly what should be done for https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a follow-on issue.
>
> Coleen
>
>
> On 8/8/16 11:39 PM, Mandy Chung wrote:
>> This is the full hotspot webrev containing all of Harold's incremental patches:
>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/
>>
>> I will be pushing the hotspot change for Harold together with the CLI work for jdk, langtools, and other repos once the code review is completed.
>>
>> FYI. Changes for jdk, langtools and other repos are posted [1].
>> Mandy
>>
>> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html

From coleen.phillimore at oracle.com Tue Aug 9 18:50:40 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 9 Aug 2016 14:50:40 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To:
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
Message-ID: <72587a49-00de-e80a-c1ab-db5869e10a02@oracle.com>

On 8/9/16 2:45 PM, Mandy Chung wrote:
> As you commented in JDK-8162412, you suggest ignoring -Djdk.module.$NAME=? if jdk.module.$NAME.$N is set by the VM as an implementation detail.
>
> What is your concern if -Djdk.module.addreads=value and -Djdk.module.addreads.0=value1 are both specified, and jdk.module.addreads shows up in the system properties but jdk.module.addreads.0 does not?

Yes, I am concerned about that. Since the addreads.0 variant is an internal implementation detail, it's not something that a user would do accidentally, but a user could try to set -Djdk.module.addreads=something and be surprised by the implementation not doing what he thinks it should be doing. Since it's something that's obviously wrong and we should own the jdk.module namespace, I believe we should be allowed to ignore and warn on this obvious variant.

>
> I don't see an issue if jdk.module.addreads shows up in the system properties when the user sets it through -D explicitly, since it's not used by the module system. That said, I don't have any objection to ignoring any property with that prefix if it would make the implementation cleaner (basically, the VM reserves that prefix for implementation details).

Right, it's not used by the module system but it would make it cleaner for the implementation, and mostly because it would make it clearer for the customer.

thanks,
Coleen

>
> Mandy
>
>> On Aug 9, 2016, at 11:24 AM, Coleen Phillimore wrote:
>>
>>
>> Harold's changes look fine, although exactly what should be done for https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a follow-on issue.
>>
>> Coleen
>>
>>
>> On 8/8/16 11:39 PM, Mandy Chung wrote:
>>> This is the full hotspot webrev containing all of Harold's incremental patches:
>>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/
>>>
>>> I will be pushing the hotspot change for Harold together with the CLI work for jdk, langtools, and other repos once the code review is completed.
>>>
>>> FYI. Changes for jdk, langtools and other repos are posted [1].
>>> Mandy
>>>
>>> [1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html

From lois.foltan at oracle.com Tue Aug 9 20:12:03 2016
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 09 Aug 2016 16:12:03 -0400
Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers
In-Reply-To: <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com>
Message-ID: <57AA3913.2090106@oracle.com>

+1
Lois

On 8/9/2016 2:24 PM, Coleen Phillimore wrote:
>
> Harold's changes look fine, although exactly what should be done for
> https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a
> follow-on issue.
>
> Coleen
>
>
> On 8/8/16 11:39 PM, Mandy Chung wrote:
>> This is the full hotspot webrev containing all of Harold's
>> incremental patches:
>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/
>>
>> I will be pushing the hotspot change for Harold together with the CLI
>> work for jdk, langtools, and other repos once the code review is
>> completed.
>>
>> FYI. Changes for jdk, langtools and other repos are posted [1].
>> Mandy
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html
>>
>>
>>> On Aug 8, 2016, at 2:04 PM, Coleen Phillimore
>>> wrote:
>>>
>>>
>>> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/src/share/vm/prims/jvmtiEnv.cpp.udiff.html
>>>
>>>
>>> ! char **tmp_value = *property_ptr+readable_count++;
>>>
>>>
>>> This looks really painful to me. Can you add some parentheses? Or
>>> put readable_count++ on the next statement.
>>>
>>> Is this an incremental webrev? Can you send a full one?
>>> thanks,
>>> Coleen
>>>
>>>
>>> On 8/8/16 1:25 PM, harold seigel wrote:
>>>> Hi,
>>>>
>>>> Please review the latest version of this change.
It is similar to >>>> the below change except it changes uses of -mp to -p. >>>> >>>> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/ >>>> >>>> Thanks! Harold >>>> >>>> >>>> On 8/4/2016 8:46 AM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this update for this fix. This webrev only shows >>>>> the changes since the last webrev. These changes include: >>>>> >>>>> 1. Fix for JDK-8162415 >>>>> - the JVM now >>>>> prints the following message when ignoring a property and >>>>> PrintWarnings is enabled: >>>>> warning: Ignoring system property options whose names start with >>>>> '-Djdk.module'. They are reserved for internal use. >>>>> >>>>> 2. Fix for JDK-8162412 >>>>> >>>>> 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests >>>>> >>>>> 4. Incorporates review comments from Alan, Coleen, Dan, and Lois >>>>> 5. Fixes JTReg tests that failed due to the new option syntax. >>>>> >>>>> Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ >>>>> >>>>> Thanks, Harold >>>>> >>>>> On 8/2/2016 9:25 AM, harold seigel wrote: >>>>>> Hi Lois, >>>>>> >>>>>> Thanks for the review. Please see comments in-line. >>>>>> >>>>>> Harold >>>>>> >>>>>> >>>>>> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>>>>>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Please review these Hotspot VM only changes to process the >>>>>>>> seven module-specific options that have been renamed to have >>>>>>>> gnu-like names. JDK changes for this bug will be reviewed >>>>>>>> separately. >>>>>>>> >>>>>>>> Descriptions of these options are here >>>>>>>> . For these six options, >>>>>>>> --module-path, --upgrade-module-path, --add-modules, >>>>>>>> --limit-modules, --add-reads, and --add-exports, the JVM just >>>>>>>> sets a system property. For the --patch-module option, the JVM >>>>>>>> sets a system property and then processes the option in the >>>>>>>> same way as when it was named -Xpatch.
>>>>>>>> >>>>>>>> Additionally, the JVM now checks properties specified on the >>>>>>>> command line. If a property matches one of the properties used >>>>>>>> by one of the above options then the JVM ignores the property. >>>>>>>> This forces users to use the explicit option when wanting to do >>>>>>>> things like add a module or a package export. >>>>>>>> >>>>>>>> The RFR contains two new tests. Also, many existing tests were >>>>>>>> changed to use the new option names. >>>>>>>> >>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>>>>>> >>>>>>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>>>>>> Hi Harold, >>>>>>> >>>>>>> Overall looks good. A couple of comments: >>>>>>> >>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>> - line #3428 - The if statement is incorrect. There are >>>>>>> internal properties, like jdk.boot.class.path.append, whose >>>>>>> value if non-null should be returned. >>>>>> This code will be reworked in the next version of these changes >>>>>> because of multiple issues. >>>>>>> src/share/vm/runtime/arguments.cpp >>>>>>> - Arguments::append_to_addmods_property was added before the VM >>>>>>> starting to process --add-modules. So with this fix, it seems >>>>>>> like it could be simply changed to: >>>>>>> >>>>>>> bool Arguments::append_to_addmods_property(const char* >>>>>>> module_name) { >>>>>>> PropertyList_unique_add(&_system_properties, >>>>>>> Arguments::get_property("jdk.module.addmods"), >>>>>>> module_name, >>>>>>> AppendProperty, UnwriteableProperty, InternalProperty); >>>>>>> } >>>>>>> >>>>>>> Please consider making this change since currently it contains a >>>>>>> lot of duplicated code that is now unnecessary. >>>>>> The one difference is that append_to_addmods_property() returns a >>>>>> status but PropertyList_unique_add() does not. I'll look into >>>>>> this a bit further. >>>>>>> - line #3171, should the comment be "--add-modules=java.sql" >>>>>>> instead of "--add-modules java.sql"? >>>>>> yes. 
>>>>>> >>>>>> The changes suggested by you, Coleen, and Dan will be in the next >>>>>> version of this webrev. >>>>>> >>>>>> Thanks, Harold >>>>>>> Thanks, >>>>>>> Lois >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> The changes were tested with the JCK lang and VM tests, the >>>>>>>> JTreg hotspot tests, and the RBT hotspot nightlies. >>>>>>>> >>>>>>>> Thanks, Harold > From harold.seigel at oracle.com Tue Aug 9 20:13:14 2016 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 9 Aug 2016 16:13:14 -0400 Subject: RFR(L) 8136930: Simplify use of module-system options by custom launchers In-Reply-To: <57AA3913.2090106@oracle.com> References: <4de6f3b7-d5ed-8569-aab9-798f39a1f134@oracle.com> <579FEC0F.1010407@oracle.com> <9eb4e7de-cfba-772c-c67a-4da1d8074646@oracle.com> <19d442f0-5c9a-7067-ce24-62e0a4369b1e@oracle.com> <2f8dbb85-34a0-6d14-57fe-c0dfd4cd4a97@oracle.com> <57AA3913.2090106@oracle.com> Message-ID: Thanks Lois! Harold On 8/9/2016 4:12 PM, Lois Foltan wrote: > +1 > Lois > > On 8/9/2016 2:24 PM, Coleen Phillimore wrote: >> >> Harold's changes look fine, although exactly what should be done for >> https://bugs.openjdk.java.net/browse/JDK-8162412 will need to be a >> follow-on issue. >> >> Coleen >> >> >> On 8/8/16 11:39 PM, Mandy Chung wrote: >>> This is the full hotspot webrev containing all of Harold's >>> incremental patches: >>> http://cr.openjdk.java.net/~mchung/jdk9/webrevs/8136930/gnu-options/webrev-hotspot.03/ >>> >>> >>> I will be pushing the hotspot change for Harold together with the >>> CLI work for jdk, langtools, and other repos once the code review is >>> completed. >>> >>> FYI. Changes for jdk, langtools and other repos are posted [1]. 
>>> Mandy >>> >>> [1] >>> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2016-August/009025.html >>> >>> >>>> On Aug 8, 2016, at 2:04 PM, Coleen Phillimore >>>> wrote: >>>> >>>> >>>> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/src/share/vm/prims/jvmtiEnv.cpp.udiff.html >>>> >>>> >>>> *!_char **tmp_value = *property_ptr+readable_count++_;* >>>> >>>> >>>> This looks really painful to me. Can you add some parentheses? Or >>>> put readable_count++ on the next statement. >>>> >>>> Is this an incremental webrev? Can you send a full one? >>>> thanks, >>>> Coleen >>>> >>>> >>>> On 8/8/16 1:25 PM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review the latest version of this change. It is similar to >>>>> the below change except it changes uses of -mp to -p. >>>>> >>>>> http://cr.openjdk.java.net/~hseigel/bug_8136930.hs3/ >>>>> >>>>> Thanks! Harold >>>>> >>>>> >>>>> On 8/4/2016 8:46 AM, harold seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this update for this fix. This webrev only shows >>>>>> the changes since the last webrev. These changes include: >>>>>> >>>>>> 1. Fix for JDK-8162415 >>>>>> - the JVM now >>>>>> prints the following message when ignoring a property and >>>>>> PrintWarnings is enabled: >>>>>> warning: Ignoring system property options whose names start with >>>>>> '-Djdk.module'. They are reserved for internal use. >>>>>> >>>>>> 2. Fix for JDK-8162412 >>>>>> >>>>>> 3. Fixes a problem where JVMTI was failing two JCK vm/jvmti tests >>>>>> >>>>>> 4. Incorporates review comments from Alan, Coleen, Dan, and Lois >>>>>> 5. Fixes JTReg tests that failed due to the new option syntax. >>>>>> >>>>>> Revised webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs2/ >>>>>> >>>>>> Thanks, Harold >>>>>> >>>>>> On 8/2/2016 9:25 AM, harold seigel wrote: >>>>>>> Hi Lois, >>>>>>> >>>>>>> Thanks for the review. Please see comments in-line.
>>>>>>> >>>>>>> Harold >>>>>>> >>>>>>> >>>>>>> On 8/1/2016 8:40 PM, Lois Foltan wrote: >>>>>>>> On 7/17/2016 7:05 PM, harold seigel wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please review these Hotspot VM only changes to process the >>>>>>>>> seven module-specific options that have been renamed to have >>>>>>>>> gnu-like names. JDK changes for this bug will be reviewed >>>>>>>>> separately. >>>>>>>>> >>>>>>>>> Descriptions of these options are here >>>>>>>>> . For these six options, >>>>>>>>> --module-path, --upgrade-module-path, --add-modules, >>>>>>>>> --limit-modules, --add-reads, and --add-exports, the JVM just >>>>>>>>> sets a system property. For the --patch-module option, the JVM >>>>>>>>> sets a system property and then processes the option in the >>>>>>>>> same way as when it was named -Xpatch. >>>>>>>>> >>>>>>>>> Additionally, the JVM now checks properties specified on the >>>>>>>>> command line. If a property matches one of the properties >>>>>>>>> used by one of the above options then the JVM ignores the >>>>>>>>> property. This forces users to use the explicit option when >>>>>>>>> wanting to do things like add a module or a package export. >>>>>>>>> >>>>>>>>> The RFR contains two new tests. Also, many existing tests >>>>>>>>> were changed to use the new option names. >>>>>>>>> >>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8136930 >>>>>>>>> >>>>>>>>> Webrev: http://cr.openjdk.java.net/~hseigel/bug_8136930.hs/ >>>>>>>> Hi Harold, >>>>>>>> >>>>>>>> Overall looks good. A couple of comments: >>>>>>>> >>>>>>>> src/share/vm/prims/jvmtiEnv.cpp >>>>>>>> - line #3428 - The if statement is incorrect. There are >>>>>>>> internal properties, like jdk.boot.class.path.append, whose >>>>>>>> value if non-null should be returned. >>>>>>> This code will be reworked in the next version of these changes >>>>>>> because of multiple issues. 
>>>>>>>> src/share/vm/runtime/arguments.cpp >>>>>>>> - Arguments::append_to_addmods_property was added before the VM >>>>>>>> starting to process --add-modules. So with this fix, it seems >>>>>>>> like it could be simply changed to: >>>>>>>> >>>>>>>> bool Arguments::append_to_addmods_property(const char* >>>>>>>> module_name) { >>>>>>>> PropertyList_unique_add(&_system_properties, >>>>>>>> Arguments::get_property("jdk.module.addmods"), >>>>>>>> module_name, >>>>>>>> AppendProperty, UnwriteableProperty, InternalProperty); >>>>>>>> } >>>>>>>> >>>>>>>> Please consider making this change since currently it contains >>>>>>>> a lot of duplicated code that is now unnecessary. >>>>>>> The one difference is that append_to_addmods_property() returns >>>>>>> a status but PropertyList_unique_add() does not. I'll look into >>>>>>> this a bit further. >>>>>>>> - line #3171, should the comment be "--add-modules=java.sql" >>>>>>>> instead of "--add-modules java.sql"? >>>>>>> yes. >>>>>>> >>>>>>> The changes suggested by you, Coleen, and Dan will be in the >>>>>>> next version of this webrev. >>>>>>> >>>>>>> Thanks, Harold >>>>>>>> Thanks, >>>>>>>> Lois >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> The changes were tested with the JCK lang and VM tests, the >>>>>>>>> JTreg hotspot tests, and the RBT hotspot nightlies. 
>>>>>>>>> >>>>>>>>> Thanks, Harold >> > From chris.plummer at oracle.com Wed Aug 10 03:56:18 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 9 Aug 2016 20:56:18 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <2b3b19a4-d6d9-99c9-87a8-3f42ce97adb6@oracle.com> References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> <2b3b19a4-d6d9-99c9-87a8-3f42ce97adb6@oracle.com> Message-ID: <5fdda014-ae54-d00f-7950-c88440d68d6e@oracle.com> On 8/8/16 5:52 PM, David Holmes wrote: > On 9/08/2016 6:22 AM, Chris Plummer wrote: >> Hi David, >> >> Did you want me to implement any of the additional cleanup work I >> mentioned: manually inline _get_previous_fp, change os::current_frame() >> to walk back one less frame, possibly rename os::current_frame()? > > Up to you. I'm not insisting on anything, but the less reliance we > have on uncheckable (at build time) compiler behaviour, the better. Ok. Given the relatively short amount of time left to resolve p4s, I think I will just leave it as-is. I've started some more robust testing with NMT detail enabled. Once that completes I'll do the push. I'll also file a couple of RFEs for further cleanup/improvements. Can I consider the changes officially reviewed by you now? thanks, Chris > > Thanks, > David > >> thanks, >> >> Chris >> >> On 8/7/16 4:26 PM, David Holmes wrote: >>> Hi Chris, >>> >>> I don't have any good suggestions for this. So go with (2) and let's >>> work on (3). >>> >>> Thanks, >>> David >>> >>> On 5/08/2016 5:05 PM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> If fixing os::current_frame() to have a better name and also make >>>> it go >>>> up one less frame makes these changes more palatable, I'm willing to >>>> make that change. I would prefer to do it with a follow-up CR (it >>>> would >>>> probably have to be an RFE), but will do it with these changes if >>>> necessary.
I still pull hairs over the proper name for this method, >>>> even >>>> if it is modified to return the frame of whoever called it. Usually >>>> the >>>> meaning conveyed by a method's name does not change based on >>>> whether you >>>> choose the caller's or callee's point of view, but in this case it >>>> does, >>>> and I'm not sure which point of view makes more sense. If we choose >>>> the >>>> caller's point of view, then the proper name remains >>>> os::current_frame(). If we choose the callee's point of view, then it >>>> should be os::callers_frame(). Maybe there's a name that is >>>> agnostic and >>>> means the same thing from both viewpoints. I just haven't thought of >>>> one yet. >>>> >>>> With respect to ALWAYSINLINE, it does not work for solaris and windows >>>> slowdebug builds. Note the special case in the test I wrote to >>>> allow for >>>> AllocateHeap() in the stack trace in this case, even though it >>>> shouldn't >>>> be there because it uses ALWAYSINLINE. I could have made changes in >>>> the >>>> source to get rid of it from the stack trace, but I didn't feel the >>>> source code disruption was worth it for a slowdebug build, especially >>>> since there are only a few allocation call sites where it is a problem. I >>>> could use ALWAYSINLINE for the cases where it will work to inline >>>> _get_previous_fp, but I don't really see that as being any more >>>> reliable >>>> than what is there now. >>>> >>>> As for making _get_previous_fp() a macro, that's made more complicated >>>> because it has #ifdefs already. I could move its implementation >>>> directly >>>> into os::current_frame(). That would fix the inlining problem. I think >>>> it could also use some cleanup with the #ifdefs. For example, for >>>> linux-x86 do we have to worry about the SPARC_WORKS and __clang__ >>>> cases? >>>> >>>> And yes, even with my changes the code is no less fragile, and no less >>>> misdirected in its approach to getting a consistent allocation back >>>> trace.
As I see it, there are 3 options: >>>> >>>> (1) Do nothing, and leave it both broken and fragile. >>>> (2) Do the cleanup I've done to at least correct the known stack trace >>>> issues. >>>> (3) Find another solution that doesn't suffer from these fragility >>>> issues. >>>> >>>> Note that (3) does not preclude doing (2) first, and (2) seems a >>>> better >>>> alternative than leaving it in its broken state (1). That's why I have >>>> pursued these changes even though I know things will still be fragile. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/4/16 9:47 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>>>>> Ping! >>>>> >>>>> I took another look at this and my earlier comments from JDK-8133749. >>>>> I hate to see the functionality "fixed" yet still have a completely >>>>> confusing and misnamed API. I'm still far from convinced that >>>>> returning the caller's caller wasn't an "error" that was done due to >>>>> the lack of inlining and the appearance of an unexpected stack frame. >>>>> You've now made things consistent - but os::current_frame() is >>>>> completely misleading in name. And I'm still concerned that >>>>> correctness here depends on C compiler inlining choices, with no way >>>>> to verify at build time that they were indeed inlined or not! >>>>> Don't we >>>>> have ALWAYSINLINE to mark things like _get_previous_fp? For that >>>>> matter shouldn't _get_previous_fp be a macro so inlining plays no >>>>> role? >>>>> >>>>> Sorry but this code seems to simply limp from one broken state to >>>>> another due to its fragility.
>>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>>>>> Hello, >>>>>>> >>>>>>> Please review the following: >>>>>>> >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Bugs fixed: >>>>>>> >>>>>>> JDK-8133749: os::current_frame() is not returning the proper >>>>>>> frame on >>>>>>> ARM and solaris-x64 >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>>>>> >>>>>>> JDK-8133747: NMT includes an extra stack frame due to assumption >>>>>>> NMT >>>>>>> is making on tail calls being used >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>>>>> >>>>>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>>>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>>>>> >>>>>>> The above bugs all result in the NMT detail stack traces including >>>>>>> extra frames in the stack traces. Certain frames are supposed to be >>>>>>> skipped, but sometimes are not. The frames that show up are: >>>>>>> >>>>>>> NativeCallStack::NativeCallStack >>>>>>> os::get_native_stack >>>>>>> >>>>>>> These are both methods used to generate the stack trace, and >>>>>>> therefore >>>>>>> should not be included in it. However, under some (most) >>>>>>> circumstances, >>>>>>> they were. >>>>>>> >>>>>>> Also, there was no test to make sure that any NMT detail output is >>>>>>> generated, or that it is correct. I've added one with this >>>>>>> webrev. Of >>>>>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 >>>>>>> of the >>>>>>> 27 initially passed this new test. They were the product and >>>>>>> fastdebug >>>>>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>>>>>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>>>>>> failed. They now all pass with my fixes in place.
>>>>>>> >>>>>>> Here's a summary of the changes: >>>>>>> >>>>>>> src/os/posix/vm/os_posix.cpp >>>>>>> src/os/windows/vm/os_windows.cpp >>>>>>> >>>>>>> JDK-8133747 fixes: There was some frame skipping logic here >>>>>>> which was >>>>>>> sort of correct, but was misplaced. There are no extra frames being >>>>>>> added in os::get_native_stack() due to lack of inlining or lack >>>>>>> of a >>>>>>> tail call, so no need for toSkip++ here. The logic has been >>>>>>> moved to >>>>>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>>>>> (sometimes) made, and also corrected (see nativeCallStack.cpp >>>>>>> below). >>>>>>> >>>>>>> src/share/vm/utilities/nativeCallStack.cpp >>>>>>> >>>>>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>>>>> assumed that NativeCallStack::NativeCallStack would not appear >>>>>>> in the >>>>>>> call stack (due to a tail call being used to call >>>>>>> os::get_native_stack) >>>>>>> except in slowdebug builds. However, some platforms also don't >>>>>>> use a >>>>>>> tail call even when optimized. From what I can tell that is the >>>>>>> case >>>>>>> for 32-bit platforms and for windows. >>>>>>> >>>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>>>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>>>>> >>>>>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>>>>>> skip one extra frame. >>>>>>> >>>>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>>>>> >>>>>>> JDK-8133749 fixes: os::current_frame() was not consistent with other >>>>>>> platforms and needs to skip one more frame. This means it >>>>>>> returns the >>>>>>> frame for the caller's caller. So when called by >>>>>>> os::get_native_stack(), it returns the frame for whoever called >>>>>>> os::get_native_stack(). Although not intuitive, this is what >>>>>>> os::get_native_stack() expects.
Probably a method rename and/or a >>>>>>> behavior change is justified here, but I would prefer to do that >>>>>>> with >>>>>>> a follow-up CR if anyone has a good suggestion on what to do. >>>>>>> >>>>>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>>>>> >>>>>>> This is the new NMT detail test. It checks for frames that >>>>>>> shouldn't >>>>>>> be present and validates at least one stack trace is what is >>>>>>> expected. >>>>>>> >>>>>>> I verified that the above test now passes on all supported >>>>>>> platforms, >>>>>>> and also did a full jprt "-testset hotspot" run. I plan on doing >>>>>>> some >>>>>>> RBT testing with NMT detail enabled before committing. >>>>>>> >>>>>>> Regarding the community-contributed ports that Oracle does not >>>>>>> support, I didn't make any changes there, but it looks like some of >>>>>>> these bugs do exist. Notably: >>>>>>> >>>>>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>>>>>> done to >>>>>>> os_linux_x86.cpp should also be applied here. >>>>>>> -linux-ppc: Hard to say for sure since the implementation of >>>>>>> os::current_frame is >>>>>>> different than others, but it looks to me like it suffers from >>>>>>> both >>>>>>> JDK-8133749 >>>>>>> and JDK-8133740. >>>>>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so >>>>>>> would >>>>>>> need the >>>>>>> same changes. >>>>>>> >>>>>>> These ports may also be suffering from JDK-8133747, but that fix >>>>>>> is in >>>>>>> shared code (nativeCallStack.cpp). My changes there will need some >>>>>>> tweaking for these ports if they don't use a tail call to call >>>>>>> os::get_native_stack(). >>>>>>> >>>>>>> If the maintainers of these ports could send me some NMT detail >>>>>>> output, I can advise better on what changes are needed. Then you >>>>>>> can >>>>>>> implement and test them, and then send them back to me and I'll >>>>>>> include them with my changes.
What I need is the following command >>>>>>> run >>>>>>> on product and slowdebug builds. Initially run without any of my >>>>>>> changes applied. If needed I may follow up with a request that >>>>>>> they be >>>>>>> run with the changes applied: >>>>>>> >>>>>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>>>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>> >>>> >> From david.holmes at oracle.com Wed Aug 10 04:19:10 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 Aug 2016 14:19:10 +1000 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <5fdda014-ae54-d00f-7950-c88440d68d6e@oracle.com> References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> <2b3b19a4-d6d9-99c9-87a8-3f42ce97adb6@oracle.com> <5fdda014-ae54-d00f-7950-c88440d68d6e@oracle.com> Message-ID: On 10/08/2016 1:56 PM, Chris Plummer wrote: > On 8/8/16 5:52 PM, David Holmes wrote: >> On 9/08/2016 6:22 AM, Chris Plummer wrote: >>> Hi David, >>> >>> Did you want me to implement any of the additional cleanup work I >>> mentioned: manually inline _get_previous_fp, change os::current_frame() >>> to walk back one less frame, possibly rename os::current_frame()? >> >> Up to you. I'm not insisting on anything, but the less reliance we >> have on uncheckable (at build time) compiler behaviour, the better. > Ok. Given the relatively short amount of time left to resolve p4s, I > think I will just leave it as-is. I've started some more robust testing > with NMT detail enabled. Once that completes I'll do the push. I'll > also file a couple of RFEs for further cleanup/improvements. > > Can I consider the changes officially reviewed by you now? Yes.
Thanks, David > thanks, > > Chris >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>> >>> On 8/7/16 4:26 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> I don't have any good suggestions for this. So go with (2) and let's >>>> work on (3). >>>> >>>> Thanks, >>>> David >>>> >>>> On 5/08/2016 5:05 PM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> If fixing os::current_frame() to have a better name and also make >>>>> it go >>>>> up one less frame makes these changes more palatable, I'm willing to >>>>> make that change. I would prefer to do it with a follow-up CR (it >>>>> would >>>>> probably have to be an RFE), but will do it with these changes if >>>>> necessary. I still pull hairs over the proper name for this method, >>>>> even >>>>> if it is modified to return the frame of whoever called it. Usually >>>>> the >>>>> meaning conveyed by a method's name does not change based on >>>>> whether you >>>>> choose the caller's or callee's point of view, but in this case it >>>>> does, >>>>> and I'm not sure which point of view makes more sense. If we choose >>>>> the >>>>> caller's point of view, then the proper name remains >>>>> os::current_frame(). If we choose the callee's point of view, then it >>>>> should be os::callers_frame(). Maybe there's a name that is >>>>> agnostic and >>>>> means the same thing from both viewpoints. I just haven't thought of >>>>> one yet. >>>>> >>>>> With respect to ALWAYSINLINE, it does not work for solaris and windows >>>>> slowdebug builds. Note the special case in the test I wrote to >>>>> allow for >>>>> AllocateHeap() in the stack trace in this case, even though it >>>>> shouldn't >>>>> be there because it uses ALWAYSINLINE. I could have made changes in >>>>> the >>>>> source to get rid of it from the stack trace, but I didn't feel the >>>>> source code disruption was worth it for a slowdebug build, especially >>>>> since there are only a few allocation call sites where it is a problem.
I >>>>> could use ALWAYSINLINE for the cases where it will work to inline >>>>> _get_previous_fp, but I don't really see that as being any more >>>>> reliable >>>>> than what is there now. >>>>> >>>>> As for making _get_previous_fp() a macro, that's made more complicated >>>>> because it has #ifdefs already. I could move its implementation >>>>> directly >>>>> into os::current_frame(). That would fix the inlining problem. I think >>>>> it could also use some cleanup with the #ifdefs. For example, for >>>>> linux-x86 do we have to worry about the SPARC_WORKS and __clang__ >>>>> cases? >>>>> >>>>> And yes, even with my changes the code is no less fragile, and no less >>>>> misdirected in its approach to getting a consistent allocation back >>>>> trace. As I see it, there are 3 options: >>>>> >>>>> (1) Do nothing, and leave it both broken and fragile. >>>>> (2) Do the cleanup I've done to at least correct the known stack trace >>>>> issues. >>>>> (3) Find another solution that doesn't suffer from these fragility >>>>> issues. >>>>> >>>>> Note that (3) does not preclude doing (2) first, and (2) seems a >>>>> better >>>>> alternative than leaving it in its broken state (1). That's why I have >>>>> pursued these changes even though I know things will still be fragile. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 8/4/16 9:47 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>>>>>> Ping! >>>>>> >>>>>> I took another look at this and my earlier comments from JDK-8133749. >>>>>> I hate to see the functionality "fixed" yet still have a completely >>>>>> confusing and misnamed API. I'm still far from convinced that >>>>>> returning the caller's caller wasn't an "error" that was done due to >>>>>> the lack of inlining and the appearance of an unexpected stack frame. >>>>>> You've now made things consistent - but os::current_frame() is >>>>>> completely misleading in name.
And I'm still concerned that >>>>>> correctness here depends on C compiler inlining choices, with no way >>>>>> to verify at build time that they were indeed inlined or not! >>>>>> Don't we >>>>>> have ALWAYSINLINE to mark things like _get_previous_fp? For that >>>>>> matter shouldn't _get_previous_fp be a macro so inlining plays no >>>>>> role? >>>>>> >>>>>> Sorry but this code seems to simply limp from one broken state to >>>>>> another due to its fragility. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>>>>>> Hello, >>>>>>>> >>>>>>>> Please review the following: >>>>>>>> >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Bugs fixed: >>>>>>>> >>>>>>>> JDK-8133749: os::current_frame() is not returning the proper >>>>>>>> frame on >>>>>>>> ARM and solaris-x64 >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>>>>>> >>>>>>>> JDK-8133747: NMT includes an extra stack frame due to assumption >>>>>>>> NMT >>>>>>>> is making on tail calls being used >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>>>>>> >>>>>>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>>>>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>>>>>> >>>>>>>> The above bugs all result in the NMT detail stack traces including >>>>>>>> extra frames in the stack traces. Certain frames are supposed to be >>>>>>>> skipped, but sometimes are not. The frames that show up are: >>>>>>>> >>>>>>>> NativeCallStack::NativeCallStack >>>>>>>> os::get_native_stack >>>>>>>> >>>>>>>> These are both methods used to generate the stack trace, and >>>>>>>> therefore >>>>>>>> should not be included in it. However, under some (most) >>>>>>>> circumstances, >>>>>>>> they were.
>>>>>>>> >>>>>>>> Also, there was no test to make sure that any NMT detail output is >>>>>>>> generated, or that it is correct. I've added one with this >>>>>>>> webrev. Of >>>>>>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 >>>>>>>> of the >>>>>>>> 27 initially passed this new test. They were the product and >>>>>>>> fastdebug >>>>>>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>>>>>>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>>>>>>> failed. They now all pass with my fixes in place. >>>>>>>> >>>>>>>> Here's a summary of the changes: >>>>>>>> >>>>>>>> src/os/posix/vm/os_posix.cpp >>>>>>>> src/os/windows/vm/os_windows.cpp >>>>>>>> >>>>>>>> JDK-8133747 fixes: There was some frame skipping logic here >>>>>>>> which was >>>>>>>> sort of correct, but was misplaced. There are no extra frames being >>>>>>>> added in os::get_native_stack() due to lack of inlining or lack >>>>>>>> of a >>>>>>>> tail call, so no need for toSkip++ here. The logic has been >>>>>>>> moved to >>>>>>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>>>>>> (sometimes) made, and also corrected (see nativeCallStack.cpp >>>>>>>> below). >>>>>>>> >>>>>>>> src/share/vm/utilities/nativeCallStack.cpp >>>>>>>> >>>>>>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>>>>>> assumed that NativeCallStack::NativeCallStack would not appear >>>>>>>> in the >>>>>>>> call stack (due to a tail call being used to call >>>>>>>> os::get_native_stack) >>>>>>>> except in slowdebug builds. However, some platforms also don't >>>>>>>> use a >>>>>>>> tail call even when optimized. From what I can tell that is the >>>>>>>> case >>>>>>>> for 32-bit platforms and for windows.
>>>>>>>> >>>>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>>>>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>>>>>> >>>>>>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>>>>>>> skip one extra frame >>>>>>>> >>>>>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>>>>>> >>>>>>>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>>>>>>> platforms and needs to skip one more frame. This means it >>>>>>>> returns the >>>>>>>> frame for the caller's caller. So when called by >>>>>>>> os:get_native_stack(), it returns the frame for whoever called >>>>>>>> os::get_native_stack(). Although not intuitive, this is what >>>>>>>> os:get_native_stack() expects. Probably a method rename and/or a >>>>>>>> behavior change is justified here, but I would prefer to do that >>>>>>>> with >>>>>>>> a followup CR if anyone has a good suggestion on what to do. >>>>>>>> >>>>>>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>>>>>> >>>>>>>> This is the new NTM detail test. It checks for frames that >>>>>>>> shouldn't >>>>>>>> be present and validates at least one stack trace is what is >>>>>>>> expected. >>>>>>>> >>>>>>>> I verified that the above test now passes on all supported >>>>>>>> platforms, >>>>>>>> and also did a full jprt "-testset hotpot" run. I plan on doing >>>>>>>> some >>>>>>>> RBT testing with NMT detail enabled before committing. >>>>>>>> >>>>>>>> Regarding the community contributed ports that Oracle does not >>>>>>>> support, I didn't make any changes there, but it looks like some of >>>>>>>> these bugs do exist. Notably: >>>>>>>> >>>>>>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>>>>>>> done to the >>>>>>>> os_linux_x86.cp should also be applied here. 
>>>>>>>> -linux-ppc: Hard to say for sure since the implementation of >>>>>>>> os::current_frame is >>>>>>>> different than others, but it looks to me like it suffers from >>>>>>>> both >>>>>>>> JDK-8133749 >>>>>>>> and JDK-8133740. >>>>>>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so >>>>>>>> would >>>>>>>> need the >>>>>>>> same changes. >>>>>>>> >>>>>>>> These ports may also be suffering from JDK-8133747, but that fix >>>>>>>> is in >>>>>>>> shared code (nativeCallStack.cpp). My changes there will need some >>>>>>>> tweaking for these ports they don't use a tail call to call >>>>>>>> os::get_native_stack(). >>>>>>>> >>>>>>>> If the maintainers of these ports could send me some NMT detail >>>>>>>> output, I can advise better on what changes are needed. Then you >>>>>>>> can >>>>>>>> implement and test them, and then send them back to me and I'll >>>>>>>> include them with my changes. What I need is the following command >>>>>>>> run >>>>>>>> on product and slowdebug builds. Initially run without any of my >>>>>>>> changes applied. 
If needed I may followup with a request that >>>>>>>> they be >>>>>>>> run with the changes applied: >>>>>>>> >>>>>>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>>>>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>> >>>>> >>> > From chris.plummer at oracle.com Wed Aug 10 05:39:30 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 9 Aug 2016 22:39:30 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: <10b1ed76-cedb-d909-168f-863baf5869e2@oracle.com> <887ca8f2-4473-40ca-d1d1-336095c9f894@oracle.com> <2b3b19a4-d6d9-99c9-87a8-3f42ce97adb6@oracle.com> <5fdda014-ae54-d00f-7950-c88440d68d6e@oracle.com> Message-ID: On 8/9/16 9:19 PM, David Holmes wrote: > On 10/08/2016 1:56 PM, Chris Plummer wrote: >> On 8/8/16 5:52 PM, David Holmes wrote: >>> On 9/08/2016 6:22 AM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> Did you want me to implement any of the additional cleanup work I >>>> mentioned: manually inline _get_previous_fp, change >>>> os::current_frame() >>>> to walk back one less frame, possibly rename os::current_frame()? >>> >>> Up to you. I'm not insisting on anything, but the less reliance we >>> have on uncheckable (at build time) compiler behaviour, the better. >> Ok. Given the relatively short amount of time left to resolve p4s, I >> think I will just leave it as-is. I've started some more robust testing >> the NMT detailed enabled. Once that completes I'll do the push. I'll >> also file a couple of RFEs for further cleanup/improvements. >> >> Can I consider the changes officially reviewed by you now? > > Yes. > > Thanks, > David Thanks! Chris > >> thanks, >> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/7/16 4:26 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> I don't have any good suggestions for this. So go with (2) and lets >>>>> work on (3). 
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 5/08/2016 5:05 PM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> If fixing os::current_frame() to have a better name and also make >>>>>> it go >>>>>> up one less frame makes these changes more palatable, I'm willing to >>>>>> make that change. I would prefer to do it with a follow up CR (it >>>>>> would >>>>>> probably have to be an RFE), but will do it with these changes if >>>>>> necessary. I still pull hairs over the proper name for this method, >>>>>> even >>>>>> if it is modified to return the frame of whoever called it. Usually >>>>>> the >>>>>> meaning conveyed by a method's name does not change based on >>>>>> whether you >>>>>> choose the caller's or callee's point of view, but in this case it >>>>>> does, >>>>>> and I'm not sure which point of view makes more sense. If we choose >>>>>> the >>>>>> caller's point of view, then the proper name remains >>>>>> os::current_frame(). If we choose the callee's point of view, >>>>>> then it >>>>>> should be os::callers_frame(). Maybe there's a name that is >>>>>> agnostic and >>>>>> means the same thing from both view points. I just haven't >>>>>> thought of >>>>>> one yet. >>>>>> >>>>>> With respect to ALWAYSINLINE, it does not work for solaris and >>>>>> windows >>>>>> slowdebug builds. Note the special case in the test I wrote to >>>>>> allow for >>>>>> AllocateHeap() in the stack trace in this case, even though it >>>>>> shouldn't >>>>>> be there because it uses ALWAYSINLINE. I could have made changes in >>>>>> the >>>>>> source to get rid of it from the stack trace, but I didn't feel the >>>>>> source code disruption was worth it for a slowdebug build, >>>>>> especially >>>>>> since there are only a allocation call sites where it is a >>>>>> problem. I >>>>>> could use ALWAYSINLINE for the cases where it will work to inline >>>>>> _get_previous_fp, but I don't really see that as being any more >>>>>> reliable >>>>>> than what is there now. 
>>>>>> >>>>>> As for making _get_previous_fp() a macro, that's made more >>>>>> complicated >>>>>> because it has #ifdefs already. I could move its implementation >>>>>> directly >>>>>> into os::current_frame(). That would fix the inlining problem. I >>>>>> think >>>>>> it could also use some cleanup with the #ifdefs. For example, for >>>>>> linux-x86 do we have to worry about the SPARC_WORKS and __clang__ >>>>>> cases? >>>>>> >>>>>> And yes, even with my changes the code is no less fragile, and no >>>>>> less >>>>>> misdirected in its approach to getting a consistent allocation back >>>>>> trace. As I see it, there are 3 options: >>>>>> >>>>>> (1) Do nothing, and leave it both broken and fragile. >>>>>> (2) Do the cleanup I've done to at least correct the known stack >>>>>> trace >>>>>> issues. >>>>>> (3) Find another solution that doesn't suffer from these fragility >>>>>> issues. >>>>>> >>>>>> Note that (3) does not preclude doing (2) first, and (2) seems a >>>>>> better >>>>>> alternative than leaving it in its broken state (1). That's why I >>>>>> have >>>>>> pursued these changes even though I know things will still be >>>>>> fragile. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 8/4/16 9:47 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 5/08/2016 7:53 AM, Chris Plummer wrote: >>>>>>>> Ping! >>>>>>> >>>>>>> I took another look at this and my earlier comments from >>>>>>> JDK-8133749. >>>>>>> I hate to see the functionality "fixed" yet still have a completely >>>>>>> confusing and mis-named API. I'm still far from convinced that >>>>>>> returning the callers caller wasn't an "error" that was done due to >>>>>>> the lack of inlining and the appearance of an unexpected >>>>>>> stackframe. >>>>>>> You've now made things consistent - but os::current_frame() is >>>>>>> completely mis-leading in name. 
And I'm still concerned that >>>>>>> correctness here depends on C compiler inlining choices, with no >>>>>>> way >>>>>>> to verify at build time that they were indeed inlined or not! >>>>>>> Don't we >>>>>>> have ALWAYSINLINE to mark things like _get_previous_fp ? For that >>>>>>> matter shouldn't _get_previous_fp be a macro so inlining plays no >>>>>>> role ? >>>>>>> >>>>>>> Sorry but this code seems to simply limp from one broken state to >>>>>>> another due to its fragility. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> On 8/2/16 1:31 PM, Chris Plummer wrote: >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Please review the following: >>>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Bugs fixed: >>>>>>>>> >>>>>>>>> JDK-8133749: os::current_frame() is not returning the proper >>>>>>>>> frame on >>>>>>>>> ARM and solaris-x64 >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>>>>>>>> >>>>>>>>> JDK-8133747: NMT includes an extra stack frame due to assumption >>>>>>>>> NMT >>>>>>>>> is making on tail calls being used >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>>>>>>>> >>>>>>>>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>>>>>>>> includes NativeCallStack::NativeCallStack() frame in backtrace >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>>>>>>>> >>>>>>>>> The above bugs all result in the NMT detail stack traces >>>>>>>>> including >>>>>>>>> extra frames in the stack traces. Certain frames are suppose >>>>>>>>> to be >>>>>>>>> skipped, but sometimes are not. The frames that show up are: >>>>>>>>> >>>>>>>>> NativeCallStack::NativeCallStack >>>>>>>>> os::get_native_stack >>>>>>>>> >>>>>>>>> These are both methods used to generate the stack trace, and >>>>>>>>> therefore >>>>>>>>> should not be included it. 
However, under some (most) >>>>>>>>> circumstances, >>>>>>>>> they were. >>>>>>>>> >>>>>>>>> Also, there was no test to make sure that any NMT detail >>>>>>>>> output is >>>>>>>>> generated, or that it is correct. I've added one with this >>>>>>>>> webrev. Of >>>>>>>>> the 27 possible builds (9 platforms * 3 build flavors), only 9 >>>>>>>>> of the >>>>>>>>> 27 initially passed this new test. They were the product and >>>>>>>>> fastdebug >>>>>>>>> builds for solaris-sparc, bsd-x64, and linux-x64; and the >>>>>>>>> slowdebug >>>>>>>>> builds for solaris-x64, windows-x86, and windows-x64. All the >>>>>>>>> rest >>>>>>>>> failed. They now all pass with my fixes in place. >>>>>>>>> >>>>>>>>> Here's a summary of the changes: >>>>>>>>> >>>>>>>>> src/os/posix/vm/os_posix.cpp >>>>>>>>> src/os/windows/vm/os_windows.cpp >>>>>>>>> >>>>>>>>> JDK-8133747 fixes: There was some frame skipping logic here >>>>>>>>> which was >>>>>>>>> sort of correct, but was misplace. There are no extra frames >>>>>>>>> being >>>>>>>>> added in os::get_native_stack() due to lack of inlining or lack >>>>>>>>> of a >>>>>>>>> tail call, so no need for toSkip++ here. The logic has been >>>>>>>>> moved to >>>>>>>>> NativeCallStack::NativeCallStack, which is where the tail call is >>>>>>>>> (sometimes) made, and also corrected (see nativeCallStack.cpp >>>>>>>>> below). >>>>>>>>> >>>>>>>>> src/share/vm/utilities/nativeCallStack.cpp >>>>>>>>> >>>>>>>>> JDK-8133747 fixes: The frame skipping logic that was moved here >>>>>>>>> assumed that NativeCallStack::NativeCallStack would not appear >>>>>>>>> in the >>>>>>>>> call stack (due to a tail call be using to call >>>>>>>>> os::get_native_stack) >>>>>>>>> except in slow debug builds. However, some platforms also don't >>>>>>>>> use a >>>>>>>>> tail call even when optimized. From what I can tell that is the >>>>>>>>> case >>>>>>>>> for 32-bit platforms and for windows. 
>>>>>>>>> >>>>>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>>>>>>>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>>>>>>>> >>>>>>>>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we >>>>>>>>> need to >>>>>>>>> skip one extra frame >>>>>>>>> >>>>>>>>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>>>>>>>> >>>>>>>>> JDK-8133749 fixes: os:current_frame() was not consistent with >>>>>>>>> other >>>>>>>>> platforms and needs to skip one more frame. This means it >>>>>>>>> returns the >>>>>>>>> frame for the caller's caller. So when called by >>>>>>>>> os:get_native_stack(), it returns the frame for whoever called >>>>>>>>> os::get_native_stack(). Although not intuitive, this is what >>>>>>>>> os:get_native_stack() expects. Probably a method rename and/or a >>>>>>>>> behavior change is justified here, but I would prefer to do that >>>>>>>>> with >>>>>>>>> a followup CR if anyone has a good suggestion on what to do. >>>>>>>>> >>>>>>>>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>>>>>>>> >>>>>>>>> This is the new NTM detail test. It checks for frames that >>>>>>>>> shouldn't >>>>>>>>> be present and validates at least one stack trace is what is >>>>>>>>> expected. >>>>>>>>> >>>>>>>>> I verified that the above test now passes on all supported >>>>>>>>> platforms, >>>>>>>>> and also did a full jprt "-testset hotpot" run. I plan on doing >>>>>>>>> some >>>>>>>>> RBT testing with NMT detail enabled before committing. >>>>>>>>> >>>>>>>>> Regarding the community contributed ports that Oracle does not >>>>>>>>> support, I didn't make any changes there, but it looks like >>>>>>>>> some of >>>>>>>>> these bugs do exist. Notably: >>>>>>>>> >>>>>>>>> -linux-aarch64: Looks like it suffers from JDK-8133740. The >>>>>>>>> changes >>>>>>>>> done to the >>>>>>>>> os_linux_x86.cp should also be applied here. 
>>>>>>>>> -linux-ppc: Hard to say for sure since the implementation of >>>>>>>>> os::current_frame is >>>>>>>>> different than others, but it looks to me like it suffers from >>>>>>>>> both >>>>>>>>> JDK-8133749 >>>>>>>>> and JDK-8133740. >>>>>>>>> -aix-ppc: Looks to be the same implementation as linux-ppc, so >>>>>>>>> would >>>>>>>>> need the >>>>>>>>> same changes. >>>>>>>>> >>>>>>>>> These ports may also be suffering from JDK-8133747, but that fix >>>>>>>>> is in >>>>>>>>> shared code (nativeCallStack.cpp). My changes there will need >>>>>>>>> some >>>>>>>>> tweaking for these ports they don't use a tail call to call >>>>>>>>> os::get_native_stack(). >>>>>>>>> >>>>>>>>> If the maintainers of these ports could send me some NMT detail >>>>>>>>> output, I can advise better on what changes are needed. Then you >>>>>>>>> can >>>>>>>>> implement and test them, and then send them back to me and I'll >>>>>>>>> include them with my changes. What I need is the following >>>>>>>>> command >>>>>>>>> run >>>>>>>>> on product and slowdebug builds. Initially run without any of my >>>>>>>>> changes applied. If needed I may followup with a request that >>>>>>>>> they be >>>>>>>>> run with the changes applied: >>>>>>>>> >>>>>>>>> bin/java -XX:+UnlockDiagnosticVMOptions >>>>>>>>> -XX:NativeMemoryTracking=detail -XX:+PrintNMTStatistics -version >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>> >>>>>> >>>> >> From robbin.ehn at oracle.com Wed Aug 10 07:30:17 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 10 Aug 2016 09:30:17 +0200 Subject: RFR(XXS): 8161007: GPL header missing comma in year Message-ID: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> Hi all, please review! 
Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 +0000 +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 +0200 @@ -1,3 +1,3 @@ /* - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved. * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. Thanks! /Robbin From robbin.ehn at oracle.com Wed Aug 10 07:33:34 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 10 Aug 2016 09:33:34 +0200 Subject: RFR(XXS): 8161026: GPL header missing comma in year In-Reply-To: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> References: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> Message-ID: <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> Hi, sorry wrong case in subject! /Robbin On 08/10/2016 09:30 AM, Robbin Ehn wrote: > Hi all, please review! > > Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 > > diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp > --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 +0000 > +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 +0200 > @@ -1,3 +1,3 @@ > /* > - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights reserved. > + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved. > * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > > Thanks! > > /Robbin From claes.redestad at oracle.com Wed Aug 10 07:37:45 2016 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 10 Aug 2016 09:37:45 +0200 Subject: RFR(XXS): 8161026: GPL header missing comma in year In-Reply-To: <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> References: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> Message-ID: <57AAD9C9.4040600@oracle.com> Ship it! 
/Claes On 2016-08-10 09:33, Robbin Ehn wrote: > Hi, sorry wrong case in subject! > > /Robbin > > On 08/10/2016 09:30 AM, Robbin Ehn wrote: >> Hi all, please review! >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 >> >> diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp >> --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 >> +0000 >> +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 >> +0200 >> @@ -1,3 +1,3 @@ >> /* >> - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights >> reserved. >> + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights >> reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> >> Thanks! >> >> /Robbin From stefan.johansson at oracle.com Wed Aug 10 07:38:34 2016 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 10 Aug 2016 09:38:34 +0200 Subject: RFR(XXS): 8161026: GPL header missing comma in year In-Reply-To: <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> References: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> Message-ID: <72047b73-752b-5de2-ffb3-b5aa462ccc3f@oracle.com> Looks good. StefanJ On 2016-08-10 09:33, Robbin Ehn wrote: > Hi, sorry wrong case in subject! > > /Robbin > > On 08/10/2016 09:30 AM, Robbin Ehn wrote: >> Hi all, please review! >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 >> >> diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp >> --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 >> +0000 >> +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 >> +0200 >> @@ -1,3 +1,3 @@ >> /* >> - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights >> reserved. >> + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All >> rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> >> Thanks! 
>> >> /Robbin From robbin.ehn at oracle.com Wed Aug 10 07:47:10 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 10 Aug 2016 09:47:10 +0200 Subject: RFR(XXS): 8161026: GPL header missing comma in year In-Reply-To: <57AAD9C9.4040600@oracle.com> References: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> <57AAD9C9.4040600@oracle.com> Message-ID: <59808cf4-0084-ebb2-15b3-43f6174f68f5@oracle.com> Yes, thanks! /Robbin On 08/10/2016 09:37 AM, Claes Redestad wrote: > Ship it! > > /Claes > > On 2016-08-10 09:33, Robbin Ehn wrote: >> Hi, sorry wrong case in subject! >> >> /Robbin >> >> On 08/10/2016 09:30 AM, Robbin Ehn wrote: >>> Hi all, please review! >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 >>> >>> diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp >>> --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 >>> +0000 >>> +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 >>> +0200 >>> @@ -1,3 +1,3 @@ >>> /* >>> - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights >>> reserved. >>> + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights >>> reserved. >>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>> >>> Thanks! >>> >>> /Robbin From robbin.ehn at oracle.com Wed Aug 10 07:48:46 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 10 Aug 2016 09:48:46 +0200 Subject: RFR(XXS): 8161026: GPL header missing comma in year In-Reply-To: <72047b73-752b-5de2-ffb3-b5aa462ccc3f@oracle.com> References: <100da202-13f0-6e95-0936-f62f8a304510@oracle.com> <086d7329-73f6-0c2a-df9c-9520476319a2@oracle.com> <72047b73-752b-5de2-ffb3-b5aa462ccc3f@oracle.com> Message-ID: Thanks! /Robbin On 08/10/2016 09:38 AM, Stefan Johansson wrote: > Looks good. > > StefanJ > > On 2016-08-10 09:33, Robbin Ehn wrote: >> Hi, sorry wrong case in subject! >> >> /Robbin >> >> On 08/10/2016 09:30 AM, Robbin Ehn wrote: >>> Hi all, please review! 
>>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8161026 >>> >>> diff -r 14f97d7574bf src/share/vm/logging/logTagSet.hpp >>> --- a/src/share/vm/logging/logTagSet.hpp Mon Aug 08 15:53:02 2016 +0000 >>> +++ b/src/share/vm/logging/logTagSet.hpp Wed Aug 10 09:25:27 2016 +0200 >>> @@ -1,3 +1,3 @@ >>> /* >>> - * Copyright (c) 2015, 2016 Oracle and/or its affiliates. All rights reserved. >>> + * Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved. >>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>> >>> Thanks! >>> >>> /Robbin > From shafi.s.ahmad at oracle.com Wed Aug 10 08:34:44 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 10 Aug 2016 01:34:44 -0700 (PDT) Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 Message-ID: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Hi, Please review the code change for "JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968" to jdk8u. Please note this is a partial backport of http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 Summary: The Microsoft version of vsnprintf() behaves differently from the standard C version when there is not enough space in the buffer. The Microsoft version doesn't null-terminate its output under error conditions, whereas the standard C version does. On Windows, it returns -1. We handle both cases here, always returning -1 on truncation and always null-terminating the output.
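The behaviour described in the summary can be sketched as follows. This is only an illustration, not the code in the webrev: the names checked_vsnprintf and checked_snprintf are hypothetical, and the sketch assumes the caller always passes a non-empty buffer.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not the HotSpot function): format into buf, always
// leave buf null-terminated, and return -1 whenever the output did not fit.
static int checked_vsnprintf(char* buf, size_t len, const char* fmt, va_list args) {
  assert(buf != NULL && len > 0);
  int result = vsnprintf(buf, len, fmt, args);
  // Standard C: on truncation the return value is the length that would have
  // been written, so it is >= len, and the output is already terminated.
  // Microsoft version, per the summary above: returns -1 and may leave the
  // output unterminated.
  if (result < 0 || static_cast<size_t>(result) >= len) {
    buf[len - 1] = '\0';  // harmless for standard C, required for the other case
    return -1;            // one error value for both behaviours
  }
  return result;
}

// Variadic convenience wrapper around the helper above (also hypothetical).
static int checked_snprintf(char* buf, size_t len, const char* fmt, ...) {
  va_list args;
  va_start(args, fmt);
  int result = checked_vsnprintf(buf, len, fmt, args);
  va_end(args);
  return result;
}
```

With a wrapper of this shape, a caller only needs to test for a negative return value and can treat the buffer as a valid C string in either case.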
Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ Testing: jprt Regards, Shafi From harold.seigel at oracle.com Wed Aug 10 13:08:16 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 10 Aug 2016 09:08:16 -0400 Subject: Fwd: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class In-Reply-To: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> References: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> Message-ID: <30bd8b59-08f5-b09a-4273-1b5b3232aaaf@oracle.com> Hi, Please review these updated webrevs to fix bug JDK-8058575. http://cr.openjdk.java.net/~hseigel/bug_8058575.hs.2/ http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk.2/ The revised changes were tested as described below. Thanks, Harold -------- Forwarded Message -------- Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class Date: Wed, 3 Aug 2016 08:15:58 -0400 From: harold seigel Organization: Oracle Corporation To: Hotspot dev runtime Hi, Please review this fix for bug 8058575. The fix prevents a class created using Unsafe.defineAnonymousClass() from being in a different package than its host class. Being in different packages would create access problems if the packages were in different modules. With this fix, if the anonymous class is in a different package, then the JVM will throw IllegalArgumentException. If the anonymous class is in the unnamed package, then the JVM will move the anonymous class into its host class's package. JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575 Open webrevs: http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/ http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/ The fix was tested with the JCK Lang and VM tests, the hotspot, java/lang, java/util and other JTreg tests, the NSK quick tests, and with the RBT runtime nightly tests.
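The package rule described above can be modelled roughly as below. This is a standalone sketch and not the JVM code from the webrevs: package_of and resolve_anonymous_name are hypothetical names, class names are assumed to be in internal form ("pkg/Name"), and std::invalid_argument merely stands in for the IllegalArgumentException the JVM would throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Package part of an internal class name such as "java/lang/Object".
// An empty result means the unnamed package.
static std::string package_of(const std::string& internal_name) {
  std::string::size_type slash = internal_name.rfind('/');
  return slash == std::string::npos ? std::string()
                                    : internal_name.substr(0, slash);
}

// Hypothetical model of the rule: returns the name the anonymous class ends
// up with, or throws if it names a package other than the host's.
static std::string resolve_anonymous_name(const std::string& host,
                                          const std::string& anon) {
  assert(!host.empty() && !anon.empty());
  const std::string host_pkg = package_of(host);
  const std::string anon_pkg = package_of(anon);
  if (anon_pkg.empty()) {
    // Unnamed package: move the anonymous class into the host's package.
    return host_pkg.empty() ? anon : host_pkg + "/" + anon;
  }
  if (anon_pkg != host_pkg) {
    // Stand-in for IllegalArgumentException.
    throw std::invalid_argument(
        "anonymous class and host class are in different packages");
  }
  return anon;  // same package as the host: nothing to change
}
```

In this model, a host "p/Host" with an anonymous class "Anon" yields "p/Anon", while an anonymous class named "q/Anon" is rejected.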
Thanks, Harold From coleen.phillimore at oracle.com Wed Aug 10 13:22:04 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 10 Aug 2016 09:22:04 -0400 Subject: RFR 8079562: Crash in C2 compiled code with STATUS_FLOAT_MULTIPLE_TRAPS In-Reply-To: <029b035d-74d8-ac44-fa4b-08952f4ebf24@oracle.com> References: <2da35fee-4da1-7590-6841-f2cb1cc73d6a@oracle.com> <8a82ee56-3068-bcb4-5df5-8d20228fe05d@oracle.com> <0808479f-79f8-04a8-a614-08afa33f23a8@oracle.com> <0f3fc479-1aed-12e9-5029-6ee7d2875419@oracle.com> <029b035d-74d8-ac44-fa4b-08952f4ebf24@oracle.com> Message-ID: <51c1876c-a8f7-53f9-7e86-3b5864c399ca@oracle.com> This RFR is withdrawn. Coleen On 8/2/16 10:48 PM, Coleen Phillimore wrote: > > > Hi, David, That would be a better test, I guess. My goal here is to > fix this test from failing on windows 32 and salvage something out of > the existing test. We could simply delete the test and file an RFE > to test fpcw better. > > > On 8/2/16 10:34 PM, David Holmes wrote: >> Hi Coleen, >> >> So does this: >> >> setCW( (FPU_DEFAULT | FPU_IEM) & ~EM_ZERODIVIDE ); >> >> enable or disable hardware exceptions for FP-div-by-zero? > > I think it disables hardware exceptions. I guess Christian's going > to have to answer this tomorrow. >> >> To be honest I'm not sure this test serves any real point in its >> current form any more. A general cross-platform test for the >> AlwaysRestoreFPU flag would be: >> >> static native int getFPUCW(); >> static native void setFPUCW(int cw); >> >> test() { >> int default_cw = getFPUCW(); >> int new_cw = ~default_cw; >> setFPUCW(new_cw); >> assert(getFPU_CW() == default_cw, "AlwaysRestoreFPU didn't restore >> it!); >> } >> >> but that kind of misses the point about triggering unhandled C++ >> exceptions. And isn't getting to the bottom of the internal error >> that was seen. 
>> > > No, if that's required, I'll have to unassign myself from the bug, > quarantine the test, and save it for jdk10, because it might take a > long time to figure out, if we ever figure it out. I don't think > there's a product issue here at least from my reading of the situation. > >> Sorry. Feel free to ignore me. >> > > We never ignore you, David. > > Coleen > >> David >> >> >> On 3/08/2016 12:14 PM, Coleen Phillimore wrote: >>> >>> >>> On 8/2/16 9:32 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 3/08/2016 11:16 AM, Coleen Phillimore wrote: >>>>> >>>>> >>>>> On 8/2/16 8:35 PM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 3/08/2016 8:34 AM, Coleen Phillimore wrote: >>>>>>> Summary: Add AlwaysRestoreFPU and move the test to jtreg. >>>>>>> Contributed-by: myself and christian.tornqvist at oracle.com >>>>>>> >>>>>>> Christian moved the test, I did some cleanup and added >>>>>>> AlwaysRestoreFPU >>>>>>> to it. >>>>>> >>>>>> You modified the original C code such that this comment is no longer >>>>>> valid: >>>>>> >>>>>> 33 // Only valid for specific x86 based systems: linux-x86, or >>>>>> else >>>>>> x86 with fpu_control.h >>>>> >>>>> Thanks, David. I took out the rest of the comments but missed >>>>> this one. >>>>>> >>>>>> Not sure why you did that - we were running this test on more than >>>>>> just Windows, despite the naming. >>>>> >>>>> Actually, the #ifdefs were intended to only run on 32 bit linux as >>>>> well >>>>> as windows, but I don't think the ifdefs were correct, so the test >>>>> always printed that it was skipped on platforms other than >>>>> windows. The >>>>> test is only useful on windows. Modifying the fpcw didn't have an >>>>> effect on linux x86. Seems silly to run the test on platforms where >>>>> it's not testing anything. >>>> >>>> It's been a while but I've been involved with this test in the past >>>> and I thought it was doing similar testing on other x86 platforms. 
As >>>> you say the main point is to verify that messing with the FPU control >>>> word doesn't break anything so I'm not sure that is a reason not to >>>> have such a test run on linux. >>> >>> The original point of the test is unknown to me. The test was doing >>> something that can cause the vm to crash or get the wrong behavior. The >>> new point of the test is to test the AlwaysRestoreFPU flag. Since we >>> can >>> provoke a crash on windows 32 bit only, this seems like a good >>> restriction for the test. >>>> >>>>>> >>>>>> I'm also unclear exactly what changed such that we need to add >>>>>> AlwaysRestoreFPU ?? >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8076284 allegedly changed >>>>> the >>>>> behavior in 2015. >>>> >>>> I certainly can't make any connection between that bug and handling of >>>> the FPU control word. :( >>>> >>> >>> Me neither but the failure I got was from the divide by zero getting a >>> windows internal error, not the C2 generated code. Rereading the bug, >>> Vladimir thought it could have come from the VC++ compiler change and >>> not the fix. >>> >>>>> Setting floating point control word is something that we don't >>>>> promise >>>>> to work with generated code or the jvm code, which is why this >>>>> -XX:+AlwaysRestoreFPU flag was added. >>>> >>>> Understood, just puzzled about why this test suddenly requires it. >>>> Given the FPU mode used by the VM has not changed, why does the test >>>> now fail without this?? >>>> >>> >>> I don't know why it worked before. There could have been a library >>> change also that gets the internal error. It wasn't an unhandled >>> floating point exception. >>> >>>>>> >>>>>> Can you add @Summary info to the Java test (and/or other commentary) >>>>>> as it is far from clear exactly what this is testing. >>>>>> >>>>> >>>>> Yes, I can do that. 
How about: >>>>> >>>>> * @summary Test that modifying the floating point control word >>>>> doesn't >>>>> cause unhandled Windows floating point exceptions >>>> >>>> That's fine -thanks. Can you also document with a comment exactly what >>>> our modification of the FPU control word is intended to do and how >>>> that relates to the normal FPU mode used by the VM. This is something >>>> I always have to go looking for :) >>> >>> Okay, how about: >>> >>> * @summary Test that modifying the floating point control word doesn't >>> cause an internal >>> * error when turning off floating point exceptions for divide >>> by zero. >>> >>> See new webrev for comments about the default. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8079562.02/webrev >>> >>> >>> Again, adding the flag is the solution that we give customers. We also >>> test for changing the fpcw with -Xcheck:jni. >>> >>> Thanks, >>> Coleen >>> >>>> >>>>> >>>>> Can summary lines be > 1 line or do they have to be one line? >>>> >>>> A tag runs until the next tag is encountered so the summary can span >>>> multiple lines. >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8079562.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8079562 >>>>>>> >>>>>>> Tested with JPRT. 
>>>>>>> >>>>>>> thanks, >>>>>>> Coleen >>>>> >>> > From david.holmes at oracle.com Thu Aug 11 04:21:34 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 11 Aug 2016 14:21:34 +1000 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <740a30c4-ceca-4ede-81b5-853a4c732070@default> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: Hi Shafi, On 10/08/2016 6:34 PM, Shafi Ahmad wrote: > Hi, > > Please review the code change for "JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968" to jdk8u. > Please note this is partial backport of http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > > Summary: > Microsoft version of vsnprintf() behaves differently from the standard C version when there is not enough space in the buffer. > Microsoft version doesn't null terminates its output under error conditions, whereas the standard C version does. On Windows, it returns -1. > We handle both cases here and always return -1, and perform null termination. This looks fine to me. Thanks, David > Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 > Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ > > Testing: jprt > > Regards, > Shafi > From shafi.s.ahmad at oracle.com Thu Aug 11 05:50:12 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 10 Aug 2016 22:50:12 -0700 (PDT) Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: <3de08ed1-5f2b-46dd-8286-034745b1a188@default> Thanks David for reviewing it. 
Regards, Shafi > -----Original Message----- > From: David Holmes > Sent: Thursday, August 11, 2016 9:52 AM > To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968 > > Hi Shafi, > > On 10/08/2016 6:34 PM, Shafi Ahmad wrote: > > Hi, > > > > Please review the code change for "JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968" to jdk8u. > > Please note this is partial backport of > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > > > > Summary: > > Microsoft version of vsnprintf() behaves differently from the standard C > version when there is not enough space in the buffer. > > Microsoft version doesn't null terminates its output under error conditions, > whereas the standard C version does. On Windows, it returns -1. > > We handle both cases here and always return -1, and perform null > termination. > > This looks fine to me. > > Thanks, > David > > > Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 > > Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ > > > > Testing: jprt > > > > Regards, > > Shafi > > From shafi.s.ahmad at oracle.com Thu Aug 11 12:14:08 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Thu, 11 Aug 2016 05:14:08 -0700 (PDT) Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: Hi, Could I get one more review for this safe change. 
Regards, Shafi > -----Original Message----- > From: David Holmes > Sent: Thursday, August 11, 2016 9:52 AM > To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968 > > Hi Shafi, > > On 10/08/2016 6:34 PM, Shafi Ahmad wrote: > > Hi, > > > > Please review the code change for "JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968" to jdk8u. > > Please note this is partial backport of > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > > > > Summary: > > Microsoft version of vsnprintf() behaves differently from the standard C > version when there is not enough space in the buffer. > > Microsoft version doesn't null terminates its output under error conditions, > whereas the standard C version does. On Windows, it returns -1. > > We handle both cases here and always return -1, and perform null > termination. > > This looks fine to me. > > Thanks, > David > > > Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 > > Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ > > > > Testing: jprt > > > > Regards, > > Shafi > > From dmitry.samersoff at oracle.com Thu Aug 11 16:55:55 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 11 Aug 2016 19:55:55 +0300 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> Message-ID: <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> David, Please see updated webrev. http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ I didn't touch the Windows version because it is quite different from the *NIX one.
-Dmitry On 2016-08-08 02:40, David Holmes wrote: > Hi Dmitry, > > On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >> Everybody, >> >> Please review the fix: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >> >> Problem: >> Tests fail intermittently because it can't attach to child process, >> these attach failures is hard to debug because attach framework >> doesn't provide enough diagnostic information. >> >> Solution: >> >> a) Increase attach timeout >> b) Slightly change attach loop to save a bit of CPU power. >> c) Add some logging to attach listener. >> >> It's just a first step in this direction. Complete cleanup of attach >> code (remove LinuxThreads support and convert all printing to UL) is not >> a goal of this fix - I'll file a separate CR for it. > > I still think you need more logging now to aid in debugging these cases. > In particular we want to be able to verify that the path of the attach > file is what we expect in all cases ie whether we find the .attach_pid > file in cwd or whether we are looking in temp directory, and whether we > ultimately succeed or fail. > > Plus whatever you do now should be done consistently for all platforms. > > Thanks, > David > >> -Dmitry >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From daniel.daugherty at oracle.com Thu Aug 11 17:57:10 2016 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 11 Aug 2016 11:57:10 -0600 Subject: RFR(XXXS): 8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java since it hangs intermittently Message-ID: Greetings, I'm quarantining a test that hung in the 2016-08-09 and 2016-08-10 JDK9-hs nightlies: JDK-8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java since it hangs intermittently https://bugs.openjdk.java.net/browse/JDK-8163879 $ hg diff diff -r 14f97d7574bf test/serviceability/sa/sadebugd/SADebugDTest.java --- a/test/serviceability/sa/sadebugd/SADebugDTest.java Mon Aug 08 15:53:02 2016 +0000 +++ b/test/serviceability/sa/sadebugd/SADebugDTest.java Thu Aug 11 10:48:52 2016 -0700 @@ -28,6 +28,7 @@ * @modules java.base/jdk.internal.misc * @library /test/lib/share/classes * + * @ignore 8163805 * @run main/othervm SADebugDTest */ import java.io.File; This change falls under the HotSpot Trivial Change rule so I'm looking for a single (R)eviewer. Thanks, in advance, for any comments, suggestions or questions! Dan From harold.seigel at oracle.com Thu Aug 11 18:59:10 2016 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 11 Aug 2016 14:59:10 -0400 Subject: RFR(XXXS): 8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java since it hangs intermittently In-Reply-To: References: Message-ID: Looks good! Harold On 8/11/2016 1:57 PM, Daniel D. 
Daugherty wrote: > Greetings, > > I'm quarantining a test that hung in the 2016-08-09 and > 2016-08-10 JDK9-hs nightlies: > > JDK-8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java > since it hangs intermittently > https://bugs.openjdk.java.net/browse/JDK-8163879 > > > $ hg diff > diff -r 14f97d7574bf test/serviceability/sa/sadebugd/SADebugDTest.java > --- a/test/serviceability/sa/sadebugd/SADebugDTest.java Mon Aug 08 > 15:53:02 2016 +0000 > +++ b/test/serviceability/sa/sadebugd/SADebugDTest.java Thu Aug 11 > 10:48:52 2016 -0700 > @@ -28,6 +28,7 @@ > * @modules java.base/jdk.internal.misc > * @library /test/lib/share/classes > * > + * @ignore 8163805 > * @run main/othervm SADebugDTest > */ > import java.io.File; > > > This change falls under the HotSpot Trivial Change rule so I'm > looking for a single (R)eviewer. > > Thanks, in advance, for any comments, suggestions or questions! > > Dan From daniel.daugherty at oracle.com Thu Aug 11 19:27:30 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 11 Aug 2016 13:27:30 -0600 Subject: RFR(XXXS): 8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java since it hangs intermittently In-Reply-To: References: Message-ID: <5d5989dc-9f3b-5c4c-cfb5-9b355899a8b2@oracle.com> Added back serviceability-dev at ... The "reply to list" option usually only replies to one of the lists when more than one is specified... One of the few cases where "reply to all" is useful... :-) Thanks Harold! Dan On 8/11/16 12:59 PM, harold seigel wrote: > Looks good! > > Harold > > > On 8/11/2016 1:57 PM, Daniel D. 
Daugherty wrote: >> Greetings, >> >> I'm quarantining a test that hung in the 2016-08-09 and >> 2016-08-10 JDK9-hs nightlies: >> >> JDK-8163879 quarantine serviceability/sa/sadebugd/SADebugDTest.java >> since it hangs intermittently >> https://bugs.openjdk.java.net/browse/JDK-8163879 >> >> >> $ hg diff >> diff -r 14f97d7574bf test/serviceability/sa/sadebugd/SADebugDTest.java >> --- a/test/serviceability/sa/sadebugd/SADebugDTest.java Mon Aug 08 >> 15:53:02 2016 +0000 >> +++ b/test/serviceability/sa/sadebugd/SADebugDTest.java Thu Aug 11 >> 10:48:52 2016 -0700 >> @@ -28,6 +28,7 @@ >> * @modules java.base/jdk.internal.misc >> * @library /test/lib/share/classes >> * >> + * @ignore 8163805 >> * @run main/othervm SADebugDTest >> */ >> import java.io.File; >> >> >> This change falls under the HotSpot Trivial Change rule so I'm >> looking for a single (R)eviewer. >> >> Thanks, in advance, for any comments, suggestions or questions! >> >> Dan > From coleen.phillimore at oracle.com Thu Aug 11 20:39:25 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 11 Aug 2016 16:39:25 -0400 Subject: RFR(S): JDK-8146697 : VM crashes in test Test7005594 In-Reply-To: References: Message-ID: Yes, I think this fix is good for robustness, given the difficulty and time needed to find the root cause. Thanks, Coleen On 8/8/16 10:55 AM, Frederic Parain wrote: > Greetings, > > Please review this small fix for JDK-8146697 > https://bugs.openjdk.java.net/browse/JDK-8146697 > > Summary: The JVM sometimes tries to re-enable the Reserved Stack Area > while it is currently not disabled, leading to the following assertion > failure: > > share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state != > stack_guard_enabled) failed: already enabled > > This problem occurred while running different tests including tests > where stack overflows are unlikely. It is rare and very hard to > reproduce. 
At the beginning of the investigation, I've been able to > reproduce it three times out of 1,000+ runs of metaspace stress test > (the fact that was is a metaspace test doesn't matter). But once I've > instrumented the JVM, the bug didn't show up again, even after 30,000+ > runs. > > So, I've investigated it with the limited material I had. The failures > always occurred on x86/32bits platforms. > Regarding that some failures occurred on tests where stack overflows are > unlikely (no recursive calls, small call stack), and that all failures > occurred in interpreted Java code, my guess is that the issue is in the > test performed on interpreted method exit to determine if the Reserved > Stack Area should be enabled or not. > > The test on method exit compares the SP of the caller frame to an > activation SP address stored in the JavaThread object when the Reserved > Stack Area has been disabled. Without a reproducible test case, I've not > been able to find what was the issue between the two values (de-opt, > OSR, other?). So, I've slightly changed the test to make it more robust > against the situation causing the assertion failure. Now the test checks > the status of the guard pages, and if no guard pages have been disabled, > the method exits normally. This means there's always only one test on > interpreted method exit if Reserved Stack Area has not been used, so no > difference on performances for most cases. If this first test detects > that guard pages have been disabled, then the previous test (caller SP > vs activation SP) is performed, to determine if this is the place where > the Reserved Stack Area should be re-enabled or not. > > Even if the root cause of the bug is still unknown, the fix should make > the code more robust and prevent unnecessary re-enabling of the Reserved > Stack Area. 
> > Webrev: > http://cr.openjdk.java.net/~fparain/8146697/webrev.00/ > > Thank you, > > Fred From karen.kinnear at oracle.com Thu Aug 11 21:07:07 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 11 Aug 2016 17:07:07 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles Message-ID: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> Please review: https://bugs.openjdk.java.net/browse/JDK-8163808 http://cr.openjdk.java.net/~acorn/8163808.hs/webrev Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. Implementation needed to check this in three places, not just one. Vtable size calculation is only exact for later classfile versions. Also fixed vtable logging output - since the method name-and-sig printing was changed to also print the holder's class name, we do not need to print the holder's class name separately - it was printing twice. Testing: linux-x64-slowdebug rbt hs-nightly-runtime.js jck vm,lang, api.java.lang small invocation tests thanks, Karen From david.holmes at oracle.com Thu Aug 11 23:14:26 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Aug 2016 09:14:26 +1000 Subject: RFR(S): JDK-8146697 : VM crashes in test Test7005594 In-Reply-To: References: Message-ID: Hi Fred, Thanks for explaining things to me offline. This change seems fine in making the code more robust. Thanks, David On 9/08/2016 12:55 AM, Frederic Parain wrote: > Greetings, > > Please review this small fix for JDK-8146697 > https://bugs.openjdk.java.net/browse/JDK-8146697 > > Summary: The JVM sometimes tries to re-enable the Reserved Stack Area > while it is currently not disabled, leading to the following assertion > failure: > > share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state != > stack_guard_enabled) failed: already enabled > > This problem occurred while running different tests including tests > where stack overflows are unlikely.
It is rare and very hard to > reproduce. At the beginning of the investigation, I've been able to > reproduce it three times out of 1,000+ runs of metaspace stress test > (the fact that was is a metaspace test doesn't matter). But once I've > instrumented the JVM, the bug didn't show up again, even after 30,000+ > runs. > > So, I've investigated it with the limited material I had. The failures > always occurred on x86/32bits platforms. > Regarding that some failures occurred on tests where stack overflows are > unlikely (no recursive calls, small call stack), and that all failures > occurred in interpreted Java code, my guess is that the issue is in the > test performed on interpreted method exit to determine if the Reserved > Stack Area should be enabled or not. > > The test on method exit compares the SP of the caller frame to an > activation SP address stored in the JavaThread object when the Reserved > Stack Area has been disabled. Without a reproducible test case, I've not > been able to find what was the issue between the two values (de-opt, > OSR, other?). So, I've slightly changed the test to make it more robust > against the situation causing the assertion failure. Now the test checks > the status of the guard pages, and if no guard pages have been disabled, > the method exits normally. This means there's always only one test on > interpreted method exit if Reserved Stack Area has not been used, so no > difference on performances for most cases. If this first test detects > that guard pages have been disabled, then the previous test (caller SP > vs activation SP) is performed, to determine if this is the place where > the Reserved Stack Area should be re-enabled or not. > > Even if the root cause of the bug is still unknown, the fix should make > the code more robust and prevent unnecessary re-enabling of the Reserved > Stack Area. 
> > Webrev: > http://cr.openjdk.java.net/~fparain/8146697/webrev.00/ > > Thank you, > > Fred From chris.plummer at oracle.com Thu Aug 11 23:20:08 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 11 Aug 2016 16:20:08 -0700 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: Hi Shafi, Please update the copyright date to 2016 and change "numbers of" to "number of". I'm not so sure I agree with the comments in the CR that you can just backport this change to vsnprintf(), but not the other changes in the relevant changeset. For example: --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 @@ -705,25 +705,35 @@ } void JDK_Version::to_string(char* buffer, size_t buflen) const { + assert(buffer && buflen > 0, "call with useful buffer"); size_t index = 0; + if (!is_valid()) { jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); } else if (is_partially_initialized()) { jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); } else { - index += jio_snprintf( + int rc = jio_snprintf( &buffer[index], buflen - index, "%d.%d", _major, _minor); + if (rc == -1) return; + index += rc; if (_micro > 0) { - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); } I think your change to vsnprintf() will break JDK_Version::to_string() if the above diff is not applied. You could argue that the above code is already broken because -1 could be returned to it on Windows. However, your changes expand that risk to all platforms. cheers, Chris On 8/11/16 5:14 AM, Shafi Ahmad wrote: > Hi, > > Could I get one more review for this safe change.
> > Regards, > Shafi > >> -----Original Message----- >> From: David Holmes >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8162419: >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >>> Hi, >>> >>> Please review the code change for "JDK-8162419: >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >> 8155968" to jdk8u. >>> Please note this is partial backport of >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>> Summary: >>> Microsoft version of vsnprintf() behaves differently from the standard C >> version when there is not enough space in the buffer. >>> Microsoft version doesn't null terminates its output under error conditions, >> whereas the standard C version does. On Windows, it returns -1. >>> We handle both cases here and always return -1, and perform null >> termination. >> >> This looks fine to me. >> >> Thanks, >> David >> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>> >>> Testing: jprt >>> >>> Regards, >>> Shafi >>> From david.holmes at oracle.com Fri Aug 12 00:24:58 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Aug 2016 10:24:58 +1000 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> Message-ID: <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> Hi Dmitry, On 12/08/2016 2:55 AM, Dmitry Samersoff wrote: > David, > > Please see updated webrev. 
> > http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ > > I didn't touch windows version because it quite different from *NIX one. Do we ever see failures on Windows? If so we should add diagnostics there too even if they are different to *NIX. I would still like to see what file it is working with. We need some logging in here: bool AttachListener::is_init_trigger() { if (init_at_startup() || is_initialized()) { return false; // initialized at startup or already initialized } char fn[PATH_MAX+1]; sprintf(fn, ".attach_pid%d", os::current_process_id()); int ret; struct stat64 st; RESTARTABLE(::stat64(fn, &st), ret); if (ret == -1) { + log("failed to find attach file: %s, trying alternate", fn); snprintf(fn, sizeof(fn), "%s/.attach_pid%d", os::get_temp_directory(), os::current_process_id()); RESTARTABLE(::stat64(fn, &st), ret); + if (ret == -1) { + log("failed to find attach file: %s", fn); + } } All failure paths need to show us what it was that failed. typos: trigerred -> triggered Thanks, David > -Dmitry > > On 2016-08-08 02:40, David Holmes wrote: >> Hi Dmitry, >> >> On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >>> Everybody, >>> >>> Please review the fix: >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >>> >>> Problem: >>> Tests fail intermittently because it can't attach to child process, >>> these attach failures is hard to debug because attach framework >>> doesn't provide enough diagnostic information. >>> >>> Solution: >>> >>> a) Increase attach timeout >>> b) Slightly change attach loop to save a bit of CPU power. >>> c) Add some logging to attach listener. >>> >>> It's just a first step in this direction. Complete cleanup of attach >>> code (remove LinuxThreads support and convert all printing to UL) is not >>> a goal of this fix - I'll file a separate CR for it. >> >> I still think you need more logging now to aid in debugging these cases.
>> In particular we want to be able to verify that the path of the attach >> file is what we expect in all cases ie whether we find the .attach_pid >> file in cwd or whether we are looking in temp directory, and whether we >> ultimately succeed or fail. >> >> Plus whatever you do now should be done consistently for all platforms. >> >> Thanks, >> David >> >>> -Dmitry >>> > > From david.holmes at oracle.com Fri Aug 12 00:42:04 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Aug 2016 10:42:04 +1000 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: <96958618-78e9-5ab4-102a-da26e7fd372c@oracle.com> Hi Chris, On 12/08/2016 9:20 AM, Chris Plummer wrote: > Hi Shafi, > > Please update the copyright date to 2016 and change "numbers of" to > "number of". > > I'm not so sure I agree with the comments in the CR that you can just > backport this change to vsnprintf(), but not the other changes in the > relevant changeset. 
For example: > > --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 > +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 > @@ -705,25 +705,35 @@ > } > > void JDK_Version::to_string(char* buffer, size_t buflen) const { > + assert(buffer && buflen > 0, "call with useful buffer"); > size_t index = 0; > + > if (!is_valid()) { > jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); > } else if (is_partially_initialized()) { > jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); > } else { > - index += jio_snprintf( > + int rc = jio_snprintf( > &buffer[index], buflen - index, "%d.%d", _major, _minor); > + if (rc == -1) return; > + index += rc; > if (_micro > 0) { > - index += jio_snprintf(&buffer[index], buflen - index, ".%d", > _micro); > + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > } > > I think your change to vsnprintf() will break JDK_Version::to_string() > if the above diff if not applied. You could argue that the above code is > already broken because -1 is could be returned to it on Windows. > However, your changes expand that risk to all platforms. Unless I am missing something that risk only exists if there are absurd values in the version components. That would only affect someone building with their own version values defined. So yes this code could also be more robust, but it isn't essential for fixing the issue at hand that needed fixing - which was the missing nul-terminator when a too-short buffer is passed in. Cheers, David > cheers, > > Chris > > On 8/11/16 5:14 AM, Shafi Ahmad wrote: >> Hi, >> >> Could I get one more review for this safe change. 
>> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Thursday, August 11, 2016 9:52 AM >>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: [8u] RFR for JDK-8162419: >>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>> 8155968 >>> >>> Hi Shafi, >>> >>> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >>>> Hi, >>>> >>>> Please review the code change for "JDK-8162419: >>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>> 8155968" to jdk8u. >>>> Please note this is partial backport of >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>> Summary: >>>> Microsoft version of vsnprintf() behaves differently from the > standard C >>> version when there is not enough space in the buffer. >>>> Microsoft version doesn't null terminates its output under error > conditions, >>> whereas the standard C version does. On Windows, it returns -1. >>>> We handle both cases here and always return -1, and perform null >>> termination. >>> >>> This looks fine to me. >>> >>> Thanks, >>> David >>> >>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>> >>>> Testing: jprt >>>> >>>> Regards, >>>> Shafi >>>> > > From daniel.daugherty at oracle.com Fri Aug 12 01:25:39 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 11 Aug 2016 19:25:39 -0600 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <047f996d-51b2-8993-6439-a32e6d5c7908@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> <047f996d-51b2-8993-6439-a32e6d5c7908@oracle.com> Message-ID: <82648eab-d84a-a112-2862-4f905ed772d4@oracle.com> David, Sorry I forgot to respond before I left for Santa Fe, NM... More below... 
On 8/8/16 5:57 PM, David Holmes wrote: > Hi Dan, > > Thanks for the review. > > On 9/08/2016 2:07 AM, Daniel D. Daugherty wrote: >> On 8/4/16 8:28 PM, David Holmes wrote: >>> Hi Volker, >>> >>> Thanks for looking at this. >>> >>> On 5/08/2016 1:48 AM, Volker Simonis wrote: >>>> Hi David, >>>> >>>> thanks for doing this change on all platforms. >>>> The fix looks good. Maybe you can just extend the following comment >>>> with >>>> something like: >>>> >>>> // Note that the SR_lock plays no role in this suspend/resume >>>> protocol. >>>> // It is only used in SR_handler as a thread termination >>>> indicator if >>>> NULL. >>> >>> Darn this code is confusing - too many "SR"'s :( I have added >>> >>> // Note that the SR_lock plays no role in this suspend/resume >>> protocol, >>> // but is checked for NULL in SR_handler as a thread termination >>> indicator. >>> >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ >> >> src/share/vm/runtime/thread.cpp >> L380: _SR_lock = NULL; >> I was expecting the _SR_lock to be freed and NULL'ed earlier >> based on the discussion in the bug report. Since the crashing >> assert() happens in a race between the JavaThread destructor >> the NULL'ing of the _SR_lock field, I was expecting the _SR_lock >> field to be dealt with as early as possible in the Thread >> destructor (or even earlier; see my last comment). > > I will respond after that comment. > >> src/os/linux/vm/os_linux.cpp >> L4010: // mask is changed as part of thread termination. Check the >> current thread >> grammar?: "Check the current" -> "Check that the current" > > Will change. > >> L4015: if (thread->SR_lock() == NULL) >> L4016: return; >> style nit: multi-line if-statements require '{' and '}' >> Please add the braces or make this a single line if-statement. >> I would prefer the braces. :-) > > Will fix. 
> >> Isn't there still a window between the completion of the >> JavaThread destructor and where the Thread destructor sets >> _SR_lock = NULL? > > See below. > >> L4020: OSThread* osthread = thread->osthread(); >> Not your bug. This code assumes that osthread != NULL. >> Maybe it needs to be more robust. > > Depends what kind of impossibilities we want to guard against. :) > There should be no possible way a signal can be sent to a thread that > doesn't even have a osThread as it means we never successfully > started/attached the thread. That's a really good point. I'm good with what's there for osthread. > >> src/os/aix/vm/os_aix.cpp >> L2731: if (thread->SR_lock() == NULL) >> L2732: return; >> Same style nit. >> >> Same race. >> >> L2736: OSThread* osthread = thread->osthread(); >> Same robustness comment. >> >> src/os/bsd/vm/os_bsd.cpp >> L2759: if (thread->SR_lock() == NULL) >> L2760: return; >> Same style nit. >> >> Same race. >> >> L2764: OSThread* osthread = thread->osthread(); >> Same robustness comment. >> >> It has been a very long time since I've dealt with races in the >> suspend/resume code so I'm probably very rusty with this code. >> If the _SR_lock is only used by the JavaThread suspend/resume >> protocol, then we could consider free'ing and NULL'ing the field >> in the JavaThread destructor (as the last piece of work). >> >> That should eliminate the race that was being observed by the >> SR_handler() in this bug. It will open a very small race where >> is_Java_thread() can return true, the _SR_lock field is !NULL, >> but the _SR_lock has been deleted. > > Given that it should have been impossible to get into the SR_handler > in the first place from this code I was trying to minimize the > disruption to the existing logic. Moving the delete/NULLing to just > before the call to os::free_thread() fixes the crashes that had been > observed. I was not trying to make the entire destruction sequence > safe wrt. the SR_handler. 
I suspect it is the combination of 1) NULLing the _SR_lock as a sentinel and 2) doing that before the more expensive os::free_thread() call that results in the change in behavior. > My major concern with deleting the SR_lock much earlier is the > potential race condition that I have previously outlined in: > > https://bugs.openjdk.java.net/browse/JDK-8152849 > > where there is no protection against a target thread terminating. The > sooner it terminates and deletes the SR_lock the more likely we may > attempt to lock a deleted lock! Ah yes... thanks for the reminder. We have seen a few of those in the past where we're racing to grab the _SR_lock and Elvis is trying to leave the building... I'm good with just the minor changes you agreed to make above. I don't think I need to see a new webrev for the above edits. Thumbs up! Dan > > Thanks, > David > >> Dan >> >> >>> >>> This also reminded me to follow up on why the Solaris SR_handler is >>> different and I found it is not actually installed as a direct signal >>> handler, but is called from the real signal handler if dealing with a >>> JavaThread or the VMThread. Consequently the Solaris version of the >>> SR_handler can not encounter this specific bug and so I have reverted >>> the changes to os_solaris.cpp >>> >>> Thanks, >>> David >>> >>> >>>> Regards, >>>> Volker >>>> >>>> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes >>> > wrote: >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >>>> >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >>>> >>>> >>>> The suspend/resume signal (SR_signum) is never sent to a thread >>>> once >>>> it has started to terminate. On one platform (SuSE 12) we have >>>> seen >>>> what appears to be a "stuck" signal, which is only delivered when >>>> the terminating thread restores its original signal mask (as if >>>> pthread_sigmask makes the system realize there is a pending >>>> signal - >>>> we already check the signal was not blocked). 
At this point in the >>>> thread termination we have freed the osthread, so the the >>>> SR_handler >>>> would access deallocated memory. In debug builds we first hit an >>>> assertion that the current thread is a JavaThread or the >>>> VMThread - >>>> that assertion fails, even though it is a JavaThread, because we >>>> have already executed the ~JavaThread destructor and inside the >>>> ~Thread destructor we are a plain Thread not a JavaThread. >>>> >>>> The fix was to make a small adjustment to the thread termination >>>> process so that we delete the SR_lock before calling >>>> os::free_thread(). In the SR_handler() we can then use a NULL >>>> check >>>> of SR_lock() to indicate the thread has terminated and we return. >>>> >>>> While only seen on Linux I took the opportunity to apply the >>>> fix on >>>> all platforms and also cleaned up the code where we were using >>>> Thread::current() unsafely in a signal-handling context. >>>> >>>> Testing: regular tier 1 (JPRT) >>>> Kitchensink (in progress) >>>> >>>> As we can't readily reproduce the problem I tested this by >>>> having a >>>> terminating thread raise SR_signum directly from within the >>>> ~Thread >>>> destructor. >>>> >>>> Thanks, >>>> David >>>> >>>> >> From chris.plummer at oracle.com Fri Aug 12 03:05:46 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 11 Aug 2016 20:05:46 -0700 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <96958618-78e9-5ab4-102a-da26e7fd372c@oracle.com> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <96958618-78e9-5ab4-102a-da26e7fd372c@oracle.com> Message-ID: <6dc528b1-e60e-02a3-2168-6bcc3ae7074d@oracle.com> On 8/11/16 5:42 PM, David Holmes wrote: > Hi Chris, > > On 12/08/2016 9:20 AM, Chris Plummer wrote: >> Hi Shafi, >> >> Please update the copyright date to 2016 and change "numbers of" to >> "number of". 
>> >> I'm not so sure I agree with the comments in the CR that you can just >> backport this change to vsnprintf(), but not the other changes in the >> relevant changeset. For example: >> >> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >> @@ -705,25 +705,35 @@ >> } >> >> void JDK_Version::to_string(char* buffer, size_t buflen) const { >> + assert(buffer && buflen > 0, "call with useful buffer"); >> size_t index = 0; >> + >> if (!is_valid()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >> } else if (is_partially_initialized()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >> } else { >> - index += jio_snprintf( >> + int rc = jio_snprintf( >> &buffer[index], buflen - index, "%d.%d", _major, _minor); >> + if (rc == -1) return; >> + index += rc; >> if (_micro > 0) { >> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", >> _micro); >> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >> } >> >> I think your change to vsnprintf() will break JDK_Version::to_string() >> if the above diff if not applied. You could argue that the above code is >> already broken because -1 is could be returned to it on Windows. >> However, your changes expand that risk to all platforms. > > Unless I am missing something that risk only exists if there are > absurd values in the version components. That would only affect > someone building with their own version values defined. Hi David, That was just the first example I found of an affected jio_snprintf(). I'm guessing there are others. > > So yes this code could also be more robust, but it isn't essential for > fixing the issue at hand that needed fixing - which was the missing > nul-terminator when a too-short buffer is passed in. So why not just fix the null termination, but not change the return value to -1. 
In fact the correct thing to do here would be to change the return value to be the actual number of bytes written. That should fix existing code like JDK_Version::to_string() that does not expect to see the -1. cheers, Chris > > Cheers, > David > >> cheers, >> >> Chris >> >> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >>> Hi, >>> >>> Could I get one more review for this safe change. >>> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Thursday, August 11, 2016 9:52 AM >>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: [8u] RFR for JDK-8162419: >>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>> 8155968 >>>> >>>> Hi Shafi, >>>> >>>> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >>>>> Hi, >>>>> >>>>> Please review the code change for "JDK-8162419: >>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>> 8155968" to jdk8u. >>>>> Please note this is partial backport of >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>>> Summary: >>>>> Microsoft version of vsnprintf() behaves differently from the >> standard C >>>> version when there is not enough space in the buffer. >>>>> Microsoft version doesn't null terminates its output under error >> conditions, >>>> whereas the standard C version does. On Windows, it returns -1. >>>>> We handle both cases here and always return -1, and perform null >>>> termination. >>>> >>>> This looks fine to me. 
>>>> >>>> Thanks, >>>> David >>>> >>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>>> >>>>> Testing: jprt >>>>> >>>>> Regards, >>>>> Shafi >>>>> >> >> From david.holmes at oracle.com Fri Aug 12 04:06:18 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Aug 2016 14:06:18 +1000 Subject: (S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread In-Reply-To: <82648eab-d84a-a112-2862-4f905ed772d4@oracle.com> References: <3c87ada2-71e3-ded3-33fb-e6dce6b12ee4@oracle.com> <17a8d9a5-e9da-ef69-8f14-81b46710eaf1@oracle.com> <047f996d-51b2-8993-6439-a32e6d5c7908@oracle.com> <82648eab-d84a-a112-2862-4f905ed772d4@oracle.com> Message-ID: <583699c2-62f2-f4e9-0839-4fced4b568fd@oracle.com> On 12/08/2016 11:25 AM, Daniel D. Daugherty wrote: > David, > > Sorry I forgot to respond before I left for Santa Fe, NM... > More below... No problem. > > On 8/8/16 5:57 PM, David Holmes wrote: >> Hi Dan, >> >> Thanks for the review. >> >> On 9/08/2016 2:07 AM, Daniel D. Daugherty wrote: >>> On 8/4/16 8:28 PM, David Holmes wrote: >>>> Hi Volker, >>>> >>>> Thanks for looking at this. >>>> >>>> On 5/08/2016 1:48 AM, Volker Simonis wrote: >>>>> Hi David, >>>>> >>>>> thanks for doing this change on all platforms. >>>>> The fix looks good. Maybe you can just extend the following comment >>>>> with >>>>> something like: >>>>> >>>>> // Note that the SR_lock plays no role in this suspend/resume >>>>> protocol. >>>>> // It is only used in SR_handler as a thread termination >>>>> indicator if >>>>> NULL. >>>> >>>> Darn this code is confusing - too many "SR"'s :( I have added >>>> >>>> // Note that the SR_lock plays no role in this suspend/resume >>>> protocol, >>>> // but is checked for NULL in SR_handler as a thread termination >>>> indicator. 
>>>> >>>> Updated webrev: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8159461/webrev.v2/ >>> >>> src/share/vm/runtime/thread.cpp >>> L380: _SR_lock = NULL; >>> I was expecting the _SR_lock to be freed and NULL'ed earlier >>> based on the discussion in the bug report. Since the crashing >>> assert() happens in a race between the JavaThread destructor >>> the NULL'ing of the _SR_lock field, I was expecting the _SR_lock >>> field to be dealt with as early as possible in the Thread >>> destructor (or even earlier; see my last comment). >> >> I will respond after that comment. >> >>> src/os/linux/vm/os_linux.cpp >>> L4010: // mask is changed as part of thread termination. Check the >>> current thread >>> grammar?: "Check the current" -> "Check that the current" >> >> Will change. >> >>> L4015: if (thread->SR_lock() == NULL) >>> L4016: return; >>> style nit: multi-line if-statements require '{' and '}' >>> Please add the braces or make this a single line if-statement. >>> I would prefer the braces. :-) >> >> Will fix. >> >>> Isn't there still a window between the completion of the >>> JavaThread destructor and where the Thread destructor sets >>> _SR_lock = NULL? >> >> See below. >> >>> L4020: OSThread* osthread = thread->osthread(); >>> Not your bug. This code assumes that osthread != NULL. >>> Maybe it needs to be more robust. >> >> Depends what kind of impossibilities we want to guard against. :) >> There should be no possible way a signal can be sent to a thread that >> doesn't even have a osThread as it means we never successfully >> started/attached the thread. > > That's a really good point. I'm good with what's there > for osthread. > > >> >>> src/os/aix/vm/os_aix.cpp >>> L2731: if (thread->SR_lock() == NULL) >>> L2732: return; >>> Same style nit. >>> >>> Same race. >>> >>> L2736: OSThread* osthread = thread->osthread(); >>> Same robustness comment. >>> >>> src/os/bsd/vm/os_bsd.cpp >>> L2759: if (thread->SR_lock() == NULL) >>> L2760: return; >>> Same style nit. 
>>> >>> Same race. >>> >>> L2764: OSThread* osthread = thread->osthread(); >>> Same robustness comment. >>> >>> It has been a very long time since I've dealt with races in the >>> suspend/resume code so I'm probably very rusty with this code. >>> If the _SR_lock is only used by the JavaThread suspend/resume >>> protocol, then we could consider free'ing and NULL'ing the field >>> in the JavaThread destructor (as the last piece of work). >>> >>> That should eliminate the race that was being observed by the >>> SR_handler() in this bug. It will open a very small race where >>> is_Java_thread() can return true, the _SR_lock field is !NULL, >>> but the _SR_lock has been deleted. >> >> Given that it should have been impossible to get into the SR_handler >> in the first place from this code I was trying to minimize the >> disruption to the existing logic. Moving the delete/NULLing to just >> before the call to os::free_thread() fixes the crashes that had been >> observed. I was not trying to make the entire destruction sequence >> safe wrt. the SR_handler. > > I suspect it is the combination of 1) NULLing the _SR_lock as a sentinel > and > 2) doing that before the more expensive os::free_thread() call that results > in the change in behavior. Right. The call to pthread_sigmask in os::free_thread is what appeared to un-jam the pending signal; so if we bail out before os::free_thread we avoid that. > >> My major concern with deleting the SR_lock much earlier is the >> potential race condition that I have previously outlined in: >> >> https://bugs.openjdk.java.net/browse/JDK-8152849 >> >> where there is no protection against a target thread terminating. The >> sooner it terminates and deletes the SR_lock the more likely we may >> attempt to lock a deleted lock! > > Ah yes... thanks for the reminder. We have seen a few of those in the > past where we're racing to grab the _SR_lock and Elvis is trying to > leave the building... 
> > I'm good with just the minor changes you agreed to make above. I don't > think I need to see a new webrev for the above edits. > > Thumbs up! Thanks! David ----- > Dan > > > >> >> Thanks, >> David >> >>> Dan >>> >>> >>>> >>>> This also reminded me to follow up on why the Solaris SR_handler is >>>> different and I found it is not actually installed as a direct signal >>>> handler, but is called from the real signal handler if dealing with a >>>> JavaThread or the VMThread. Consequently the Solaris version of the >>>> SR_handler can not encounter this specific bug and so I have reverted >>>> the changes to os_solaris.cpp >>>> >>>> Thanks, >>>> David >>>> >>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> On Wed, Aug 3, 2016 at 3:13 AM, David Holmes >>>> > wrote: >>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/ >>>>> >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8159461 >>>>> >>>>> >>>>> The suspend/resume signal (SR_signum) is never sent to a thread >>>>> once >>>>> it has started to terminate. On one platform (SuSE 12) we have >>>>> seen >>>>> what appears to be a "stuck" signal, which is only delivered when >>>>> the terminating thread restores its original signal mask (as if >>>>> pthread_sigmask makes the system realize there is a pending >>>>> signal - >>>>> we already check the signal was not blocked). At this point in the >>>>> thread termination we have freed the osthread, so the the >>>>> SR_handler >>>>> would access deallocated memory. In debug builds we first hit an >>>>> assertion that the current thread is a JavaThread or the >>>>> VMThread - >>>>> that assertion fails, even though it is a JavaThread, because we >>>>> have already executed the ~JavaThread destructor and inside the >>>>> ~Thread destructor we are a plain Thread not a JavaThread. >>>>> >>>>> The fix was to make a small adjustment to the thread termination >>>>> process so that we delete the SR_lock before calling >>>>> os::free_thread(). 
In the SR_handler() we can then use a NULL >>>>> check >>>>> of SR_lock() to indicate the thread has terminated and we return. >>>>> >>>>> While only seen on Linux I took the opportunity to apply the >>>>> fix on >>>>> all platforms and also cleaned up the code where we were using >>>>> Thread::current() unsafely in a signal-handling context. >>>>> >>>>> Testing: regular tier 1 (JPRT) >>>>> Kitchensink (in progress) >>>>> >>>>> As we can't readily reproduce the problem I tested this by >>>>> having a >>>>> terminating thread raise SR_signum directly from within the >>>>> ~Thread >>>>> destructor. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>> > From shafi.s.ahmad at oracle.com Fri Aug 12 05:21:35 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Thu, 11 Aug 2016 22:21:35 -0700 (PDT) Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> Message-ID: <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> Hi Chris, Thanks for reviewing. > -----Original Message----- > From: Chris Plummer > Sent: Friday, August 12, 2016 4:50 AM > To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes > Subject: Re: [8u] RFR for JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968 > > Hi Shafi, > > Please update the copyright date to 2016 and change "numbers of" to > "number of". > > I'm not so sure I agree with the comments in the CR that you can just > backport this change to vsnprintf(), but not the other changes in the relevant > changeset. 
For example: > > --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 > +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 > @@ -705,25 +705,35 @@ > } > > void JDK_Version::to_string(char* buffer, size_t buflen) const { > + assert(buffer && buflen > 0, "call with useful buffer"); > size_t index = 0; > + > if (!is_valid()) { > jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); > } else if (is_partially_initialized()) { > jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); > } else { > - index += jio_snprintf( > + int rc = jio_snprintf( > &buffer[index], buflen - index, "%d.%d", _major, _minor); > + if (rc == -1) return; > + index += rc; > if (_micro > 0) { > - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > } > > I think your change to vsnprintf() will break JDK_Version::to_string() if the > above diff is not applied. You could argue that the above code is already > broken because -1 could be returned to it on Windows. > However, your changes expand that risk to all platforms. I agree with you. I think I have to revisit at least every reference to jio_snprintf where we use the return value of this method.
shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name "*.cpp" -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "replay_pid%p_compid%d.log", os::current_process_id(), compile_id); ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "inline_pid%p_compid%d.log", os::current_process_id(), compile_id); ./src/share/vm/runtime/os.cpp: const int printed = jio_snprintf(buffer, buffer_length, iso8601_format, ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, buf_sz, "%d", os::current_process_id()); ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the buffer is not long enough to hold ./src/share/vm/runtime/java.cpp: index += jio_snprintf( ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "_%02d", _update); ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "%c", _special); ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "-b%02d", _build); ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "#%d", trap_state); ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "%s%s", ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s'", ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s' index='%d'", ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, len, "%s", (c != NULL) ? 
c : ""); ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, buflen, "%s: %s%s.%s%s", ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, message_len, "Could not link verifier"); ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" CURRENTAPPX, ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, ./src/share/vm/utilities/vmError.cpp: int fsep_len = jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, buflen, "%s%s", tmpdir, os::file_separator()); ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, bufferSize, "/cores"); I will resend the updated webrev. Jdk9:src/share/vm/runtime/java.cpp 714 int rc = jio_snprintf( 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); 716 if (rc == -1) return; 717 index += rc; 718 if (_security > 0) { 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); 720 } 721 if (_patch > 0) { 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); 723 if (rc == -1) return; 724 index += rc; 725 } After line# 719 we are not updating the index variable and hence if _security > 0 and _patch > 0 then in that case value of _security is getting overwritten by value of _patch in the buffer. Is this a bug or we are ignoring _security field, in that case this is redundant code? 
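A minimal sketch of the append pattern in question, under stated assumptions: `append` and `version_to_string` are hypothetical names, plain C99 `vsnprintf` stands in for `jio_snprintf`, and the field names only mirror the jdk9 snippet above. It shows why the index has to advance after every successful append; if the `.%d` for the security component did not advance it, the patch component would be written over the same bytes.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical helper (not HotSpot code): formats at buf[index] and advances
// index on success. Returns false, leaving index unchanged, if the text did
// not fit; the buffer is still NUL-terminated in that case.
static bool append(char* buf, size_t len, size_t& index, const char* fmt, ...) {
  assert(buf != nullptr && index < len);
  va_list args;
  va_start(args, fmt);
  int rc = vsnprintf(buf + index, len - index, fmt, args);
  va_end(args);
  if (rc < 0 || static_cast<size_t>(rc) >= len - index) {
    buf[len - 1] = '\0';  // guarantee termination even on error or truncation
    return false;
  }
  index += static_cast<size_t>(rc);  // the step missing after the security append
  return true;
}

// Hypothetical formatter: returns the number of characters written.
size_t version_to_string(char* buf, size_t len,
                         int major, int minor, int security, int patch) {
  assert(buf != nullptr && len > 0);
  size_t index = 0;
  if (!append(buf, len, index, "%d.%d", major, minor)) return index;
  if (security > 0 && !append(buf, len, index, ".%d", security)) return index;
  if (patch > 0) append(buf, len, index, ".%d", patch);
  return index;
}
```

Folding the return-code check and the index update into one helper means no call site can perform one without the other.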
Please note that the _security field does not exist in the jdk8 code. Regards, Shafi > cheers, > > Chris > > On 8/11/16 5:14 AM, Shafi Ahmad wrote: > > Hi, > > > > Could I get one more review for this safe change? > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Thursday, August 11, 2016 9:52 AM > >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8162419: > >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 > >> > >> Hi Shafi, > >> > >> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: > >>> Hi, > >>> > >>> Please review the code change for "JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968" to jdk8u. > >>> Please note this is a partial backport of http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > >>> Summary: > >>> Microsoft version of vsnprintf() behaves differently from the standard C version when there is not enough space in the buffer. > >>> Microsoft version doesn't null-terminate its output under error conditions, whereas the standard C version does. On Windows, it returns -1. > >>> We handle both cases here and always return -1, and perform null termination. > >> > >> This looks fine to me.
> >> > >> Thanks, > >> David > >> > >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 > >>> Webrev link: > http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ > >>> > >>> Testing: jprt > >>> > >>> Regards, > >>> Shafi > >>> > > From david.holmes at oracle.com Fri Aug 12 06:07:36 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Aug 2016 16:07:36 +1000 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <6dc528b1-e60e-02a3-2168-6bcc3ae7074d@oracle.com> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <96958618-78e9-5ab4-102a-da26e7fd372c@oracle.com> <6dc528b1-e60e-02a3-2168-6bcc3ae7074d@oracle.com> Message-ID: <4faa43b6-bf3e-bd97-b960-ae18ebaf4ded@oracle.com> On 12/08/2016 1:05 PM, Chris Plummer wrote: > On 8/11/16 5:42 PM, David Holmes wrote: >> Hi Chris, >> >> On 12/08/2016 9:20 AM, Chris Plummer wrote: >>> Hi Shafi, >>> >>> Please update the copyright date to 2016 and change "numbers of" to >>> "number of". >>> >>> I'm not so sure I agree with the comments in the CR that you can just >>> backport this change to vsnprintf(), but not the other changes in the >>> relevant changeset. 
For example: >>> >>> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >>> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >>> @@ -705,25 +705,35 @@ >>> } >>> >>> void JDK_Version::to_string(char* buffer, size_t buflen) const { >>> + assert(buffer && buflen > 0, "call with useful buffer"); >>> size_t index = 0; >>> + >>> if (!is_valid()) { >>> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >>> } else if (is_partially_initialized()) { >>> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >>> } else { >>> - index += jio_snprintf( >>> + int rc = jio_snprintf( >>> &buffer[index], buflen - index, "%d.%d", _major, _minor); >>> + if (rc == -1) return; >>> + index += rc; >>> if (_micro > 0) { >>> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", >>> _micro); >>> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >>> } >>> >>> I think your change to vsnprintf() will break JDK_Version::to_string() >>> if the above diff if not applied. You could argue that the above code is >>> already broken because -1 is could be returned to it on Windows. >>> However, your changes expand that risk to all platforms. >> >> Unless I am missing something that risk only exists if there are >> absurd values in the version components. That would only affect >> someone building with their own version values defined. > Hi David, > > That was just the first example I found of an affected jio_snprintf(). > I'm guessing there are others. AFAICS there are no other modified uses in: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 If this change could have affected other uses of jio_snprintf then I would have expected them to be covered in that patch. If it didn't do the right thing and that is a bug in the 9 code then it needs to be flagged in 9, fixed and then backported. 
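A minimal sketch of the wrapper behaviour under discussion, assuming a C99-conforming vsnprintf underneath. The names `bounded_vsnprintf` and `bounded_snprintf` are hypothetical; this is not the code of HotSpot's jio_vsnprintf. It always NUL-terminates and reports truncation as -1, which is why a caller that adds the result to an index must check for -1 first.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-in for the wrapper being reviewed (not HotSpot code).
// Always NUL-terminates; returns -1 if the output was truncated or an error
// occurred, otherwise the number of characters written (excluding the NUL).
int bounded_vsnprintf(char* buf, size_t len, const char* fmt, va_list args) {
  assert(buf != nullptr && len > 0);
  int result = vsnprintf(buf, len, fmt, args);
  // C99 reports the untruncated length on overflow; per the thread, older
  // Microsoft runtimes return -1 and may leave the buffer unterminated.
  if (result < 0 || static_cast<size_t>(result) >= len) {
    buf[len - 1] = '\0';
    return -1;
  }
  return result;
}

int bounded_snprintf(char* buf, size_t len, const char* fmt, ...) {
  va_list args;
  va_start(args, fmt);
  int rc = bounded_vsnprintf(buf, len, fmt, args);
  va_end(args);
  return rc;
}
```

With this contract, `index += bounded_snprintf(...)` on a `size_t` index wraps around on truncation, so each such call site needs an explicit `rc == -1` check before the addition.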
>> >> So yes this code could also be more robust, but it isn't essential for >> fixing the issue at hand that needed fixing - which was the missing >> nul-terminator when a too-short buffer is passed in. > So why not just fix the null termination, but not change the return > value to -1. In fact the correct thing to do here would be to change the Yes, I think that would be a minimal fix and avoid the ripple effects that any other change would make. > return value to be the actual number of bytes written. That should fix Yes that may have been a better choice - especially for 9 going forward (as of VS2015 Windows vsnprintf behaves the same as POSIX (C99 standard conforming)). But that choice wasn't made in 9 and we really shouldn't have 8 and 9 differ unnecessarily. Plus it would still require checking all uses. Cheers, David > existing code like JDK_Version::to_string() that does not expect to see > the -1. > > cheers, > > Chris >> >> Cheers, >> David >> >>> cheers, >>> >>> Chris >>> >>> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >>>> Hi, >>>> >>>> Could I get one more review for this safe change. >>>> >>>> Regards, >>>> Shafi >>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Thursday, August 11, 2016 9:52 AM >>>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: [8u] RFR for JDK-8162419: >>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>> 8155968 >>>>> >>>>> Hi Shafi, >>>>> >>>>> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >>>>>> Hi, >>>>>> >>>>>> Please review the code change for "JDK-8162419: >>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>> 8155968" to jdk8u. >>>>>> Please note this is partial backport of >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>>>> Summary: >>>>>> Microsoft version of vsnprintf() behaves differently from the >>> standard C >>>>> version when there is not enough space in the buffer. 
>>>>>> Microsoft version doesn't null terminates its output under error >>> conditions, >>>>> whereas the standard C version does. On Windows, it returns -1. >>>>>> We handle both cases here and always return -1, and perform null >>>>> termination. >>>>> >>>>> This looks fine to me. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>>>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>>>> >>>>>> Testing: jprt >>>>>> >>>>>> Regards, >>>>>> Shafi >>>>>> >>> >>> > From chris.plummer at oracle.com Fri Aug 12 07:09:08 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 12 Aug 2016 00:09:08 -0700 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <4faa43b6-bf3e-bd97-b960-ae18ebaf4ded@oracle.com> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <96958618-78e9-5ab4-102a-da26e7fd372c@oracle.com> <6dc528b1-e60e-02a3-2168-6bcc3ae7074d@oracle.com> <4faa43b6-bf3e-bd97-b960-ae18ebaf4ded@oracle.com> Message-ID: On 8/11/16 11:07 PM, David Holmes wrote: > On 12/08/2016 1:05 PM, Chris Plummer wrote: >> On 8/11/16 5:42 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 12/08/2016 9:20 AM, Chris Plummer wrote: >>>> Hi Shafi, >>>> >>>> Please update the copyright date to 2016 and change "numbers of" to >>>> "number of". >>>> >>>> I'm not so sure I agree with the comments in the CR that you can just >>>> backport this change to vsnprintf(), but not the other changes in the >>>> relevant changeset. 
For example: >>>> >>>> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >>>> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >>>> @@ -705,25 +705,35 @@ >>>> } >>>> >>>> void JDK_Version::to_string(char* buffer, size_t buflen) const { >>>> + assert(buffer && buflen > 0, "call with useful buffer"); >>>> size_t index = 0; >>>> + >>>> if (!is_valid()) { >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >>>> } else if (is_partially_initialized()) { >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >>>> } else { >>>> - index += jio_snprintf( >>>> + int rc = jio_snprintf( >>>> &buffer[index], buflen - index, "%d.%d", _major, _minor); >>>> + if (rc == -1) return; >>>> + index += rc; >>>> if (_micro > 0) { >>>> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", >>>> _micro); >>>> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", >>>> _micro); >>>> } >>>> >>>> I think your change to vsnprintf() will break JDK_Version::to_string() >>>> if the above diff if not applied. You could argue that the above >>>> code is >>>> already broken because -1 is could be returned to it on Windows. >>>> However, your changes expand that risk to all platforms. >>> >>> Unless I am missing something that risk only exists if there are >>> absurd values in the version components. That would only affect >>> someone building with their own version values defined. >> Hi David, >> >> That was just the first example I found of an affected jio_snprintf(). >> I'm guessing there are others. > > AFAICS there are no other modified uses in: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > > If this change could have affected other uses of jio_snprintf then I > would have expected them to be covered in that patch. If it didn't do > the right thing and that is a bug in the 9 code then it needs to be > flagged in 9, fixed and then backported. 
If that's the case then maybe that part of the changeset should be backported. > >>> >>> So yes this code could also be more robust, but it isn't essential for >>> fixing the issue at hand that needed fixing - which was the missing >>> nul-terminator when a too-short buffer is passed in. >> So why not just fix the null termination, but not change the return >> value to -1. In fact the correct thing to do here would be to change the > > Yes, I think that would be a minimal fix and avoid the ripple effects > that any other change would make. > >> return value to be the actual number of bytes written. That should fix > > Yes that may have been a better choice - especially for 9 going > forward (as of VS2015 Windows vsnprintf behaves the same as POSIX (C99 > standard conforming)). But that choice wasn't made in 9 and we really > shouldn't have 8 and 9 differ unnecessarily. Plus it would still > require checking all uses. Ok. I see that there are lots of places that already check for "ret < 0 || buflen < ret", so it looks like the Windows difference is being special cased in shared code. cheers, Chris > > Cheers, > David > >> existing code like JDK_Version::to_string() that does not expect to see >> the -1. >> >> cheers, >> >> Chris >>> >>> Cheers, >>> David >>> >>>> cheers, >>>> >>>> Chris >>>> >>>> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >>>>> Hi, >>>>> >>>>> Could I get one more review for this safe change. 
>>>>> >>>>> Regards, >>>>> Shafi >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Thursday, August 11, 2016 9:52 AM >>>>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: [8u] RFR for JDK-8162419: >>>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>>> 8155968 >>>>>> >>>>>> Hi Shafi, >>>>>> >>>>>> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review the code change for "JDK-8162419: >>>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>>> 8155968" to jdk8u. >>>>>>> Please note this is partial backport of >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>>>>> Summary: >>>>>>> Microsoft version of vsnprintf() behaves differently from the >>>> standard C >>>>>> version when there is not enough space in the buffer. >>>>>>> Microsoft version doesn't null terminates its output under error >>>> conditions, >>>>>> whereas the standard C version does. On Windows, it returns -1. >>>>>>> We handle both cases here and always return -1, and perform null >>>>>> termination. >>>>>> >>>>>> This looks fine to me. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>>>>> Webrev link: >>>>>>> http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>>>>> >>>>>>> Testing: jprt >>>>>>> >>>>>>> Regards, >>>>>>> Shafi >>>>>>> >>>> >>>> >> From chris.plummer at oracle.com Fri Aug 12 07:12:45 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 12 Aug 2016 00:12:45 -0700 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> Message-ID: <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com> On 8/11/16 10:21 PM, Shafi Ahmad wrote: > Hi Chris, > > Thanks for reviewing. > >> -----Original Message----- >> From: Chris Plummer >> Sent: Friday, August 12, 2016 4:50 AM >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes >> Subject: Re: [8u] RFR for JDK-8162419: >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >> 8155968 >> >> Hi Shafi, >> >> Please update the copyright date to 2016 and change "numbers of" to >> "number of". >> >> I'm not so sure I agree with the comments in the CR that you can just >> backport this change to vsnprintf(), but not the other changes in the relevant >> changeset. 
For example: >> >> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >> @@ -705,25 +705,35 @@ >> } >> >> void JDK_Version::to_string(char* buffer, size_t buflen) const { >> + assert(buffer && buflen > 0, "call with useful buffer"); >> size_t index = 0; >> + >> if (!is_valid()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >> } else if (is_partially_initialized()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >> } else { >> - index += jio_snprintf( >> + int rc = jio_snprintf( >> &buffer[index], buflen - index, "%d.%d", _major, _minor); >> + if (rc == -1) return; >> + index += rc; >> if (_micro > 0) { >> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >> } >> >> I think your change to vsnprintf() will break JDK_Version::to_string() if the >> above diff is not applied. You could argue that the above code is already >> broken because -1 could be returned to it on Windows. >> However, your changes expand that risk to all platforms. > I agree with you. I think I have to revisit at least all references to jio_snprintf where we use the return value of this method.
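To make the concern about index arithmetic concrete, here is a small sketch of the "check rc before adding it to index" pattern from the diff above. The helper names append_fmt and format_version are hypothetical and the code only mimics the shape of JDK_Version::to_string(); it is not the HotSpot implementation. The helper returns false on truncation instead of -1, which is a simplification chosen for the example.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats into buffer[*index..] and advances *index
// only when the whole text fit. Returns false on truncation or error.
static bool append_fmt(char* buffer, size_t buflen, size_t* index,
                       const char* fmt, ...) {
  assert(buffer != nullptr && *index < buflen);  // "call with useful buffer"
  size_t remaining = buflen - *index;
  va_list args;
  va_start(args, fmt);
  int rc = vsnprintf(buffer + *index, remaining, fmt, args);
  va_end(args);
  if (rc < 0 || static_cast<size_t>(rc) >= remaining) {
    buffer[buflen - 1] = '\0';  // keep the buffer terminated on failure
    return false;
  }
  *index += static_cast<size_t>(rc);
  return true;
}

// Builds "major.minor[.micro]"; stops at the first component that fails.
bool format_version(char* buffer, size_t buflen, int major, int minor, int micro) {
  size_t index = 0;
  if (!append_fmt(buffer, buflen, &index, "%d.%d", major, minor)) return false;
  if (micro > 0 && !append_fmt(buffer, buflen, &index, ".%d", micro)) return false;
  return true;
}
```

Because the index is advanced in exactly one place, a failed or truncated write can never move it backwards, and each successful component starts where the previous one ended.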
> > shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name "*.cpp" -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "replay_pid%p_compid%d.log", os::current_process_id(), compile_id); > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "inline_pid%p_compid%d.log", os::current_process_id(), compile_id); > ./src/share/vm/runtime/os.cpp: const int printed = jio_snprintf(buffer, buffer_length, iso8601_format, > ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, buf_sz, "%d", os::current_process_id()); > ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the buffer is not long enough to hold > ./src/share/vm/runtime/java.cpp: index += jio_snprintf( > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "_%02d", _update); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "%c", _special); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "-b%02d", _build); > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "#%d", trap_state); > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "%s%s", > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s'", > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s' index='%d'", > ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, len, "%s", (c != NULL) ? 
c : ""); > ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, buflen, "%s: %s%s.%s%s", > ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { > ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, message_len, "Could not link verifier"); > ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, > ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" CURRENTAPPX, > ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, > ./src/share/vm/utilities/vmError.cpp: int fsep_len = jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); > ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, buflen, "%s%s", tmpdir, os::file_separator()); > ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, bufferSize, "/cores"); Hi Shafi, As David pointed out, it looks like only java.cpp needs to be updated to account for changes you are making jio_snprintf. The others either don't use the result (even if it is assigned to a local) or already have special handling for -1. The exception is the os_bsd.cpp case. I noticed it looks buggy, both in JDK9 and JDK8u. cheers, Chris > > I will resend the updated webrev. 
> > Jdk9:src/share/vm/runtime/java.cpp > 714 int rc = jio_snprintf( > 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); > 716 if (rc == -1) return; > 717 index += rc; > 718 if (_security > 0) { > 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); > 720 } > 721 if (_patch > 0) { > 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); > 723 if (rc == -1) return; > 724 index += rc; > 725 } > > After line# 719 we are not updating the index variable and hence if _security > 0 and _patch > 0 then in that case value of _security is getting overwritten by value of _patch in the buffer. > Is this a bug or we are ignoring _security field, in that case this is redundant code? Please note _security field is not there in jdk8 code. > > Regards, > Shafi > > > >> cheers, >> >> Chris >> >> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >> > Hi, >> > >> > Could I get one more review for this safe change. >> > >> > Regards, >> > Shafi >> > >> >> -----Original Message----- >> >> From: David Holmes >> >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; hotspot- >> runtime-dev at openjdk.java.net >> >> Subject: Re: [8u] RFR for JDK-8162419: >> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >> >> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi Ahmad wrote: >> >>> Hi, >> >>> >> >>> Please review the code change for "JDK-8162419: >> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >> >> 8155968" to jdk8u. >> >>> Please note this is partial backport of >> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >> >>> Summary: >> >>> Microsoft version of vsnprintf() behaves differently from the standard C >>>> version when there is not enough space in the buffer. >> >>> Microsoft version doesn't null terminates its output under error >> conditions, >> whereas the standard C version does. On Windows, it returns >> -1. 
>> >>> We handle both cases here and always return -1, and perform null >> >> termination. >> >> >> >> This looks fine to me. >> >> >> >> Thanks, >> >> David >> >> >> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >> >>> Webrev link: >> http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >> >>> >> >>> Testing: jprt >> >>> >> >>> Regards, >> >>> Shafi >> >>> >> >> From dmitry.samersoff at oracle.com Fri Aug 12 09:04:42 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 12 Aug 2016 12:04:42 +0300 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> Message-ID: David, Updated webrev is: http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.04/ Windows is absolutely different story that requires significant efforts to reproduce error conditions and test changes. Also it has nothing to do with ARMv7. So I would prefer to address windows issues separately either as a part of JDK-8159799 or as a separate CR. -Dmitry On 2016-08-12 03:24, David Holmes wrote: > Hi Dmitry, > > On 12/08/2016 2:55 AM, Dmitry Samersoff wrote: >> David, >> >> Please see updated webrev. >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ >> >> I didn't touch windows version because it quite different from *NIX one. > > Do we ever see failures on Windows? Is so we should add diagnostics > there too even if they are different to *NIX. > > I would still like to see what file it is working with. 
We need some > logging in here: > > bool AttachListener::is_init_trigger() { > if (init_at_startup() || is_initialized()) { > return false; // initialized at startup or already > initialized > } > char fn[PATH_MAX+1]; > sprintf(fn, ".attach_pid%d", os::current_process_id()); > int ret; > struct stat64 st; > RESTARTABLE(::stat64(fn, &st), ret); > if (ret == -1) { > + log ("failed to find attach file: %s, trying alternate", fn) > snprintf(fn, sizeof(fn), "%s/.attach_pid%d", > os::get_temp_directory(), os::current_process_id()); > RESTARTABLE(::stat64(fn, &st), ret); > + if (ret == -1) { > + log("failed to find attach file: %s", fn); > + } > } > > All failure paths need to show us what it was that failed. > > typos: trigerred -> triggered > > Thanks, > David > >> -Dmitry >> >> On 2016-08-08 02:40, David Holmes wrote: >>> Hi Dmitry, >>> >>> On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >>>> Everybody, >>>> >>>> Please review the fix: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >>>> >>>> Problem: >>>> Tests fail intermittently because it can't attach to child process, >>>> these attach failures is hard to debug because attach framework >>>> doesn't provide enough diagnostic information. >>>> >>>> Solution: >>>> >>>> a) Increase attach timeout >>>> b) Slightly change attach loop to save a bit of CPU power. >>>> c) Add some logging to attach listener. >>>> >>>> It's just a first step in this direction. Complete cleanup of attach >>>> code (remove LinuxThreads support and convert all printing to UL) is >>>> not >>>> a goal of this fix - I'll file a separate CR for it. >>> >>> I still think you need more logging now to aid in debugging these cases. >>> In particular we want to be able to verify that the path of the attach >>> file is what we expect in all cases ie whether we find the .attach_pid >>> file in cwd or whether we are looking in temp directory, and whether we >>> ultimately succeed or fail. 
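As an illustration of the request that every failure path should say which file was tried, here is a minimal sketch of probing a list of candidate trigger files and logging each miss. The function find_trigger_file and its parameters are invented for this example; it uses plain POSIX stat() and fprintf rather than the HotSpot RESTARTABLE macro or unified logging, so treat it as a sketch of the idea only.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>
#include <sys/stat.h>

// Hypothetical helper (not HotSpot code): tries each candidate path in
// order, logs every failure with the path and errno text, and reports
// the first path that exists.
bool find_trigger_file(const std::vector<std::string>& candidates,
                       std::string* found, FILE* log) {
  assert(found != nullptr && log != nullptr);
  for (const std::string& path : candidates) {
    struct stat st;
    if (::stat(path.c_str(), &st) == 0) {
      fprintf(log, "found attach file: %s\n", path.c_str());
      *found = path;
      return true;
    }
    fprintf(log, "failed to find attach file: %s (%s)\n",
            path.c_str(), strerror(errno));
  }
  return false;
}
```

A caller would pass the cwd-relative name first and the temp-directory name second, so the log shows both the path that was tried and which location finally matched.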
>>> >>> Plus whatever you do now should be done consistently for all platforms. >>> >>> Thanks, >>> David >>> >>>> -Dmitry >>>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From frederic.parain at oracle.com Fri Aug 12 13:07:31 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 12 Aug 2016 09:07:31 -0400 Subject: RFR(S): JDK-8146697 : VM crashes in test Test7005594 In-Reply-To: References: Message-ID: Thank you Coleen. Fred On 08/11/2016 04:39 PM, Coleen Phillimore wrote: > > Yes, I think this fix is good for robustness, given the difficulty and > time needed to find the root cause. > Thanks, > Coleen > > On 8/8/16 10:55 AM, Frederic Parain wrote: >> Greetings, >> >> Please review this small fix for JDK-8146697 >> https://bugs.openjdk.java.net/browse/JDK-8146697 >> >> Summary: The JVM sometimes tries to re-enable the Reserved Stack Area >> while it is currently not disabled, leading to the following assertion >> failure: >> >> share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state != >> stack_guard_enabled) failed: already enabled >> >> This problem occurred while running different tests including tests >> where stack overflows are unlikely. It is rare and very hard to >> reproduce. At the beginning of the investigation, I've been able to >> reproduce it three times out of 1,000+ runs of metaspace stress test >> (the fact that was is a metaspace test doesn't matter). But once I've >> instrumented the JVM, the bug didn't show up again, even after 30,000+ >> runs. >> >> So, I've investigated it with the limited material I had. The failures >> always occurred on x86/32bits platforms. 
>> Regarding that some failures occurred on tests where stack overflows are >> unlikely (no recursive calls, small call stack), and that all failures >> occurred in interpreted Java code, my guess is that the issue is in the >> test performed on interpreted method exit to determine if the Reserved >> Stack Area should be enabled or not. >> >> The test on method exit compares the SP of the caller frame to an >> activation SP address stored in the JavaThread object when the Reserved >> Stack Area has been disabled. Without a reproducible test case, I've not >> been able to find what was the issue between the two values (de-opt, >> OSR, other?). So, I've slightly changed the test to make it more robust >> against the situation causing the assertion failure. Now the test checks >> the status of the guard pages, and if no guard pages have been disabled, >> the method exits normally. This means there's always only one test on >> interpreted method exit if Reserved Stack Area has not been used, so no >> difference on performances for most cases. If this first test detects >> that guard pages have been disabled, then the previous test (caller SP >> vs activation SP) is performed, to determine if this is the place where >> the Reserved Stack Area should be re-enabled or not. >> >> Even if the root cause of the bug is still unknown, the fix should make >> the code more robust and prevent unnecessary re-enabling of the Reserved >> Stack Area. >> >> Webrev: >> http://cr.openjdk.java.net/~fparain/8146697/webrev.00/ >> >> Thank you, >> >> Fred > From frederic.parain at oracle.com Fri Aug 12 13:07:57 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 12 Aug 2016 09:07:57 -0400 Subject: RFR(S): JDK-8146697 : VM crashes in test Test7005594 In-Reply-To: References: Message-ID: <65226f5a-0611-8fb6-c25f-0135cc51bd62@oracle.com> Thank you David Fred On 08/11/2016 07:14 PM, David Holmes wrote: > Hi Fred, > > Thanks for explaining things to me offline. 
This change seems fine in > making the code more robust. > > Thanks, > David > > On 9/08/2016 12:55 AM, Frederic Parain wrote: >> Greetings, >> >> Please review this small fix for JDK-8146697 >> https://bugs.openjdk.java.net/browse/JDK-8146697 >> >> Summary: The JVM sometimes tries to re-enable the Reserved Stack Area >> while it is currently not disabled, leading to the following assertion >> failure: >> >> share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state != >> stack_guard_enabled) failed: already enabled >> >> This problem occurred while running different tests including tests >> where stack overflows are unlikely. It is rare and very hard to >> reproduce. At the beginning of the investigation, I've been able to >> reproduce it three times out of 1,000+ runs of metaspace stress test >> (the fact that was is a metaspace test doesn't matter). But once I've >> instrumented the JVM, the bug didn't show up again, even after 30,000+ >> runs. >> >> So, I've investigated it with the limited material I had. The failures >> always occurred on x86/32bits platforms. >> Regarding that some failures occurred on tests where stack overflows are >> unlikely (no recursive calls, small call stack), and that all failures >> occurred in interpreted Java code, my guess is that the issue is in the >> test performed on interpreted method exit to determine if the Reserved >> Stack Area should be enabled or not. >> >> The test on method exit compares the SP of the caller frame to an >> activation SP address stored in the JavaThread object when the Reserved >> Stack Area has been disabled. Without a reproducible test case, I've not >> been able to find what was the issue between the two values (de-opt, >> OSR, other?). So, I've slightly changed the test to make it more robust >> against the situation causing the assertion failure. Now the test checks >> the status of the guard pages, and if no guard pages have been disabled, >> the method exits normally. 
This means there's always only one test on >> interpreted method exit if Reserved Stack Area has not been used, so no >> difference on performances for most cases. If this first test detects >> that guard pages have been disabled, then the previous test (caller SP >> vs activation SP) is performed, to determine if this is the place where >> the Reserved Stack Area should be re-enabled or not. >> >> Even if the root cause of the bug is still unknown, the fix should make >> the code more robust and prevent unnecessary re-enabling of the Reserved >> Stack Area. >> >> Webrev: >> http://cr.openjdk.java.net/~fparain/8146697/webrev.00/ >> >> Thank you, >> >> Fred From frederic.parain at oracle.com Fri Aug 12 13:46:58 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 12 Aug 2016 09:46:58 -0400 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: Message-ID: Dean, In file macroAssembler_x86.cpp, could it be possible to get rid of the clear_pc argument? It seems completely useless now. Fred On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: > Ping. > > dl > > > On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >> https://bugs.openjdk.java.net/browse/JDK-8161598 >> >> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >> >> Sorry, this issue is Confidential. The problem is similar to 8029441, >> where we suspend a thread and use pd_get_top_frame_for_profiling() to >> get the top frame for stack walking. The problem is "last Java frame" >> anchor frames on x86. In lots of places we do not store last_Java_pc. >> This is OK in the synchronous stack walk case done by the current >> thread. But in the asynchronous case, there are small windows where >> it's not always safe to get PC from sp[-1]. >> >> The solution is not to treat x86 anchor frames as "always walkable". 
>> Instead, we follow the example of sparc and make them walkable by >> filling in last_Java_pc when it's safe. >> >> I went for the minimal fix, resetting clear_pc to true in >> reset_last_Java_frame() but not changing the API and all the callers. >> I can fix this if reviewers feel strongly about it. >> >> dl >> > From lois.foltan at oracle.com Fri Aug 12 14:11:57 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Fri, 12 Aug 2016 10:11:57 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> Message-ID: <57ADD92D.40809@oracle.com> Hi Karen, Looks good. For my clarification, it doesn't matter what the version of the supers are? The transitive over-ride behavior is only governed by the version of the current class whose vtable is being constructed, correct? Thanks, Lois On 8/11/2016 5:07 PM, Karen Kinnear wrote: > Please review: > https://bugs.openjdk.java.net/browse/JDK-8163808 > > http://cr.openjdk.java.net/~acorn/8163808.hs/webrev > > Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. > Implementation needed to check this in three places, not just one. Vtable size calculation is only exact > for later classfile versions. > > Also fixed vtable logging output - since the method name-and-sig printing was changed to also print > the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
> > Testing: linux-x64-slowdebug > rbt hs-nightly-runtime.js > jck vm,lang, api.java.lang > small invocation tests > > thanks, > Karen From coleen.phillimore at oracle.com Fri Aug 12 14:47:51 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 12 Aug 2016 10:47:51 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> Message-ID: <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html void vtableEntry::verify(klassVtable* vt, outputStream* st) { NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); + KlassHandle vtklass_h = vt->klass(); + Klass* vtklass = vtklass_h(); + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { assert(method() != NULL, "must have set method"); + } I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) InstanceKlass::cast makes this assertion. Otherwise, the code looks good. Coleen On 8/11/16 5:07 PM, Karen Kinnear wrote: > Please review: > https://bugs.openjdk.java.net/browse/JDK-8163808 > > http://cr.openjdk.java.net/~acorn/8163808.hs/webrev > > Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. > Implementation needed to check this in three places, not just one. Vtable size calculation is only exact > for later classfile versions. > > Also fixed vtable logging output - since the method name-and-sig printing was changed to also print > the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
> > Testing: linux-x64-slowdebug > rbt hs-nightly-runtime.js > jck vm,lang, api.java.lang > small invocation tests > > thanks, > Karen From karen.kinnear at oracle.com Fri Aug 12 15:47:25 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 12 Aug 2016 11:47:25 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <57ADD92D.40809@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <57ADD92D.40809@oracle.com> Message-ID: <08D7E46A-AB71-4A23-8AA8-E70E13D0A996@oracle.com> Lois, Thank you for the review and the question. The new overriding rules were fixed for JDK-4766230 in JDK7 based on discussions between Alex Buckley, David Holmes, Vladimir Ivanov and myself. Today we would have modified the specification to clarify how the fix applies to the different classfile versions - at the time we left the old classfile versions obeying the previous version of the JVMS. And yes, that was explicitly the overriding rule (JVMS 5.4.5 Overriding) - i.e. does my class method override an inherited class method - and so it depends on the classfile version of the current class which we are processing and for whom we are calculating the overriding rules. The classfile-version specific logic for this was added in 2009. This fix does not change the logic for inheritance and overriding at all. What it does change is the vtable pre-calculation of expected size for older classfiles to match the usage logic, so simply an internal inconsistency fix. thanks, Karen > On Aug 12, 2016, at 10:11 AM, Lois Foltan wrote: > > Hi Karen, > > Looks good. For my clarification, it doesn't matter what the version of the supers are? The transitive over-ride behavior is only governed by the version of the current class whose vtable is being constructed, correct? 
> > Thanks, > Lois > > On 8/11/2016 5:07 PM, Karen Kinnear wrote: >> Please review: >> https://bugs.openjdk.java.net/browse/JDK-8163808 >> >> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >> >> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >> for later classfile versions. >> >> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >> the holder's class name, we do not need to print the holder's class name separately - it was printing twice. >> >> Testing: linux-x64-slowdebug >> rbt hs-nightly-runtime.js >> jck vm,lang, api.java.lang >> small invocation tests >> >> thanks, >> Karen > From lois.foltan at oracle.com Fri Aug 12 15:56:13 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Fri, 12 Aug 2016 11:56:13 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <08D7E46A-AB71-4A23-8AA8-E70E13D0A996@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <57ADD92D.40809@oracle.com> <08D7E46A-AB71-4A23-8AA8-E70E13D0A996@oracle.com> Message-ID: <57ADF19D.2010103@oracle.com> On 8/12/2016 11:47 AM, Karen Kinnear wrote: > Lois, > > Thank you for the review and the question. > > The new overriding rules were fixed for JDK-4766230 in JDK7 based on discussions between > Alex Buckley, David Holmes, Vladimir Ivanov and myself. Today we would have modified the specification > to clarify how the fix applies to the different classfile versions - at the time we left the old classfile versions > obeying the previous version of the JVMS. And yes, that was explicitly the overriding rule (JVMS 5.4.5 Overriding) - i.e.
does my > class method override an inherited class method - and so it depends on the classfile version of the current > class which we are processing and for whom we are calculating the overriding rules. > > The classfile-version specific logic for this was added in 2009. This fix does not change the logic for inheritance > and overriding at all. What it does change is the vtable pre-calculation of expected size for older classfiles to > match the usage logic, so simply an internal inconsistency fix. Got it, thanks for the explanation! Lois > > thanks, > Karen > >> On Aug 12, 2016, at 10:11 AM, Lois Foltan wrote: >> >> Hi Karen, >> >> Looks good. For my clarification, it doesn't matter what the version of the supers are? The transitive over-ride behavior is only governed by the version of the current class whose vtable is being constructed, correct? >> >> Thanks, >> Lois >> >> On 8/11/2016 5:07 PM, Karen Kinnear wrote: >>> Please review: >>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>> >>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>> >>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>> for later classfile versions. >>> >>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
>>> >>> Testing: linux-x64-slowdebug >>> rbt hs-nightly-runtime.js >>> jck vm,lang, api.java.lang >>> small invocation tests >>> >>> thanks, >>> Karen From karen.kinnear at oracle.com Fri Aug 12 17:33:14 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 12 Aug 2016 13:33:14 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> Message-ID: <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Coleen, Good catch - I will make that change. Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture and preparing for potential other uses. Here is the updated lines: KlassHandle vtklass_h = vt->klass(); Klass* vtklass = vtklass_h(); if (vtklass->is_instance_klass() && (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { assert(method() != NULL, "must have set method"); } Thanks! Karen > On Aug 12, 2016, at 10:47 AM, Coleen Phillimore wrote: > > > http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html > > void vtableEntry::verify(klassVtable* vt, outputStream* st) { > NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); > + KlassHandle vtklass_h = vt->klass(); > + Klass* vtklass = vtklass_h(); > + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { > assert(method() != NULL, "must have set method"); > + } > > I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: > > if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) > > InstanceKlass::cast makes this assertion. Otherwise, the code looks good. 
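The point about guarding InstanceKlass::cast can be shown with a toy model. The classes below are simplified stand-ins with the same names as the HotSpot types, written only for this example, and the constant 51 is taken from the bug description ("before class file version 51"). This is a sketch of the guard-before-cast pattern under those assumptions, not the real klassVtable code.

```cpp
#include <cassert>

// Toy stand-ins for the HotSpot class hierarchy (illustration only).
struct Klass {
  virtual ~Klass() {}
  virtual bool is_instance_klass() const { return false; }
};

struct ArrayKlass : Klass {};

struct InstanceKlass : Klass {
  int _major;
  explicit InstanceKlass(int major) : _major(major) {}
  bool is_instance_klass() const override { return true; }
  int major_version() const { return _major; }
  // Like the real cast, this asserts that the argument is an instance klass.
  static const InstanceKlass* cast(const Klass* k) {
    assert(k->is_instance_klass() && "cast to InstanceKlass");
    return static_cast<const InstanceKlass*>(k);
  }
};

// Assumed value, based on "class file version 51" in the bug summary.
const int kTransitiveOverrideVersion = 51;

// True only when the stricter "method must be set" check applies.
// The is_instance_klass() test short-circuits, so cast() is never
// reached for an array klass.
bool requires_method_set(const Klass* k) {
  return k->is_instance_klass() &&
         InstanceKlass::cast(k)->major_version() >= kTransitiveOverrideVersion;
}
```

The short-circuit order matters: swapping the two operands would trip the assertion inside cast() for an ArrayKlass in a debug build.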
> > Coleen > > On 8/11/16 5:07 PM, Karen Kinnear wrote: >> Please review: >> https://bugs.openjdk.java.net/browse/JDK-8163808 >> >> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >> >> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >> for later classfile versions. >> >> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >> the holder's class name, we do not need to print the holder's class name separately - it was printing twice. >> >> Testing: linux-x64-slowdebug >> rbt hs-nightly-runtime.js >> jck vm,lang, api.java.lang >> small invocation tests >> >> thanks, >> Karen > From rachel.protacio at oracle.com Fri Aug 12 17:35:53 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 12 Aug 2016 13:35:53 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: <81e58610-e8e6-e4ce-31bb-27ff20134b87@oracle.com> Looks good to me! Rachel On 8/12/2016 1:33 PM, Karen Kinnear wrote: > Coleen, > > Good catch - I will make that change. > Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture > and preparing for potential other uses. > > > Here is the updated lines: > KlassHandle vtklass_h = vt->klass(); > Klass* vtklass = vtklass_h(); > if (vtklass->is_instance_klass() && > (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { > assert(method() != NULL, "must have set method"); > } > > Thanks!
> Karen > >> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore wrote: >> >> >> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >> >> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >> + KlassHandle vtklass_h = vt->klass(); >> + Klass* vtklass = vtklass_h(); >> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >> assert(method() != NULL, "must have set method"); >> + } >> >> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >> >> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >> >> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. >> >> Coleen >> >> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>> Please review: >>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>> >>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>> >>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>> for later classfile versions. >>> >>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
>>> >>> Testing: linux-x64-slowdebug >>> rbt hs-nightly-runtime.js >>> jck vm,lang, api.java.lang >>> small invocation tests >>> >>> thanks, >>> Karen From coleen.phillimore at oracle.com Fri Aug 12 17:46:51 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 12 Aug 2016 13:46:51 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: On 8/12/16 1:33 PM, Karen Kinnear wrote: > Coleen, > > Good catch - I will make that change. > Today this code is not called for arrays, but I totally appreciate you > looking at the bigger picture > and preparing for potential other uses. > > > Here is the updated lines: > KlassHandle vtklass_h = vt->klass(); > Klass* vtklass = vtklass_h(); > if (vtklass->is_instance_klass() && > (InstanceKlass::cast(vtklass)->major_version() >= > klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { > assert(method() != NULL, "must have set method"); > } > This looks good. Thanks, Coleen > Thanks! > Karen > >> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore >> > >> wrote: >> >> >> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >> >> >> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >> + KlassHandle vtklass_h = vt->klass(); >> + Klass* vtklass = vtklass_h(); >> + if (InstanceKlass::cast(vtklass)->major_version() >= >> klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >> assert(method() != NULL, "must have set method"); >> + } >> >> I might be wrong but the vtable->klass() can be an ArrayKlass, so I >> think you have to do: >> >> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >> >> InstanceKlass::cast makes this assertion. 
Otherwise, the code looks good.
>>
>> Coleen
>>
>> On 8/11/16 5:07 PM, Karen Kinnear wrote:
>>> Please review:
>>> https://bugs.openjdk.java.net/browse/JDK-8163808
>>>
>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev
>>>
>>> Bug: For classfiles before class file version 51, JVMS did not
>>> support transitive over-ride behavior.
>>> Implementation needed to check this in three places, not just one.
>>> Vtable size calculation is only exact
>>> for later classfile versions.
>>>
>>> Also fixed vtable logging output - since the method name-and-sig
>>> printing was changed to also print
>>> the holder's class name, we do not need to print the holder's class
>>> name separately - it was printing twice.
>>>
>>> Testing: linux-x64-slowdebug
>>> rbt hs-nightly-runtime.js
>>> jck vm,lang, api.java.lang
>>> small invocation tests
>>>
>>> thanks,
>>> Karen
>>
>

From shafi.s.ahmad at oracle.com Fri Aug 12 18:18:13 2016
From: shafi.s.ahmad at oracle.com (Shafi Ahmad)
Date: Fri, 12 Aug 2016 18:18:13 +0000 (UTC)
Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
In-Reply-To: <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com>
References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com>
Message-ID:

Hi,

Please find updated webrev link.

http://cr.openjdk.java.net/~shshahma/8162419/webrev.01/

Regards,
Shafi

> -----Original Message-----
> From: Chris Plummer
> Sent: Friday, August 12, 2016 12:43 PM
> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes
> Subject: Re: [8u] RFR for JDK-8162419:
> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
>
> On 8/11/16 10:21 PM, Shafi Ahmad wrote:
> > Hi Chris,
> >
> > Thanks for reviewing.
> > > >> -----Original Message----- > >> From: Chris Plummer > >> Sent: Friday, August 12, 2016 4:50 AM > >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes > >> Subject: Re: [8u] RFR for JDK-8162419: > >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > >> 8155968 > >> > >> Hi Shafi, > >> > >> Please update the copyright date to 2016 and change "numbers of" to > >> "number of". > >> > >> I'm not so sure I agree with the comments in the CR that you can just > >> backport this change to vsnprintf(), but not the other changes in the > >> relevant changeset. For example: > >> > >> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 > >> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 > >> @@ -705,25 +705,35 @@ > >> } > >> > >> void JDK_Version::to_string(char* buffer, size_t buflen) const { > >> + assert(buffer && buflen > 0, "call with useful buffer"); > >> size_t index = 0; > >> + > >> if (!is_valid()) { > >> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); > >> } else if (is_partially_initialized()) { > >> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); > >> } else { > >> - index += jio_snprintf( > >> + int rc = jio_snprintf( > >> &buffer[index], buflen - index, "%d.%d", _major, _minor); > >> + if (rc == -1) return; > >> + index += rc; > >> if (_micro > 0) { > >> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > >> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", > >> + _micro); > >> } > >> > >> I think your change to vsnprintf() will break > >> JDK_Version::to_string() if the above diff if not applied. You could > >> argue that the above code is already broken because -1 is could be > returned to it on Windows. > >> However, your changes expand that risk to all platforms. > > I am agree with you. I think I have to revisit at least all reference of > jio_snprintf for which we are using return value of this method. 
> > > > shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name "*.cpp" > > -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close > > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, > > "replay_pid%p_compid%d.log", os::current_process_id(), compile_id); > > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, > > "inline_pid%p_compid%d.log", os::current_process_id(), compile_id); > > ./src/share/vm/runtime/os.cpp: const int printed = jio_snprintf(buffer, > buffer_length, iso8601_format, > > ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, > buf_sz, "%d", os::current_process_id()); > > ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the > buffer is not long enough to hold > > ./src/share/vm/runtime/java.cpp: index += jio_snprintf( > > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], > buflen - index, ".%d", _micro); > > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], > buflen - index, "_%02d", _update); > > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], > buflen - index, "%c", _special); > > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], > buflen - index, "-b%02d", _build); > > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > buflen, "#%d", trap_state); > > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > buflen, "%s%s", > > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > buflen, "reason='%s' action='%s'", > > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > buflen, "reason='%s' action='%s' index='%d'", > > ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, > > len, "%s", (c != NULL) ? 
c : ""); > > ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, > > buflen, "%s: %s%s.%s%s", > > ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, > sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { > > ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, message_len, > "Could not link verifier"); > > ./src/share/vm/utilities/ostream.cpp: int result = > jio_snprintf(current_file_name, JVM_MAXPATHLEN, > > ./src/share/vm/utilities/ostream.cpp: int result = > jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" > CURRENTAPPX, > > ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, > > ./src/share/vm/utilities/vmError.cpp: int fsep_len = > jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); > > ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, buflen, > "%s%s", tmpdir, os::file_separator()); > > ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, > "verify_ref_kind expected %x", ref_kind); > > ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, > "verify_ref_kind expected %x", ref_kind); > > ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, > "verify_ref_kind expected %x", ref_kind); > > ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, bufferSize, > > "/cores"); > Hi Shafi, > > As David pointed out, it looks like only java.cpp needs to be updated to > account for changes you are making jio_snprintf. The others either don't use > the result (even if it is assigned to a local) or already have special handling for > -1. The exception is the os_bsd.cpp case. I noticed it looks buggy, both in > JDK9 and JDK8u. > > cheers, > > Chris > > > > I will resend the updated webrev. 
> > > > Jdk9:src/share/vm/runtime/java.cpp > > 714 int rc = jio_snprintf( > > 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); > > 716 if (rc == -1) return; > > 717 index += rc; > > 718 if (_security > 0) { > > 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); > > 720 } > > 721 if (_patch > 0) { > > 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); > > 723 if (rc == -1) return; > > 724 index += rc; > > 725 } > > > > After line# 719 we are not updating the index variable and hence if > _security > 0 and _patch > 0 then in that case value of _security is getting > overwritten by value of _patch in the buffer. > > Is this a bug or we are ignoring _security field, in that case this is redundant > code? Please note _security field is not there in jdk8 code. > > > > Regards, > > Shafi > > > > > > > >> cheers, > >> > >> Chris > >> > >> On 8/11/16 5:14 AM, Shafi Ahmad wrote: > >> > Hi, > >> > > >> > Could I get one more review for this safe change. > >> > > >> > Regards, > >> > Shafi > >> > > >> >> -----Original Message----- > >> >> From: David Holmes > >> >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; > >> hotspot- runtime-dev at openjdk.java.net > >> >> Subject: Re: [8u] RFR for JDK-8162419: > >> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after > >> JDK- >> > >> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi Ahmad > wrote: > >> >>> Hi, > >> >>> > >> >>> Please review the code change for "JDK-8162419: > >> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after > >> JDK- >> 8155968" to jdk8u. > >> >>> Please note this is partial backport of >> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > >> >>> Summary: > >> >>> Microsoft version of vsnprintf() behaves differently from the > >> standard C > >>>> version when there is not enough space in the buffer. 
> >> >>> Microsoft version doesn't null terminates its output under error conditions,
> >> >>> whereas the standard C version does. On Windows, it returns -1.
> >> >>> We handle both cases here and always return -1, and perform null
> >> >>> termination.
> >> >>
> >> >> This looks fine to me.
> >> >>
> >> >> Thanks,
> >> >> David
> >> >>
> >> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419
> >> >>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/
> >> >>>
> >> >>> Testing: jprt
> >> >>>
> >> >>> Regards,
> >> >>> Shafi
> >> >>>
> >>
> >>
>

From coleen.phillimore at oracle.com Fri Aug 12 18:47:21 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Fri, 12 Aug 2016 14:47:21 -0400
Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class
In-Reply-To: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com>
References: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com>
Message-ID: <366ebc2f-f5f1-c605-f253-830f97e16303@oracle.com>

http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/src/share/vm/classfile/classFileParser.cpp.udiff.html

+ const Klass* host_klass;
+ if (_host_klass->is_objArray_klass()) {
+   host_klass = ObjArrayKlass::cast(_host_klass)->element_klass();
+ } else {
+   host_klass = _host_klass;
+ }
+ assert(host_klass->is_instance_klass(), "host klass is not an instance class");

Can host_klass really be an array class or is this code trying to be defensive since host_klass is Klass* and not InstanceKlass? If it can be an objArray class, then I think you want bottom_klass() not element_klass(), but if it can also be a typeArrayKlass, or [[I objArrayKlass, then you don't have any sort of InstanceKlass.

It seems like the check for what sort of Klass host_klass can be should be further up the stack and the more specific type passed here. I don't see such a check though. This doesn't seem right.

Apart from this, everything else looks great.
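The difference between peeling one array dimension and peeling all of them, as raised in the paragraph above, can be shown with a toy model. This is only a sketch: the types below are hypothetical stand-ins written for this illustration and are not HotSpot's Klass hierarchy, and innermost_instance_klass is an invented helper.

```cpp
#include <cstddef>

// Toy stand-ins for illustration only; these are NOT HotSpot's Klass types.
struct Klass {
  virtual ~Klass() {}
  virtual bool is_instance_klass() const { return false; }
  virtual bool is_objArray_klass() const { return false; }
};

struct InstanceKlass : Klass {
  bool is_instance_klass() const { return true; }
};

// Models an array of primitives such as int[].
struct TypeArrayKlass : Klass {};

// Models an array whose elements are references (to instances or to arrays).
struct ObjArrayKlass : Klass {
  explicit ObjArrayKlass(const Klass* element) : _element(element) {}
  bool is_objArray_klass() const { return true; }
  // One level down only: for a two-dimensional array this is still an array.
  const Klass* element_klass() const { return _element; }
  const Klass* _element;
};

// Hypothetical helper: strips every array dimension and returns the innermost
// class if it is an instance class, or NULL otherwise (for example int[][]).
const InstanceKlass* innermost_instance_klass(const Klass* k) {
  while (k != NULL && k->is_objArray_klass()) {
    k = static_cast<const ObjArrayKlass*>(k)->element_klass();
  }
  if (k == NULL || !k->is_instance_klass()) {
    return NULL;
  }
  return static_cast<const InstanceKlass*>(k);
}
```

In this model a single element_klass() call on a two-dimensional array yields another array, and an array of primitive arrays never reaches an instance class at all, so a caller needs an explicit check rather than an unconditional cast.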
I even reviewed your test.

Thanks,
Coleen

On 8/3/16 8:15 AM, harold seigel wrote:
> Hi,
>
> Please review this fix for bug 8058575. The fix prevents a class
> created using Unsafe.defineAnonymousClass() from being in a different
> package than its host class. Being in different packages would create
> access problems if the packages were in different modules.
>
> With this fix, if the anonymous class is in a different package then
> the JVM will throw IllegalArgumentException. If the anonymous class
> is in the unnamed package then the JVM will move the anonymous class
> into its host class's package.
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575
>
> Open webrevs:
>
> http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/
>
> http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/
>
> The fix was tested with the JCK Lang and VM tests, the hotspot, and
> java/lang, java/util and other JTreg tests, the NSK quick tests, and
> with the RBT runtime nightly tests.
>
> Thanks, Harold
>

From chris.plummer at oracle.com Fri Aug 12 19:04:00 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 12 Aug 2016 12:04:00 -0700
Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
In-Reply-To: <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default>
References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default>
Message-ID:

On 8/11/16 10:21 PM, Shafi Ahmad wrote:
> Hi Chris,
>
> Thanks for reviewing.
>
>> -----Original Message-----
>> From: Chris Plummer
>> Sent: Friday, August 12, 2016 4:50 AM
>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes
>> Subject: Re: [8u] RFR for JDK-8162419:
>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
>>
>> Hi Shafi,
>>
>> Please update the copyright date to 2016 and change "numbers of" to
>> "number of".
>> >> I'm not so sure I agree with the comments in the CR that you can just >> backport this change to vsnprintf(), but not the other changes in the relevant >> changeset. For example: >> >> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >> @@ -705,25 +705,35 @@ >> } >> >> void JDK_Version::to_string(char* buffer, size_t buflen) const { >> + assert(buffer && buflen > 0, "call with useful buffer"); >> size_t index = 0; >> + >> if (!is_valid()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >> } else if (is_partially_initialized()) { >> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >> } else { >> - index += jio_snprintf( >> + int rc = jio_snprintf( >> &buffer[index], buflen - index, "%d.%d", _major, _minor); >> + if (rc == -1) return; >> + index += rc; >> if (_micro > 0) { >> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >> } >> >> I think your change to vsnprintf() will break JDK_Version::to_string() if the >> above diff if not applied. You could argue that the above code is already >> broken because -1 is could be returned to it on Windows. >> However, your changes expand that risk to all platforms. > I am agree with you. I think I have to revisit at least all reference of jio_snprintf for which we are using return value of this method. 
> > shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name "*.cpp" -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "replay_pid%p_compid%d.log", os::current_process_id(), compile_id); > ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, "inline_pid%p_compid%d.log", os::current_process_id(), compile_id); > ./src/share/vm/runtime/os.cpp: const int printed = jio_snprintf(buffer, buffer_length, iso8601_format, > ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, buf_sz, "%d", os::current_process_id()); > ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the buffer is not long enough to hold > ./src/share/vm/runtime/java.cpp: index += jio_snprintf( > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "_%02d", _update); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "%c", _special); > ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], buflen - index, "-b%02d", _build); > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "#%d", trap_state); > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "%s%s", > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s'", > ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, buflen, "reason='%s' action='%s' index='%d'", > ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, len, "%s", (c != NULL) ? 
c : ""); > ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, buflen, "%s: %s%s.%s%s", > ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { > ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, message_len, "Could not link verifier"); > ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, > ./src/share/vm/utilities/ostream.cpp: int result = jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" CURRENTAPPX, > ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, > ./src/share/vm/utilities/vmError.cpp: int fsep_len = jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); > ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, buflen, "%s%s", tmpdir, os::file_separator()); > ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); > ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, bufferSize, "/cores"); > > I will resend the updated webrev. > > Jdk9:src/share/vm/runtime/java.cpp > 714 int rc = jio_snprintf( > 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); > 716 if (rc == -1) return; > 717 index += rc; > 718 if (_security > 0) { > 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); > 720 } > 721 if (_patch > 0) { > 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); > 723 if (rc == -1) return; > 724 index += rc; > 725 } > > After line# 719 we are not updating the index variable and hence if _security > 0 and _patch > 0 then in that case value of _security is getting overwritten by value of _patch in the buffer. 
> Is this a bug or we are ignoring _security field, in that case this is redundant code? Please note _security field is not there in jdk8 code.

If that is a bug, it was introduced with the following changeset:

http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/110ec5963eb1

Note that _security used to be _micro. The only way this is not a bug is if _security != 0 implies that _patch == 0 and _build == 0. If that is the case, there should be an assert for it, and also an explicit return statement. I would suggest getting in touch with the author of the above changeset and the reviewers. You might also want to include the authors and reviewers of JDK-8085822, which updated the version string format (they would likely be able to tell you if the current state of the code is indeed buggy).

cheers,

Chris

>
> Regards,
> Shafi
>
>
>
>> cheers,
>>
>> Chris
>>
>> On 8/11/16 5:14 AM, Shafi Ahmad wrote:
>> > Hi,
>> >
>> > Could I get one more review for this safe change.
>> >
>> > Regards,
>> > Shafi
>> >
>> >> -----Original Message-----
>> >> From: David Holmes
>> >> Sent: Thursday, August 11, 2016 9:52 AM
>> >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net
>> >> Subject: Re: [8u] RFR for JDK-8162419:
>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
>> >>
>> >> Hi Shafi,
>> >>
>> >> On 10/08/2016 6:34 PM, Shafi Ahmad wrote:
>> >>> Hi,
>> >>>
>> >>> Please review the code change for "JDK-8162419:
>> >>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968" to jdk8u.
>> >>> Please note this is partial backport of
>> >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1
>> >>> Summary:
>> >>> Microsoft version of vsnprintf() behaves differently from the standard C
>> >>> version when there is not enough space in the buffer.
>> >>> Microsoft version doesn't null terminates its output under error conditions,
>> >>> whereas the standard C version does. On Windows, it returns -1.
>> >>> We handle both cases here and always return -1, and perform null
>> >>> termination.
>> >>
>> >> This looks fine to me.
>> >>
>> >> Thanks,
>> >> David
>> >>
>> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419
>> >>> Webrev link: http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/
>> >>>
>> >>> Testing: jprt
>> >>>
>> >>> Regards,
>> >>> Shafi
>> >>>
>>
>>

From chris.plummer at oracle.com Fri Aug 12 19:08:04 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 12 Aug 2016 12:08:04 -0700
Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
In-Reply-To:
References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com>
Message-ID: <6f978d4c-1765-cc0b-5611-55de302b7ec1@oracle.com>

Hi Shafi,

I'm not so sure the assert you added is all that useful. If you are going to keep it, please use "buffer != NULL". Testing for a NULL pointer should be explicit.

Other than that it looks fine. No need for another webrev.

thanks,

Chris

On 8/12/16 11:18 AM, Shafi Ahmad wrote:
> Hi,
>
> Please find updated webrev link.
>
> http://cr.openjdk.java.net/~shshahma/8162419/webrev.01/
>
> Regards,
> Shafi
>
>> -----Original Message-----
>> From: Chris Plummer
>> Sent: Friday, August 12, 2016 12:43 PM
>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes
>> Subject: Re: [8u] RFR for JDK-8162419:
>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968
>>
>> On 8/11/16 10:21 PM, Shafi Ahmad wrote:
>>> Hi Chris,
>>>
>>> Thanks for reviewing.
>>> >>>> -----Original Message----- >>>> From: Chris Plummer >>>> Sent: Friday, August 12, 2016 4:50 AM >>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes >>>> Subject: Re: [8u] RFR for JDK-8162419: >>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>> 8155968 >>>> >>>> Hi Shafi, >>>> >>>> Please update the copyright date to 2016 and change "numbers of" to >>>> "number of". >>>> >>>> I'm not so sure I agree with the comments in the CR that you can just >>>> backport this change to vsnprintf(), but not the other changes in the >>>> relevant changeset. For example: >>>> >>>> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 >>>> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 >>>> @@ -705,25 +705,35 @@ >>>> } >>>> >>>> void JDK_Version::to_string(char* buffer, size_t buflen) const { >>>> + assert(buffer && buflen > 0, "call with useful buffer"); >>>> size_t index = 0; >>>> + >>>> if (!is_valid()) { >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >>>> } else if (is_partially_initialized()) { >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); >>>> } else { >>>> - index += jio_snprintf( >>>> + int rc = jio_snprintf( >>>> &buffer[index], buflen - index, "%d.%d", _major, _minor); >>>> + if (rc == -1) return; >>>> + index += rc; >>>> if (_micro > 0) { >>>> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); >>>> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", >>>> + _micro); >>>> } >>>> >>>> I think your change to vsnprintf() will break >>>> JDK_Version::to_string() if the above diff if not applied. You could >>>> argue that the above code is already broken because -1 is could be >> returned to it on Windows. >>>> However, your changes expand that risk to all platforms. >>> I am agree with you. I think I have to revisit at least all reference of >> jio_snprintf for which we are using return value of this method. 
>>> shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name "*.cpp" >>> -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close >>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, >>> "replay_pid%p_compid%d.log", os::current_process_id(), compile_id); >>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, O_BUFLEN, >>> "inline_pid%p_compid%d.log", os::current_process_id(), compile_id); >>> ./src/share/vm/runtime/os.cpp: const int printed = jio_snprintf(buffer, >> buffer_length, iso8601_format, >>> ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, >> buf_sz, "%d", os::current_process_id()); >>> ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the >> buffer is not long enough to hold >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf( >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], >> buflen - index, ".%d", _micro); >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], >> buflen - index, "_%02d", _update); >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], >> buflen - index, "%c", _special); >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf(&buffer[index], >> buflen - index, "-b%02d", _build); >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >> buflen, "#%d", trap_state); >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >> buflen, "%s%s", >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >> buflen, "reason='%s' action='%s'", >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >> buflen, "reason='%s' action='%s' index='%d'", >>> ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, >>> len, "%s", (c != NULL) ? 
c : ""); >>> ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, >>> buflen, "%s: %s%s.%s%s", >>> ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, >> sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { >>> ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, message_len, >> "Could not link verifier"); >>> ./src/share/vm/utilities/ostream.cpp: int result = >> jio_snprintf(current_file_name, JVM_MAXPATHLEN, >>> ./src/share/vm/utilities/ostream.cpp: int result = >> jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" >> CURRENTAPPX, >>> ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, >>> ./src/share/vm/utilities/vmError.cpp: int fsep_len = >> jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); >>> ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, buflen, >> "%s%s", tmpdir, os::file_separator()); >>> ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, >> "verify_ref_kind expected %x", ref_kind); >>> ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, >> "verify_ref_kind expected %x", ref_kind); >>> ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, >> "verify_ref_kind expected %x", ref_kind); >>> ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, bufferSize, >>> "/cores"); >> Hi Shafi, >> >> As David pointed out, it looks like only java.cpp needs to be updated to >> account for changes you are making jio_snprintf. The others either don't use >> the result (even if it is assigned to a local) or already have special handling for >> -1. The exception is the os_bsd.cpp case. I noticed it looks buggy, both in >> JDK9 and JDK8u. >> >> cheers, >> >> Chris >>> I will resend the updated webrev. 
>>> >>> Jdk9:src/share/vm/runtime/java.cpp >>> 714 int rc = jio_snprintf( >>> 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); >>> 716 if (rc == -1) return; >>> 717 index += rc; >>> 718 if (_security > 0) { >>> 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); >>> 720 } >>> 721 if (_patch > 0) { >>> 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); >>> 723 if (rc == -1) return; >>> 724 index += rc; >>> 725 } >>> >>> After line# 719 we are not updating the index variable and hence if >> _security > 0 and _patch > 0 then in that case value of _security is getting >> overwritten by value of _patch in the buffer. >>> Is this a bug or we are ignoring _security field, in that case this is redundant >> code? Please note _security field is not there in jdk8 code. >>> Regards, >>> Shafi >>> >>> >>> >>>> cheers, >>>> >>>> Chris >>>> >>>> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >>>> > Hi, >>>> > >>>> > Could I get one more review for this safe change. >>>> > >>>> > Regards, >>>> > Shafi >>>> > >>>> >> -----Original Message----- >>>> >> From: David Holmes >>>> >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; >>>> hotspot- runtime-dev at openjdk.java.net >>>> >> Subject: Re: [8u] RFR for JDK-8162419: >>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after >>>> JDK- >> >>>> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi Ahmad >> wrote: >>>> >>> Hi, >>>> >>> >>>> >>> Please review the code change for "JDK-8162419: >>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after >>>> JDK- >> 8155968" to jdk8u. >>>> >>> Please note this is partial backport of >> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>> >>> Summary: >>>> >>> Microsoft version of vsnprintf() behaves differently from the >>>> standard C >>>>>> version when there is not enough space in the buffer. 
>>>> >>> Microsoft version doesn't null terminates its output under >>>> error conditions, >> whereas the standard C version does. On >>>> Windows, it returns -1. >>>> >>> We handle both cases here and always return -1, and perform >>>> null >> termination. >>>> >> >>>> >> This looks fine to me. >>>> >> >>>> >> Thanks, >>>> >> David >>>> >> >>>> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>> >>> Webrev link: >>>> http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>> >>> >>>> >>> Testing: jprt >>>> >>> >>>> >>> Regards, >>>> >>> Shafi >>>> >>> >>>> >>>> From karen.kinnear at oracle.com Fri Aug 12 19:13:51 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 12 Aug 2016 15:13:51 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <81e58610-e8e6-e4ce-31bb-27ff20134b87@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> <81e58610-e8e6-e4ce-31bb-27ff20134b87@oracle.com> Message-ID: <854DF677-67B3-4D25-BB56-1B48AE62B80A@oracle.com> Rachel, Thank you - I was very much hoping you would look this over, particularly the logging changes. thanks, Karen > On Aug 12, 2016, at 1:35 PM, Rachel Protacio wrote: > > Looks good to me! > > Rachel > > > On 8/12/2016 1:33 PM, Karen Kinnear wrote: >> Coleen, >> >> Good catch - I will make that change. >> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >> and preparing for potential other uses. >> >> >> Here is the updated lines: >> KlassHandle vtklass_h = vt->klass(); >> Klass* vtklass = vtklass_h(); >> if (vtklass->is_instance_klass() && >> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >> assert(method() != NULL, "must have set method"); >> } >> >> Thanks! 
>> Karen >> >>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore wrote: >>> >>> >>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>> >>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>> + KlassHandle vtklass_h = vt->klass(); >>> + Klass* vtklass = vtklass_h(); >>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>> assert(method() != NULL, "must have set method"); >>> + } >>> >>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>> >>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>> >>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. >>> >>> Coleen >>> >>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>> Please review: >>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>> >>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>> >>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>> for later classfile versions. >>>> >>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
>>>> >>>> Testing: linux-x64-slowdebug >>>> rbt hs-nightly-runtime.js >>>> jck vm,lang, api.java.lang >>>> small invocation tests >>>> >>>> thanks, >>>> Karen > From karen.kinnear at oracle.com Fri Aug 12 19:14:08 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 12 Aug 2016 15:14:08 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: Thanks Coleen. Karen > On Aug 12, 2016, at 1:46 PM, Coleen Phillimore wrote: > > > > On 8/12/16 1:33 PM, Karen Kinnear wrote: >> Coleen, >> >> Good catch - I will make that change. >> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >> and preparing for potential other uses. >> >> >> Here is the updated lines: >> KlassHandle vtklass_h = vt->klass(); >> Klass* vtklass = vtklass_h(); >> if (vtklass->is_instance_klass() && >> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >> assert(method() != NULL, "must have set method"); >> } >> > This looks good. > Thanks, > Coleen > >> Thanks! 
>> Karen >> >>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore wrote: >>> >>> >>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>> >>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>> + KlassHandle vtklass_h = vt->klass(); >>> + Klass* vtklass = vtklass_h(); >>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>> assert(method() != NULL, "must have set method"); >>> + } >>> >>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>> >>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>> >>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. >>> >>> Coleen >>> >>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>> Please review: >>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>> >>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>> >>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>> for later classfile versions. >>>> >>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
>>>> >>>> Testing: linux-x64-slowdebug >>>> rbt hs-nightly-runtime.js >>>> jck vm,lang, api.java.lang >>>> small invocation tests >>>> >>>> thanks, >>>> Karen >>> >> > From shafi.s.ahmad at oracle.com Fri Aug 12 19:16:45 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Fri, 12 Aug 2016 19:16:45 +0000 (UTC) Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <6f978d4c-1765-cc0b-5611-55de302b7ec1@oracle.com> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com> <6f978d4c-1765-cc0b-5611-55de302b7ec1@oracle.com> Message-ID: <6eb6d4cd-ade3-459c-aee3-29e7b8f4f3b7@default> Hi Chris, Thank you for reviewing it. I have just copied the assert from jdk9 code change. I will make the change "buffer != NULL". Regards, Shafi > -----Original Message----- > From: Chris Plummer > Sent: Saturday, August 13, 2016 12:38 AM > To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes > Subject: Re: [8u] RFR for JDK-8162419: > closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > 8155968 > > Hi Shafi, > > I'm not so sure the assert you added is all that useful. If you are going to keep > it, please use "buffer != NULL". Testing for a NULL pointer should be explicit. > Other than that it looks fine. No need for another webrev. > > thanks, > > Chris > > On 8/12/16 11:18 AM, Shafi Ahmad wrote: > > Hi, > > > > Please find updated webrev link. 
> > > > http://cr.openjdk.java.net/~shshahma/8162419/webrev.01/ > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Chris Plummer > >> Sent: Friday, August 12, 2016 12:43 PM > >> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes > >> Subject: Re: [8u] RFR for JDK-8162419: > >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > >> 8155968 > >> > >> On 8/11/16 10:21 PM, Shafi Ahmad wrote: > >>> Hi Chris, > >>> > >>> Thanks for reviewing. > >>> > >>>> -----Original Message----- > >>>> From: Chris Plummer > >>>> Sent: Friday, August 12, 2016 4:50 AM > >>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David > Holmes > >>>> Subject: Re: [8u] RFR for JDK-8162419: > >>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- > >>>> 8155968 > >>>> > >>>> Hi Shafi, > >>>> > >>>> Please update the copyright date to 2016 and change "numbers of" to > >>>> "number of". > >>>> > >>>> I'm not so sure I agree with the comments in the CR that you can > >>>> just backport this change to vsnprintf(), but not the other changes > >>>> in the relevant changeset. 
For example: > >>>> > >>>> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 -0800 > >>>> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 +0100 > >>>> @@ -705,25 +705,35 @@ > >>>> } > >>>> > >>>> void JDK_Version::to_string(char* buffer, size_t buflen) const > >>>> { > >>>> + assert(buffer && buflen > 0, "call with useful buffer"); > >>>> size_t index = 0; > >>>> + > >>>> if (!is_valid()) { > >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); > >>>> } else if (is_partially_initialized()) { > >>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized) pre-1.6.0"); > >>>> } else { > >>>> - index += jio_snprintf( > >>>> + int rc = jio_snprintf( > >>>> &buffer[index], buflen - index, "%d.%d", _major, > >>>> _minor); > >>>> + if (rc == -1) return; > >>>> + index += rc; > >>>> if (_micro > 0) { > >>>> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", _micro); > >>>> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", > >>>> + _micro); > >>>> } > >>>> > >>>> I think your change to vsnprintf() will break > >>>> JDK_Version::to_string() if the above diff is not applied. You > >>>> could argue that the above code is already broken because -1 > >>>> could be returned to it on Windows. > >>>> However, your changes expand that risk to all platforms. > >>> I agree with you. I think I have to revisit at least all > >>> references to jio_snprintf for which we are using the return value of this method.
> >>> shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name > "*.cpp" > >>> -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close > >>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, > >>> O_BUFLEN, "replay_pid%p_compid%d.log", os::current_process_id(), > >>> compile_id); > >>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, > >>> O_BUFLEN, "inline_pid%p_compid%d.log", os::current_process_id(), > >>> compile_id); > >>> ./src/share/vm/runtime/os.cpp: const int printed = > >>> jio_snprintf(buffer, > >> buffer_length, iso8601_format, > >>> ./src/share/vm/runtime/arguments.cpp: int ret = jio_snprintf(b, > >> buf_sz, "%d", os::current_process_id()); > >>> ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf fails or the > >> buffer is not long enough to hold > >>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf( > >>> ./src/share/vm/runtime/java.cpp: index += > jio_snprintf(&buffer[index], > >> buflen - index, ".%d", _micro); > >>> ./src/share/vm/runtime/java.cpp: index += > jio_snprintf(&buffer[index], > >> buflen - index, "_%02d", _update); > >>> ./src/share/vm/runtime/java.cpp: index += > jio_snprintf(&buffer[index], > >> buflen - index, "%c", _special); > >>> ./src/share/vm/runtime/java.cpp: index += > jio_snprintf(&buffer[index], > >> buflen - index, "-b%02d", _build); > >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > >> buflen, "#%d", trap_state); > >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > >> buflen, "%s%s", > >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > >> buflen, "reason='%s' action='%s'", > >>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, > >> buflen, "reason='%s' action='%s' index='%d'", > >>> ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, > >>> len, "%s", (c != NULL) ? 
c : ""); > >>> ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, > >>> buflen, "%s: %s%s.%s%s", > >>> ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, > >> sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { > >>> ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, > message_len, > >> "Could not link verifier"); > >>> ./src/share/vm/utilities/ostream.cpp: int result = > >> jio_snprintf(current_file_name, JVM_MAXPATHLEN, > >>> ./src/share/vm/utilities/ostream.cpp: int result = > >> jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" > >> CURRENTAPPX, > >>> ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, buflen, > >>> ./src/share/vm/utilities/vmError.cpp: int fsep_len = > >> jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); > >>> ./src/share/vm/utilities/vmError.cpp: int pos = jio_snprintf(buf, > buflen, > >> "%s%s", tmpdir, os::file_separator()); > >>> ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, > >> "verify_ref_kind expected %x", ref_kind); > >>> ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, > >> "verify_ref_kind expected %x", ref_kind); > >>> ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, > >> "verify_ref_kind expected %x", ref_kind); > >>> ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, > >>> bufferSize, "/cores"); > >> Hi Shafi, > >> > >> As David pointed out, it looks like only java.cpp needs to be updated > >> to account for changes you are making jio_snprintf. The others either > >> don't use the result (even if it is assigned to a local) or already > >> have special handling for -1. The exception is the os_bsd.cpp case. I > >> noticed it looks buggy, both in > >> JDK9 and JDK8u. > >> > >> cheers, > >> > >> Chris > >>> I will resend the updated webrev. 
> >>> > >>> Jdk9:src/share/vm/runtime/java.cpp > >>> 714 int rc = jio_snprintf( > >>> 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); > >>> 716 if (rc == -1) return; > >>> 717 index += rc; > >>> 718 if (_security > 0) { > >>> 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _security); > >>> 720 } > >>> 721 if (_patch > 0) { > >>> 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", _patch); > >>> 723 if (rc == -1) return; > >>> 724 index += rc; > >>> 725 } > >>> > >>> After line# 719 we are not updating the index variable and hence if > >> _security > 0 and _patch > 0 then in that case value of _security is > >> getting overwritten by value of _patch in the buffer. > >>> Is this a bug or we are ignoring _security field, in that case this > >>> is redundant > >> code? Please note _security field is not there in jdk8 code. > >>> Regards, > >>> Shafi > >>> > >>> > >>> > >>>> cheers, > >>>> > >>>> Chris > >>>> > >>>> On 8/11/16 5:14 AM, Shafi Ahmad wrote: > >>>> > Hi, > >>>> > > >>>> > Could I get one more review for this safe change. > >>>> > > >>>> > Regards, > >>>> > Shafi > >>>> > > >>>> >> -----Original Message----- > >>>> >> From: David Holmes > >>>> >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; > >>>> hotspot- runtime-dev at openjdk.java.net > >>>> >> Subject: Re: [8u] RFR for JDK-8162419: > >>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing > >>>> after > >>>> JDK- >> > >>>> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi > >>>> Ahmad > >> wrote: > >>>> >>> Hi, > >>>> >>> > >>>> >>> Please review the code change for "JDK-8162419: > >>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing > >>>> after > >>>> JDK- >> 8155968" to jdk8u. 
> >>>> >>> Please note this is partial backport of >> > >>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 > >>>> >>> Summary: > >>>> >>> Microsoft version of vsnprintf() behaves differently from > >>>> the standard C > >>>>>> version when there is not enough space in the buffer. > >>>> >>> Microsoft version doesn't null terminates its output under > >>>> error conditions, >> whereas the standard C version does. On > >>>> Windows, it returns -1. > >>>> >>> We handle both cases here and always return -1, and perform > >>>> null >> termination. > >>>> >> > >>>> >> This looks fine to me. > >>>> >> > >>>> >> Thanks, > >>>> >> David > >>>> >> > >>>> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 > >>>> >>> Webrev link: > >>>> http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ > >>>> >>> > >>>> >>> Testing: jprt > >>>> >>> > >>>> >>> Regards, > >>>> >>> Shafi > >>>> >>> > >>>> > >>>> > From chris.plummer at oracle.com Fri Aug 12 19:21:16 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 12 Aug 2016 12:21:16 -0700 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: <6eb6d4cd-ade3-459c-aee3-29e7b8f4f3b7@default> References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com> <6f978d4c-1765-cc0b-5611-55de302b7ec1@oracle.com> <6eb6d4cd-ade3-459c-aee3-29e7b8f4f3b7@default> Message-ID: Actually maybe best not to change it in that case. I thought I had checked JDK9 and didn't see the assert, but I see it there now. I must have been looking at the 8u version. thanks, Chris On 8/12/16 12:16 PM, Shafi Ahmad wrote: > Hi Chris, > > Thank you for reviewing it. I have just copied the assert from jdk9 code change. I will make the change "buffer != NULL". 
> > Regards, > Shafi From karen.kinnear at oracle.com Fri Aug 12 19:28:41 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 12 Aug 2016 15:28:41 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: Added a targeted test case for class files with different class file versions in the inheritance hierarchy. http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/ thanks, Karen > On Aug 12, 2016, at 1:46 PM, Coleen Phillimore wrote: > > > > On 8/12/16 1:33 PM, Karen Kinnear wrote: >> Coleen, >> >> Good catch - I will make that change. >> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >> and preparing for potential other uses.
>> >> >> Here is the updated lines: >> KlassHandle vtklass_h = vt->klass(); >> Klass* vtklass = vtklass_h(); >> if (vtklass->is_instance_klass() && >> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >> assert(method() != NULL, "must have set method"); >> } >> > This looks good. > Thanks, > Coleen > >> Thanks! >> Karen >> >>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore wrote: >>> >>> >>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>> >>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>> + KlassHandle vtklass_h = vt->klass(); >>> + Klass* vtklass = vtklass_h(); >>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>> assert(method() != NULL, "must have set method"); >>> + } >>> >>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>> >>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>> >>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. >>> >>> Coleen >>> >>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>> Please review: >>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>> >>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>> >>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>> for later classfile versions. >>>> >>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice.
>>>> >>>> Testing: linux-x64-slowdebug >>>> rbt hs-nightly-runtime.js >>>> jck vm,lang, api.java.lang >>>> small invocation tests >>>> >>>> thanks, >>>> Karen >>> >> > From harold.seigel at oracle.com Fri Aug 12 20:07:59 2016 From: harold.seigel at oracle.com (harold seigel) Date: Fri, 12 Aug 2016 16:07:59 -0400 Subject: RFR 8058575: IllegalAccessError trying to access package-private class from VM anonymous class In-Reply-To: <366ebc2f-f5f1-c605-f253-830f97e16303@oracle.com> References: <0149d99a-842d-4b18-e3b8-67eced65517e@oracle.com> <366ebc2f-f5f1-c605-f253-830f97e16303@oracle.com> Message-ID: <87b03305-d959-07bc-d4e5-3f0b9ac6b72a@oracle.com> Thanks Coleen. I'll look into this and get back to you. Harold On 8/12/2016 2:47 PM, Coleen Phillimore wrote: > > http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/src/share/vm/classfile/classFileParser.cpp.udiff.html > > > + const Klass* host_klass; > + if (_host_klass->is_objArray_klass()) { > + host_klass = ObjArrayKlass::cast(_host_klass)->element_klass(); > + } else { > + host_klass = _host_klass; > + } > + assert(host_klass->is_instance_klass(), "host klass is not an instance class"); > > > Can host_class really be an array class or is this code trying to be > defensive since host_class is Klass* and not InstanceKlass? If it can > be an objArray class, then I think you want bottom_klass() not > element_klass(), but if it can also be a typeArrayKlass, or [[I > objArrayKlass, then you don't have any sort of InstanceKlass. It > seems like the check for what sort of Klass host_class can be should > be further up the stack and the more specific type passed here. I > don't see such a check though. This doesn't seem right. > > Apart from this, everything else looks great. I even reviewed your test. > > Thanks, > Coleen > > On 8/3/16 8:15 AM, harold seigel wrote: >> Hi, >> >> Please review this fix for bug 8058575.
The fix prevents a class >> created using Unsafe.defineAnonymousClass() from being in a different >> package than its host class. Being in different packages would >> create access problems if the packages were in different modules. >> >> With this fix, if the anonymous class is in a different package then >> the JVM will throw IllegalArgumentException. If the anonymous class >> is in the unnamed package then the JVM will move the anonymous class >> into its host class's package. >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8058575 >> >> Open webrevs: >> >> http://cr.openjdk.java.net/~hseigel/bug_8058575.hs/ >> >> http://cr.openjdk.java.net/~hseigel/bug_8058575.jdk/ >> >> The fix was tested with the JCK Lang and VM tests, the hotspot, and >> java/lang, java/util and other JTreg tests, the NSK quick tests, and >> with the RBT runtime nightly tests. >> >> Thanks, Harold >> > From dean.long at oracle.com Fri Aug 12 20:19:03 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 12 Aug 2016 13:19:03 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: Message-ID: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> Sure: http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ dl On 8/12/16 6:46 AM, Frederic Parain wrote: > Dean, > > In file macroAssembler_x86.cpp, could it be possible to > get rid of the clear_pc argument? It seems completely > useless now. > > Fred > > > On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >> Ping. >> >> dl >> >> >> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>> >>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>> >>> Sorry, this issue is Confidential. The problem is similar to 8029441, >>> where we suspend a thread and use pd_get_top_frame_for_profiling() to >>> get the top frame for stack walking.
The problem is "last Java frame" >>> anchor frames on x86. In lots of places we do not store last_Java_pc. >>> This is OK in the synchronous stack walk case done by the current >>> thread. But in the asynchronous case, there are small windows where >>> it's not always safe to get PC from sp[-1]. >>> >>> The solution is not to treat x86 anchor frames as "always walkable". >>> Instead, we follow the example of sparc and make them walking by >>> filling in last_Java_pc when it's safe. >>> >>> I went for the minimal fix, resetting clear_pc to true in >>> reset_last_Java_frame() but not changing the API and all the callers. >>> I can fix this if reviewers feel strongly about it. >>> >>> dl >>> >> From leelamohan.venati at gmail.com Fri Aug 12 22:21:45 2016 From: leelamohan.venati at gmail.com (Leela Mohan) Date: Fri, 12 Aug 2016 15:21:45 -0700 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" Message-ID: Hi experts, It looks like, we don't disallow setting capabilities when VM is in "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, ciEnv caches the JVMTI state and expects those assumptions to be true during the compilation. Otherwise, dump the compiled method. However, we don't seem to do anything with the methods which were compiled before setting the capability. What is the understanding? Thanks, Leela From daniel.daugherty at oracle.com Fri Aug 12 23:04:25 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 12 Aug 2016 17:04:25 -0600 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: References: Message-ID: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> On 8/12/16 4:21 PM, Leela Mohan wrote: > Hi experts, > > It looks like, we don't disallow setting capabilities when VM is in > "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, > ciEnv caches the JVMTI state and expects those assumptions to be true > during the compilation. Otherwise, dump the compiled method. 
> > However, we don't seem to do anything with the methods which were compiled > before setting the capability. > > What is the understanding? > > Thanks, > Leela Hi Leela, I'm guessing that you are talking about this capability: http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#jvmtiCapabilities.can_generate_compiled_method_load_events and this event: http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad The can_generate_compiled_method_load_events capability needs to be added in order to generate CompiledMethodLoad events. Capabilities are added via http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#AddCapabilities which can be called from different JVM/TI phases. Different VM implementations can require that certain capabilities can only be added in certain JVM/TI phases. However, if AddCapabilities() does not return a JVM/TI error when a capability is added in a phase, e.g., the live phase, then you can safely assume that the capability has been added. In your example, it sounds like the capability is added in the live phase because you are seeing events generated for newly compiled methods. In order to see synthetic events for methods that were compiled before you added the capability, your agent needs to use a different function: http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#GenerateEvents The documentation for Compiled Method Load has this line: > These events can be sent after their initial occurrence with GenerateEvents. and that sounds just like your situation. Hope this helps. 
Dan From leelamohan.venati at gmail.com Fri Aug 12 23:27:05 2016 From: leelamohan.venati at gmail.com (Leela Mohan) Date: Fri, 12 Aug 2016 16:27:05 -0700 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> References: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> Message-ID: Hi Daniel, Actually, I was thinking about the case where compiler choose not to have complete "de-opt" state. For ex: Local pruning. I can also think of other cases which need callback events like, posting exceptions to the agent. JVMTI requests for examining/changing the stack frame would conservatively de-optimize the compile methods but not all de-optimizable locations can restore the java state user expect. What are the expectations for VM for these cases ? Thanks, Leela On Fri, Aug 12, 2016 at 4:04 PM, Daniel D. Daugherty < daniel.daugherty at oracle.com> wrote: > On 8/12/16 4:21 PM, Leela Mohan wrote: > >> Hi experts, >> >> It looks like, we don't disallow setting capabilities when VM is in >> "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, >> ciEnv caches the JVMTI state and expects those assumptions to be true >> during the compilation. Otherwise, dump the compiled method. >> >> However, we don't seem to do anything with the methods which were compiled >> before setting the capability. >> >> What is the understanding? >> >> Thanks, >> Leela >> > > Hi Leela, > > I'm guessing that you are talking about this capability: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht > ml#jvmtiCapabilities.can_generate_compiled_method_load_events > > and this event: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht > ml#CompiledMethodLoad > > > The can_generate_compiled_method_load_events capability needs to be added > in order to generate CompiledMethodLoad events. 
Capabilities are added via > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#AddCapabilities > which can be called from different JVM/TI phases. Different VM > implementations > can require that certain capabilities can only be added in certain JVM/TI > phases. > However, if AddCapabilities() does not return a JVM/TI error when a > capability > is added in a phase, e.g., the live phase, then you can safely assume that > the capability has been added. > > > In your example, it sounds like the capability is added in the live phase > because you are seeing events generated for newly compiled methods. In > order > to see synthetic events for methods that were compiled before you added the > capability, your agent needs to use a different function: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#GenerateEvents > > The documentation for Compiled Method Load has this line: > > > These events can be sent after their initial occurrence with > GenerateEvents. > > and that sounds just like your situation. > > Hope this helps. > > Dan > > From daniel.daugherty at oracle.com Fri Aug 12 23:33:29 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 12 Aug 2016 17:33:29 -0600 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: References: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> Message-ID: <716a0ec1-b6d9-0c18-4a55-ffb889dcc633@oracle.com> On 8/12/16 5:27 PM, Leela Mohan wrote: > Hi Daniel, > > Actually, I was thinking about the case where compiler choose not to > have complete "de-opt" state. For ex: Local pruning. I can also think > of other cases which need callback events like, posting exceptions to > the agent. > > JVMTI requests for examining/changing the stack frame would > conservatively de-optimize the compile methods but not all > de-optimizable locations can restore the java state user expect. > > What are the expectations for VM for these cases ?
I think we'll have to wait for someone more current in how the compilers interact with JVM/TI to chime in here. I stopped actively working on JVM/TI back in 2010 or so... :-) Dan > > Thanks, > Leela > > On Fri, Aug 12, 2016 at 4:04 PM, Daniel D. Daugherty > > wrote: > > On 8/12/16 4:21 PM, Leela Mohan wrote: > > Hi experts, > > It looks like, we don't disallow setting capabilities when VM > is in > "JVMTI_PHASE_LIVE". And, I notice, for every new compilation > of method, > ciEnv caches the JVMTI state and expects those assumptions to > be true > during the compilation. Otherwise, dump the compiled method. > > However, we don't seem to do anything with the methods which > were compiled > before setting the capability. > > What is the understanding? > > Thanks, > Leela > > > Hi Leela, > > I'm guessing that you are talking about this capability: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#jvmtiCapabilities.can_generate_compiled_method_load_events > > > and this event: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad > > > > The can_generate_compiled_method_load_events capability needs to > be added > in order to generate CompiledMethodLoad events. Capabilities are > added via > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#AddCapabilities > > which can be called from different JVM/TI phases. Different VM > implementations > can require that certain capabilities can only be added in certain > JVM/TI phases. > However, if AddCapabilities() does not return a JVM/TI error when > a capability > is added in a phase, e.g., the live phase, then you can safely > assume that > the capability has been added. > > > In your example, it sounds like the capability is added in the > live phase > because you are seeing events generated for newly compiled > methods. 
In order > to see synthetic events for methods that were compiled before you > added the > capability, your agent needs to use a different function: > > http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#GenerateEvents > > > The documentation for Compiled Method Load has this line: > > > These events can be sent after their initial occurrence with > GenerateEvents. > > and that sounds just like your situation. > > Hope this helps. > > Dan > > From david.holmes at oracle.com Mon Aug 15 01:39:15 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Aug 2016 11:39:15 +1000 Subject: [8u] RFR for JDK-8162419: closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK-8155968 In-Reply-To: References: <740a30c4-ceca-4ede-81b5-853a4c732070@default> <6ed5bbfd-b29a-466d-99f6-fdb649e19510@default> <803179b0-3237-cd0a-4c13-11c57605e345@oracle.com> <6f978d4c-1765-cc0b-5611-55de302b7ec1@oracle.com> <6eb6d4cd-ade3-459c-aee3-29e7b8f4f3b7@default> Message-ID: <5523752e-38ba-76e5-6f27-fa08316d79ee@oracle.com> I am okay with this version as well. Thanks, David On 13/08/2016 5:21 AM, Chris Plummer wrote: > Actually maybe best not to change it in that case. I thought I had > checked JDK9 and didn't see the assert, but I see it there now. I must > have been looking at the 8u version. > > thanks, > > Chris > > On 8/12/16 12:16 PM, Shafi Ahmad wrote: >> Hi Chris, >> >> Thank you for reviewing it. I have just copied the assert from jdk9 >> code change. I will make the change "buffer != NULL". >> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: Chris Plummer >>> Sent: Saturday, August 13, 2016 12:38 AM >>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes >>> Subject: Re: [8u] RFR for JDK-8162419: >>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>> 8155968 >>> >>> Hi Shafi, >>> >>> I'm not so sure the assert you added is all that useful. 
If you are >>> going to keep >>> it, please use "buffer != NULL". Testing for a NULL pointer should be >>> explicit. >>> Other than that it looks fine. No need for another webrev. >>> >>> thanks, >>> >>> Chris >>> >>> On 8/12/16 11:18 AM, Shafi Ahmad wrote: >>>> Hi, >>>> >>>> Please find updated webrev link. >>>> >>>> http://cr.openjdk.java.net/~shshahma/8162419/webrev.01/ >>>> >>>> Regards, >>>> Shafi >>>> >>>>> -----Original Message----- >>>>> From: Chris Plummer >>>>> Sent: Friday, August 12, 2016 12:43 PM >>>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David Holmes >>>>> Subject: Re: [8u] RFR for JDK-8162419: >>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>> 8155968 >>>>> >>>>> On 8/11/16 10:21 PM, Shafi Ahmad wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thanks for reviewing. >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Chris Plummer >>>>>>> Sent: Friday, August 12, 2016 4:50 AM >>>>>>> To: Shafi Ahmad; hotspot-runtime-dev at openjdk.java.net; David >>> Holmes >>>>>>> Subject: Re: [8u] RFR for JDK-8162419: >>>>>>> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing after JDK- >>>>>>> 8155968 >>>>>>> >>>>>>> Hi Shafi, >>>>>>> >>>>>>> Please update the copyright date to 2016 and change "numbers of" to >>>>>>> "number of". >>>>>>> >>>>>>> I'm not so sure I agree with the comments in the CR that you can >>>>>>> just backport this change to vsnprintf(), but not the other changes >>>>>>> in the relevant changeset. 
For example: >>>>>>> >>>>>>> --- a/src/share/vm/runtime/java.cpp Mon Nov 03 11:34:13 2014 >>>>>>> -0800 >>>>>>> +++ b/src/share/vm/runtime/java.cpp Wed Oct 29 10:13:24 2014 >>> +0100 >>>>>>> @@ -705,25 +705,35 @@ >>>>>>> } >>>>>>> >>>>>>> void JDK_Version::to_string(char* buffer, size_t buflen) const >>>>>>> { >>>>>>> + assert(buffer && buflen > 0, "call with useful buffer"); >>>>>>> size_t index = 0; >>>>>>> + >>>>>>> if (!is_valid()) { >>>>>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized)"); >>>>>>> } else if (is_partially_initialized()) { >>>>>>> jio_snprintf(buffer, buflen, "%s", "(uninitialized) >>>>>>> pre-1.6.0"); >>>>>>> } else { >>>>>>> - index += jio_snprintf( >>>>>>> + int rc = jio_snprintf( >>>>>>> &buffer[index], buflen - index, "%d.%d", _major, >>>>>>> _minor); >>>>>>> + if (rc == -1) return; >>>>>>> + index += rc; >>>>>>> if (_micro > 0) { >>>>>>> - index += jio_snprintf(&buffer[index], buflen - index, ".%d", >>> _micro); >>>>>>> + rc = jio_snprintf(&buffer[index], buflen - index, ".%d", >>>>>>> + _micro); >>>>>>> } >>>>>>> >>>>>>> I think your change to vsnprintf() will break >>>>>>> JDK_Version::to_string() if the above diff if not applied. You >>>>>>> could argue that the above code is already broken because -1 is >>>>>>> could be >>>>> returned to it on Windows. >>>>>>> However, your changes expand that risk to all platforms. >>>>>> I am agree with you. I think I have to revisit at least all >>>>>> reference of >>>>> jio_snprintf for which we are using return value of this method. 
>>>>>> shafi at shafi-ahmad:~/Java/jdk8/jdk8u-dev/hotspot$ find ./ -name >>> "*.cpp" >>>>>> -exec grep -H jio_snprintf {} \; | egrep "=|if" | grep -v close >>>>>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, >>>>>> O_BUFLEN, "replay_pid%p_compid%d.log", os::current_process_id(), >>>>>> compile_id); >>>>>> ./src/share/vm/ci/ciEnv.cpp: int ret = jio_snprintf(buffer, >>>>>> O_BUFLEN, "inline_pid%p_compid%d.log", os::current_process_id(), >>>>>> compile_id); >>>>>> ./src/share/vm/runtime/os.cpp: const int printed = >>>>>> jio_snprintf(buffer, >>>>> buffer_length, iso8601_format, >>>>>> ./src/share/vm/runtime/arguments.cpp: int ret = >>>>>> jio_snprintf(b, >>>>> buf_sz, "%d", os::current_process_id()); >>>>>> ./src/share/vm/runtime/arguments.cpp: // if jio_snprintf >>>>>> fails or the >>>>> buffer is not long enough to hold >>>>>> ./src/share/vm/runtime/java.cpp: index += jio_snprintf( >>>>>> ./src/share/vm/runtime/java.cpp: index += >>> jio_snprintf(&buffer[index], >>>>> buflen - index, ".%d", _micro); >>>>>> ./src/share/vm/runtime/java.cpp: index += >>> jio_snprintf(&buffer[index], >>>>> buflen - index, "_%02d", _update); >>>>>> ./src/share/vm/runtime/java.cpp: index += >>> jio_snprintf(&buffer[index], >>>>> buflen - index, "%c", _special); >>>>>> ./src/share/vm/runtime/java.cpp: index += >>> jio_snprintf(&buffer[index], >>>>> buflen - index, "-b%02d", _build); >>>>>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >>>>> buflen, "#%d", trap_state); >>>>>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >>>>> buflen, "%s%s", >>>>>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >>>>> buflen, "reason='%s' action='%s'", >>>>>> ./src/share/vm/runtime/deoptimization.cpp: len = jio_snprintf(buf, >>>>> buflen, "reason='%s' action='%s' index='%d'", >>>>>> ./src/share/vm/services/diagnosticArgument.cpp: jio_snprintf(buf, >>>>>> len, "%s", (c != NULL) ? 
c : ""); >>>>>> ./src/share/vm/classfile/vmSymbols.cpp: int len = jio_snprintf(buf, >>>>>> buflen, "%s: %s%s.%s%s", >>>>>> ./src/share/vm/classfile/classLoader.cpp: if (jio_snprintf(path, >>>>> sizeof(path), "%s%s%s", _dir, os::file_separator(), name) == -1) { >>>>>> ./src/share/vm/classfile/verifier.cpp: jio_snprintf(message, >>> message_len, >>>>> "Could not link verifier"); >>>>>> ./src/share/vm/utilities/ostream.cpp: int result = >>>>> jio_snprintf(current_file_name, JVM_MAXPATHLEN, >>>>>> ./src/share/vm/utilities/ostream.cpp: int result = >>>>> jio_snprintf(current_file_name, JVM_MAXPATHLEN, "%s.%d" >>>>> CURRENTAPPX, >>>>>> ./src/share/vm/utilities/vmError.cpp: int n = jio_snprintf(buf, >>>>>> buflen, >>>>>> ./src/share/vm/utilities/vmError.cpp: int fsep_len = >>>>> jio_snprintf(&buf[pos], buflen-pos, "%s", os::file_separator()); >>>>>> ./src/share/vm/utilities/vmError.cpp: int pos = >>>>>> jio_snprintf(buf, >>> buflen, >>>>> "%s%s", tmpdir, os::file_separator()); >>>>>> ./src/cpu/ppc/vm/methodHandles_ppc.cpp: jio_snprintf(buf, 100, >>>>> "verify_ref_kind expected %x", ref_kind); >>>>>> ./src/cpu/x86/vm/methodHandles_x86.cpp: jio_snprintf(buf, 100, >>>>> "verify_ref_kind expected %x", ref_kind); >>>>>> ./src/cpu/sparc/vm/methodHandles_sparc.cpp: jio_snprintf(buf, 100, >>>>> "verify_ref_kind expected %x", ref_kind); >>>>>> ./src/os/bsd/vm/os_bsd.cpp: int n = jio_snprintf(buffer, >>>>>> bufferSize, "/cores"); >>>>> Hi Shafi, >>>>> >>>>> As David pointed out, it looks like only java.cpp needs to be updated >>>>> to account for changes you are making jio_snprintf. The others either >>>>> don't use the result (even if it is assigned to a local) or already >>>>> have special handling for -1. The exception is the os_bsd.cpp case. I >>>>> noticed it looks buggy, both in >>>>> JDK9 and JDK8u. >>>>> >>>>> cheers, >>>>> >>>>> Chris >>>>>> I will resend the updated webrev. 
>>>>>> >>>>>> Jdk9:src/share/vm/runtime/java.cpp >>>>>> 714 int rc = jio_snprintf( >>>>>> 715 &buffer[index], buflen - index, "%d.%d", _major, _minor); >>>>>> 716 if (rc == -1) return; >>>>>> 717 index += rc; >>>>>> 718 if (_security > 0) { >>>>>> 719 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", >>>>>> _security); >>>>>> 720 } >>>>>> 721 if (_patch > 0) { >>>>>> 722 rc = jio_snprintf(&buffer[index], buflen - index, ".%d", >>>>>> _patch); >>>>>> 723 if (rc == -1) return; >>>>>> 724 index += rc; >>>>>> 725 } >>>>>> >>>>>> After line# 719 we are not updating the index variable and hence if >>>>> _security > 0 and _patch > 0 then in that case value of _security is >>>>> getting overwritten by value of _patch in the buffer. >>>>>> Is this a bug or we are ignoring _security field, in that case this >>>>>> is redundant >>>>> code? Please note _security field is not there in jdk8 code. >>>>>> Regards, >>>>>> Shafi >>>>>> >>>>>> >>>>>> >>>>>>> cheers, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 8/11/16 5:14 AM, Shafi Ahmad wrote: >>>>>>> > Hi, >>>>>>> > >>>>>>> > Could I get one more review for this safe change. >>>>>>> > >>>>>>> > Regards, >>>>>>> > Shafi >>>>>>> > >>>>>>> >> -----Original Message----- >>>>>>> >> From: David Holmes >>>>>>> >> Sent: Thursday, August 11, 2016 9:52 AM >> To: Shafi Ahmad; >>>>>>> hotspot- runtime-dev at openjdk.java.net >>>>>>> >> Subject: Re: [8u] RFR for JDK-8162419: >>>>>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing >>>>>>> after >>>>>>> JDK- >> >>>>>>> 8155968 >> >> Hi Shafi, >> >> On 10/08/2016 6:34 PM, Shafi >>>>>>> Ahmad >>>>> wrote: >>>>>>> >>> Hi, >>>>>>> >>> >>>>>>> >>> Please review the code change for "JDK-8162419: >>>>>>> >> closed/com/oracle/jfr/runtime/TestVMInfoEvent.sh failing >>>>>>> after >>>>>>> JDK- >> 8155968" to jdk8u. 
>>>>>>> >>> Please note this is partial backport of >> >>>>>>> >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/110ec5963eb1#l23.1 >>>>>>> >>> Summary: >>>>>>> >>> Microsoft version of vsnprintf() behaves differently from >>>>>>> the standard C >>>>>>>>> version when there is not enough space in the buffer. >>>>>>> >>> Microsoft version doesn't null terminates its output under >>>>>>> error conditions, >> whereas the standard C version does. On >>>>>>> Windows, it returns -1. >>>>>>> >>> We handle both cases here and always return -1, and perform >>>>>>> null >> termination. >>>>>>> >> >>>>>>> >> This looks fine to me. >>>>>>> >> >>>>>>> >> Thanks, >>>>>>> >> David >>>>>>> >> >>>>>>> >>> Jdk8 bug: https://bugs.openjdk.java.net/browse/JDK-8162419 >>>>>>> >>> Webrev link: >>>>>>> http://cr.openjdk.java.net/~shshahma/8162419/webrev.00/ >>>>>>> >>> >>>>>>> >>> Testing: jprt >>>>>>> >>> >>>>>>> >>> Regards, >>>>>>> >>> Shafi >>>>>>> >>> >>>>>>> >>>>>>> > From david.holmes at oracle.com Mon Aug 15 04:23:17 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Aug 2016 14:23:17 +1000 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> Message-ID: On 12/08/2016 7:04 PM, Dmitry Samersoff wrote: > David, > > Updated webrev is: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.04/ Thanks, this looks okay. Only minor concern is whether we have to apply casts to the results of geteuid() and st.st_uid when used with %d format specifier? > Windows is absolutely different story that requires significant efforts > to reproduce error conditions and test changes. Also it has nothing to > do with ARMv7. 
> > So I would prefer to address windows issues separately either as a part > of JDK-8159799 or as a separate CR. Ok. Thanks, David > -Dmitry > > On 2016-08-12 03:24, David Holmes wrote: >> Hi Dmitry, >> >> On 12/08/2016 2:55 AM, Dmitry Samersoff wrote: >>> David, >>> >>> Please see updated webrev. >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ >>> >>> I didn't touch windows version because it quite different from *NIX one. >> >> Do we ever see failures on Windows? Is so we should add diagnostics >> there too even if they are different to *NIX. >> >> I would still like to see what file it is working with. We need some >> logging in here: >> >> bool AttachListener::is_init_trigger() { >> if (init_at_startup() || is_initialized()) { >> return false; // initialized at startup or already >> initialized >> } >> char fn[PATH_MAX+1]; >> sprintf(fn, ".attach_pid%d", os::current_process_id()); >> int ret; >> struct stat64 st; >> RESTARTABLE(::stat64(fn, &st), ret); >> if (ret == -1) { >> + log ("failed to find attach file: %s, trying alternate", fn) >> snprintf(fn, sizeof(fn), "%s/.attach_pid%d", >> os::get_temp_directory(), os::current_process_id()); >> RESTARTABLE(::stat64(fn, &st), ret); >> + if (ret == -1) { >> + log("failed to find attach file: %s", fn); >> + } >> } >> >> All failure paths need to show us what it was that failed. >> >> typos: trigerred -> triggered >> >> Thanks, >> David >> >>> -Dmitry >>> >>> On 2016-08-08 02:40, David Holmes wrote: >>>> Hi Dmitry, >>>> >>>> On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >>>>> Everybody, >>>>> >>>>> Please review the fix: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >>>>> >>>>> Problem: >>>>> Tests fail intermittently because it can't attach to child process, >>>>> these attach failures is hard to debug because attach framework >>>>> doesn't provide enough diagnostic information. 
>>>>> >>>>> Solution: >>>>> >>>>> a) Increase attach timeout >>>>> b) Slightly change attach loop to save a bit of CPU power. >>>>> c) Add some logging to attach listener. >>>>> >>>>> It's just a first step in this direction. Complete cleanup of attach >>>>> code (remove LinuxThreads support and convert all printing to UL) is >>>>> not >>>>> a goal of this fix - I'll file a separate CR for it. >>>> >>>> I still think you need more logging now to aid in debugging these cases. >>>> In particular we want to be able to verify that the path of the attach >>>> file is what we expect in all cases ie whether we find the .attach_pid >>>> file in cwd or whether we are looking in temp directory, and whether we >>>> ultimately succeed or fail. >>>> >>>> Plus whatever you do now should be done consistently for all platforms. >>>> >>>> Thanks, >>>> David >>>> >>>>> -Dmitry >>>>> >>> >>> > > From dmitry.samersoff at oracle.com Mon Aug 15 08:48:43 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 15 Aug 2016 11:48:43 +0300 Subject: RFR(S): JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> Message-ID: <61d03ed0-770f-ffd6-67e8-9d7078272ea1@oracle.com> David, > Thanks, this looks okay. Thank you for the review! > Only minor concern is whether we have to > apply casts to the results of geteuid() and st.st_uid when used with > %d format specifier? I didn't see any complaints, either from jprt or when building locally. uid_t is a 4-byte type on both 32-bit and 64-bit, so I don't think we need to cast. -Dmitry On 2016-08-15 07:23, David Holmes wrote: > On 12/08/2016 7:04 PM, Dmitry Samersoff wrote: >> David, >> >> Updated webrev is: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.04/ > > Thanks, this looks okay.
Only minor concern is whether we have to > apply casts to the results of geteuid() and st.st_uid when used with > %d format specifier? > >> Windows is absolutely different story that requires significant >> efforts to reproduce error conditions and test changes. Also it has >> nothing to do with ARMv7. >> >> So I would prefer to address windows issues separately either as a >> part of JDK-8159799 or as a separate CR. > > Ok. > > Thanks, David > >> -Dmitry >> >> On 2016-08-12 03:24, David Holmes wrote: >>> Hi Dmitry, >>> >>> On 12/08/2016 2:55 AM, Dmitry Samersoff wrote: >>>> David, >>>> >>>> Please see updated webrev. >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ >>>> >>>> I didn't touch windows version because it quite different from >>>> *NIX one. >>> >>> Do we ever see failures on Windows? Is so we should add >>> diagnostics there too even if they are different to *NIX. >>> >>> I would still like to see what file it is working with. We need >>> some logging in here: >>> >>> bool AttachListener::is_init_trigger() { if (init_at_startup() || >>> is_initialized()) { return false; // initialized at >>> startup or already initialized } char fn[PATH_MAX+1]; sprintf(fn, >>> ".attach_pid%d", os::current_process_id()); int ret; struct >>> stat64 st; RESTARTABLE(::stat64(fn, &st), ret); if (ret == -1) { >>> + log ("failed to find attach file: %s, trying alternate", >>> fn) snprintf(fn, sizeof(fn), "%s/.attach_pid%d", >>> os::get_temp_directory(), os::current_process_id()); >>> RESTARTABLE(::stat64(fn, &st), ret); + if (ret == -1) { + >>> log("failed to find attach file: %s", fn); + } } >>> >>> All failure paths need to show us what it was that failed. 
>>> >>> typos: trigerred -> triggered >>> >>> Thanks, David >>> >>>> -Dmitry >>>> >>>> On 2016-08-08 02:40, David Holmes wrote: >>>>> Hi Dmitry, >>>>> >>>>> On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >>>>>> Everybody, >>>>>> >>>>>> Please review the fix: >>>>>> >>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >>>>>> >>>>>> >>>>>> Problem: >>>>>> Tests fail intermittently because it can't attach to child >>>>>> process, these attach failures is hard to debug because >>>>>> attach framework doesn't provide enough diagnostic >>>>>> information. >>>>>> >>>>>> Solution: >>>>>> >>>>>> a) Increase attach timeout b) Slightly change attach loop >>>>>> to save a bit of CPU power. c) Add some logging to attach >>>>>> listener. >>>>>> >>>>>> It's just a first step in this direction. Complete cleanup >>>>>> of attach code (remove LinuxThreads support and convert all >>>>>> printing to UL) is not a goal of this fix - I'll file a >>>>>> separate CR for it. >>>>> >>>>> I still think you need more logging now to aid in debugging >>>>> these cases. In particular we want to be able to verify that >>>>> the path of the attach file is what we expect in all cases ie >>>>> whether we find the .attach_pid file in cwd or whether we >>>>> are looking in temp directory, and whether we ultimately >>>>> succeed or fail. >>>>> >>>>> Plus whatever you do now should be done consistently for all >>>>> platforms. >>>>> >>>>> Thanks, David >>>>> >>>>>> -Dmitry >>>>>> >>>> >>>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
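
[Editorial sketch] The two review points in this thread (log the path on every failure path, and the question of casting uid_t values for printing) can be illustrated with a small standalone C++ example. This is a hedged sketch, not the HotSpot code: attach_file_triggers(), log_msg() and g_log are hypothetical names invented here, the directories are passed in as parameters instead of coming from os::get_temp_directory(), and the RESTARTABLE retry wrapper is omitted. The explicit casts to unsigned long are one portable way to print a uid_t whose exact width is not assumed; the thread itself concluded a cast was not needed on the platforms in question.

```cpp
#include <cstdio>
#include <string>
#include <vector>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for the VM's logging: records each message and
// echoes it to stderr so that every failure path leaves a trace.
static std::vector<std::string> g_log;

static void log_msg(const std::string& msg) {
  g_log.push_back(msg);
  std::fprintf(stderr, "%s\n", msg.c_str());
}

// Looks for ".attach_pid<pid>" first in cwd_dir, then in tmp_dir.
// Returns true only if the file exists and is owned by the effective user.
// Every path that returns false logs which file was involved.
static bool attach_file_triggers(const std::string& cwd_dir,
                                 const std::string& tmp_dir,
                                 long pid) {
  const std::string name = ".attach_pid" + std::to_string(pid);
  std::string path = cwd_dir + "/" + name;
  struct stat st;
  if (::stat(path.c_str(), &st) == -1) {
    log_msg("failed to find attach file: " + path + ", trying alternate");
    path = tmp_dir + "/" + name;
    if (::stat(path.c_str(), &st) == -1) {
      log_msg("failed to find attach file: " + path);
      return false;
    }
  }
  const uid_t euid = ::geteuid();
  if (st.st_uid != euid) {
    // Cast explicitly so the format specifier matches the argument type
    // regardless of how uid_t is defined on the platform.
    char buf[96];
    std::snprintf(buf, sizeof(buf), " (file uid %lu, process euid %lu)",
                  (unsigned long)st.st_uid, (unsigned long)euid);
    log_msg("attach file has wrong owner: " + path + buf);
    return false;
  }
  log_msg("attach triggered by " + path);
  return true;
}
```

A caller would invoke it as attach_file_triggers(".", "/tmp", (long)getpid()); the log then shows which of the two candidate paths was tried and why the lookup succeeded or failed.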
From dmitry.samersoff at oracle.com Mon Aug 15 08:49:33 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 15 Aug 2016 11:49:33 +0300 Subject: RFR(S): *NEED SECOND* JDK-8157236 - attach on ARMv7 fails with com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file In-Reply-To: References: <1cf08a48-d7c0-1953-08ef-5d75f8225a3c@oracle.com> <9a8c4cdb-0389-5d0b-6b87-7167ed66c655@oracle.com> <0e789ca0-bd59-08ee-e9a8-da0646f06780@oracle.com> Message-ID: <432f2abb-4652-2e08-0af3-70e4ad308ea8@oracle.com> On 2016-08-12 12:04, Dmitry Samersoff wrote: > David, > > Updated webrev is: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.04/ > > Windows is absolutely different story that requires significant efforts > to reproduce error conditions and test changes. Also it has nothing to > do with ARMv7. > > So I would prefer to address windows issues separately either as a part > of JDK-8159799 or as a separate CR. > > -Dmitry > > On 2016-08-12 03:24, David Holmes wrote: >> Hi Dmitry, >> >> On 12/08/2016 2:55 AM, Dmitry Samersoff wrote: >>> David, >>> >>> Please see updated webrev. >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.03/ >>> >>> I didn't touch windows version because it quite different from *NIX one. >> >> Do we ever see failures on Windows? Is so we should add diagnostics >> there too even if they are different to *NIX. >> >> I would still like to see what file it is working with. 
We need some >> logging in here: >> >> bool AttachListener::is_init_trigger() { >> if (init_at_startup() || is_initialized()) { >> return false; // initialized at startup or already >> initialized >> } >> char fn[PATH_MAX+1]; >> sprintf(fn, ".attach_pid%d", os::current_process_id()); >> int ret; >> struct stat64 st; >> RESTARTABLE(::stat64(fn, &st), ret); >> if (ret == -1) { >> + log ("failed to find attach file: %s, trying alternate", fn) >> snprintf(fn, sizeof(fn), "%s/.attach_pid%d", >> os::get_temp_directory(), os::current_process_id()); >> RESTARTABLE(::stat64(fn, &st), ret); >> + if (ret == -1) { >> + log("failed to find attach file: %s", fn); >> + } >> } >> >> All failure paths need to show us what it was that failed. >> >> typos: trigerred -> triggered >> >> Thanks, >> David >> >>> -Dmitry >>> >>> On 2016-08-08 02:40, David Holmes wrote: >>>> Hi Dmitry, >>>> >>>> On 5/08/2016 7:25 PM, Dmitry Samersoff wrote: >>>>> Everybody, >>>>> >>>>> Please review the fix: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8157236/webrev.02/ >>>>> >>>>> Problem: >>>>> Tests fail intermittently because it can't attach to child process, >>>>> these attach failures is hard to debug because attach framework >>>>> doesn't provide enough diagnostic information. >>>>> >>>>> Solution: >>>>> >>>>> a) Increase attach timeout >>>>> b) Slightly change attach loop to save a bit of CPU power. >>>>> c) Add some logging to attach listener. >>>>> >>>>> It's just a first step in this direction. Complete cleanup of attach >>>>> code (remove LinuxThreads support and convert all printing to UL) is >>>>> not >>>>> a goal of this fix - I'll file a separate CR for it. >>>> >>>> I still think you need more logging now to aid in debugging these cases. 
>>>> In particular we want to be able to verify that the path of the attach >>>> file is what we expect in all cases ie whether we find the .attach_pid >>>> file in cwd or whether we are looking in temp directory, and whether we >>>> ultimately succeed or fail. >>>> >>>> Plus whatever you do now should be done consistently for all platforms. >>>> >>>> Thanks, >>>> David >>>> >>>>> -Dmitry >>>>> >>> >>> > > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From frederic.parain at oracle.com Mon Aug 15 13:26:17 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Mon, 15 Aug 2016 09:26:17 -0400 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> Message-ID: Thank you, Looks good to me. Fred On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: > Sure: > > http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ > > dl > > > On 8/12/16 6:46 AM, Frederic Parain wrote: >> Dean, >> >> In file macroAssembler_x86.cpp, could it be possible to >> get rid of the clear_pc argument? It seems completely >> useless now. >> >> Fred >> >> >> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>> Ping. >>> >>> dl >>> >>> >>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>> >>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>> >>>> Sorry, this issue is Confidential. The problem is similar to 8029441, >>>> where we suspend a thread and use pd_get_top_frame_for_profiling() to >>>> get the top frame for stack walking. The problem is "last Java frame" >>>> anchor frames on x86. In lots of places we do not store last_Java_pc. >>>> This is OK in the synchronous stack walk case done by the current >>>> thread. 
But in the asynchronous case, there are small windows where >>>> it's not always safe to get PC from sp[-1]. >>>> >>>> The solution is not to treat x86 anchor frames as "always walkable". >>>> Instead, we follow the example of sparc and make them walking by >>>> filling in last_Java_pc when it's safe. >>>> >>>> I went for the minimal fix, resetting clear_pc to true in >>>> reset_last_Java_frame() but not changing the API and all the callers. >>>> I can fix this if reviewers feel strongly about it. >>>> >>>> dl >>>> >>> > From dean.long at oracle.com Mon Aug 15 17:57:00 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 15 Aug 2016 10:57:00 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> Message-ID: <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> Thanks Fred. Still waiting for a Reviewer. dl On 8/15/16 6:26 AM, Frederic Parain wrote: > Thank you, > > Looks good to me. > > Fred > > On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: >> Sure: >> >> http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ >> >> dl >> >> >> On 8/12/16 6:46 AM, Frederic Parain wrote: >>> Dean, >>> >>> In file macroAssembler_x86.cpp, could it be possible to >>> get rid of the clear_pc argument? It seems completely >>> useless now. >>> >>> Fred >>> >>> >>> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>>> Ping. >>>> >>>> dl >>>> >>>> >>>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>>> >>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>>> >>>>> Sorry, this issue is Confidential. The problem is similar to >>>>> 8029441, >>>>> where we suspend a thread and use pd_get_top_frame_for_profiling() to >>>>> get the top frame for stack walking. The problem is "last Java >>>>> frame" >>>>> anchor frames on x86. 
In lots of places we do not store last_Java_pc. >>>>> This is OK in the synchronous stack walk case done by the current >>>>> thread. But in the asynchronous case, there are small windows where >>>>> it's not always safe to get PC from sp[-1]. >>>>> >>>>> The solution is not to treat x86 anchor frames as "always walkable". >>>>> Instead, we follow the example of sparc and make them walking by >>>>> filling in last_Java_pc when it's safe. >>>>> >>>>> I went for the minimal fix, resetting clear_pc to true in >>>>> reset_last_Java_frame() but not changing the API and all the callers. >>>>> I can fix this if reviewers feel strongly about it. >>>>> >>>>> dl >>>>> >>>> >> From aph at redhat.com Mon Aug 15 18:02:03 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 15 Aug 2016 19:02:03 +0100 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> Message-ID: <57B2039B.9050307@redhat.com> I've written an AArch64 version of this, but given that the bug is secret I can't figure out how to test it properly. Andrew. From dean.long at oracle.com Mon Aug 15 18:41:42 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 15 Aug 2016 11:41:42 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <57B2039B.9050307@redhat.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> Message-ID: <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> On 8/15/16 11:02 AM, Andrew Haley wrote: > I've written an AArch64 version of this, but given that the bug is > secret I can't figure out how to test it properly. > > Andrew. Hi Andrew. 
Unfortunately, the test and the code that is calling JavaThread::pd_get_top_frame() are all closed. Does aarch64 have any tests that exercise JavaThread::pd_get_top_frame()? To test, you need to suspend a thread, check if the thread state is _thread_in_Java, get the context, call pd_get_top_frame(), then try to walk the stack. Or to get a deterministic way to test, I think you could insert illegal instructions in strategic places where the anchor frame is being set up (set_last_Java_frame() and C2 CallRuntimeDirect) and then call pd_get_top_frame() in the signal handler, try to walk backwards a few frames, patch the illegal instruction with a NOP, and return. dl From harold.seigel at oracle.com Mon Aug 15 19:34:52 2016 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 15 Aug 2016 15:34:52 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting Message-ID: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> Hi, Please review this fix for JDK-8030221. The fix makes sure that, if the specified host class for a VM anonymous class is another anonymous class, the actual host class is the specified host class's host class. For example, if named class N is the host class for anonymous class A1, and A1 is the specified host class for anonymous class A2, then the actual host class for A2 will be named class N. JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ The fix was tested with the JCK Lang and VM tests, the hotspot, java/lang, java/util and other JTreg tests, the NSK quick tests, and with the RBT runtime nightly tests.
Thanks, Harold From rachel.protacio at oracle.com Mon Aug 15 20:48:25 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Mon, 15 Aug 2016 16:48:25 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks Message-ID: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> Hello, Please review this change, which makes sure class file load hooks are not called for VM anonymous classes. See http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html for justification. Passes JPRT and RBT. Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ Thank you! Rachel From david.holmes at oracle.com Tue Aug 16 03:26:01 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Aug 2016 13:26:01 +1000 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> Message-ID: Hi Harold, On 16/08/2016 5:34 AM, harold seigel wrote: > Hi, > > Please review this fix for JDK-8030221. The fix makes sure that, if the > specified host class for a VM anonymous is another anonymous class, that > the actual host class is the specified host class's host class. For > example, if named class N is the host class for anonymous class A1, and > A1 is the specified host class for anonymous class A2, then the actual > host class for A2 will be named class N. > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ So basically instead of storing an anonymous class as host and later walking through to find the first non-anonymous "host" class, you now store the first non-anonymous class as the host and so don't need to walk later. Ok. 
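The resolve-at-creation idea summarized above can be sketched in a few lines. This is only a toy model under stated assumptions: ToyKlass, resolve_host and make_anonymous are hypothetical names invented for illustration and are not HotSpot's InstanceKlass API or the code in the webrev.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a class descriptor (hypothetical, NOT HotSpot's InstanceKlass).
struct ToyKlass {
  const char*     name;
  bool            is_anonymous;
  const ToyKlass* host;  // NULL for named classes
};

// Follow host links from the requested host until a named
// (non-anonymous) class is reached, or until the chain runs out.
inline const ToyKlass* resolve_host(const ToyKlass* requested) {
  const ToyKlass* k = requested;
  while (k != NULL && k->is_anonymous) {
    k = k->host;
  }
  return k;
}

// Create an anonymous class whose stored host is already the named class,
// so later lookups never need to walk a chain of anonymous hosts.
inline ToyKlass make_anonymous(const char* name, const ToyKlass* requested_host) {
  ToyKlass k = { name, true, resolve_host(requested_host) };
  // Same shape of invariant as the assertion discussed in this thread:
  // a stored host is never itself anonymous.
  assert(k.host == NULL || !k.host->is_anonymous);
  return k;
}
```

With named class N hosting A1, and A1 requested as the host for A2, both A1 and A2 end up with N as their stored host, which matches the example in the review request.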
src/share/vm/runtime/reflection.cpp I find this assertion: assert(!host_class->is_instance_klass() || !InstanceKlass::cast(host_class)->is_anonymous(), "host_class should not be anonymous"); somewhat hard to understand compared to the more direct: assert( !(host_class->is_instance_klass() && InstanceKlass::cast(host_class)->is_anonymous()), "host_class should not be anonymous"); Thanks, David > The fix was tested with the JCK Lang and VM tests, the hotpot, and > java/lang, java/util and other JTreg tests, the NSK quick tests, and > with the RBT runtime nightly tests. > > Thanks, Harold From david.holmes at oracle.com Tue Aug 16 03:36:58 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Aug 2016 13:36:58 +1000 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> Message-ID: <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Hi Rachel, On 16/08/2016 6:48 AM, Rachel Protacio wrote: > Hello, > > Please review this change, which makes sure class file load hooks are > not called for VM anonymous classes. See > http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html > for justification. > > Passes JPRT and RBT. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 > Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ This: 112 // VM Anonymous classes - defined via unsafe.DefineAnonymousClass - should not 113 // call back to a CFLH 114 if (host_klass == NULL) { 115 stream = prologue(stream, suggests that "prologue" can only do CFLH related things. If that is true then it would be much clearer in my opinion if prologue were renamed to something more explicit - like check_class_file_load_hook ? Otherwise, the host_klass should be passed in to prologue and the anonymous class check internalized there. 
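That second option, passing host_klass in and doing the check inside, could be sketched roughly as follows. This is a toy model, not the HotSpot code: ToyStream, ToyKlass and run_load_hook are hypothetical stand-ins, the function name is borrowed from the rename suggested above, and the only thing carried over from the webrev is the assumption that a non-NULL host_klass marks a VM anonymous class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the class file stream and the host class.
struct ToyStream { std::string bytes; };
struct ToyKlass  { const char* name; };

// Hypothetical hook: a real agent could replace the class bytes here.
inline ToyStream run_load_hook(const ToyStream& in) {
  ToyStream out = in;
  out.bytes += "+hooked";
  return out;
}

// The anonymous-class check lives inside the function, so callers
// do not need their own "if (host_klass == NULL)" guard.
inline ToyStream check_class_file_load_hook(const ToyStream& stream,
                                            const ToyKlass* host_klass) {
  if (host_klass != NULL) {
    return stream;  // VM anonymous class: skip class file load hook processing
  }
  return run_load_hook(stream);
}
```

Keeping the guard inside the function means a future caller cannot forget it, at the cost of one extra parameter.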
Also I don't think you need to explain where VM anonymous classes come from, it suffices to simply say "Skip class file load hook processing for VM anonymous classes"; or if the prologue is renamed then simply "Skip this processing for VM anonymous classes". :) Thanks, David > Thank you! > Rachel From lois.foltan at oracle.com Tue Aug 16 11:47:07 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 16 Aug 2016 07:47:07 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> Message-ID: <57B2FD3B.7090305@oracle.com> Hi Rachel, I think this looks good. You might consider adding a nested anonymous class test for this. I believe Harold has developed one for JDK-8030221. I think it would be good to make sure that the host_klass parameter is indeed NULL for a nested anonymous class and that the VM does not go forward into the CFLH code in that scenario. Thanks, Lois On 8/15/2016 4:48 PM, Rachel Protacio wrote: > Hello, > > Please review this change, which makes sure class file load hooks are > not called for VM anonymous classes. See > http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html > for justification. > > Passes JPRT and RBT. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 > Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ > > Thank you! > Rachel From lois.foltan at oracle.com Tue Aug 16 11:51:18 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 16 Aug 2016 07:51:18 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> Message-ID: <57B2FE36.2030106@oracle.com> +1. Thanks for the test! Nit, wrong copyright in the test. 
Lois On 8/15/2016 11:26 PM, David Holmes wrote: > Hi Harold, > > On 16/08/2016 5:34 AM, harold seigel wrote: >> Hi, >> >> Please review this fix for JDK-8030221. The fix makes sure that, if the >> specified host class for a VM anonymous is another anonymous class, that >> the actual host class is the specified host class's host class. For >> example, if named class N is the host class for anonymous class A1, and >> A1 is the specified host class for anonymous class A2, then the actual >> host class for A2 will be named class N. >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ > > So basically instead of storing an anonymous class as host and later > walking through to find the first non-anonymous "host" class, you now > store the first non-anonymous class as the host and so don't need to > walk later. Ok. > > src/share/vm/runtime/reflection.cpp > > I find this assertion: > > assert(!host_class->is_instance_klass() || > !InstanceKlass::cast(host_class)->is_anonymous(), > "host_class should not be anonymous"); > > somewhat hard to understand compared to the more direct: > > assert( !(host_class->is_instance_klass() && > InstanceKlass::cast(host_class)->is_anonymous()), > "host_class should not be anonymous"); > > Thanks, > David > >> The fix was tested with the JCK Lang and VM tests, the hotpot, and >> java/lang, java/util and other JTreg tests, the NSK quick tests, and >> with the RBT runtime nightly tests. >> >> Thanks, Harold From lois.foltan at oracle.com Tue Aug 16 12:02:54 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 16 Aug 2016 08:02:54 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: <57B300EE.3060607@oracle.com> Looks good Karen. 
Thank you for the mixed class file version test! Lois On 8/12/2016 3:28 PM, Karen Kinnear wrote: > Added a targeted test case for class files with different class file versions in the > inheritance hierarchy. > > http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/ > > thanks, > Karen > >> On Aug 12, 2016, at 1:46 PM, Coleen Phillimore wrote: >> >> >> >> On 8/12/16 1:33 PM, Karen Kinnear wrote: >>> Coleen, >>> >>> Good catch - I will make that change. >>> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >>> and preparing for potential other uses. >>> >>> >>> Here is the updated lines: >>> KlassHandle vtklass_h = vt->klass(); >>> Klass* vtklass = vtklass_h(); >>> if (vtklass->is_instance_klass() && >>> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >>> assert(method() != NULL, "must have set method"); >>> } >>> >> This looks good. >> Thanks, >> Coleen >> >>> Thanks! >>> Karen >>> >>>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore > wrote: >>>> >>>> >>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>>> >>>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>>> + KlassHandle vtklass_h = vt->klass(); >>>> + Klass* vtklass = vtklass_h(); >>>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>>> assert(method() != NULL, "must have set method"); >>>> + } >>>> >>>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>>> >>>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>>> >>>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. 
>>>> >>>> Coleen >>>> >>>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>>> Please review: >>>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>>> >>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>>> >>>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>>> for later classfile versions. >>>>> >>>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice. >>>>> >>>>> Testing: linux-x64-slowdebug >>>>> rbt hs-nightly-runtime.js >>>>> jck vm,lang, api.java.lang >>>>> small invocation tests >>>>> >>>>> thanks, >>>>> Karen From harold.seigel at oracle.com Tue Aug 16 13:24:40 2016 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 16 Aug 2016 09:24:40 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> Message-ID: <1851c71e-40c7-e2d9-159a-097cba02eee6@oracle.com> Thanks David, for the review. I'll change the assert before pushing the change. Harold On 8/15/2016 11:26 PM, David Holmes wrote: > Hi Harold, > > On 16/08/2016 5:34 AM, harold seigel wrote: >> Hi, >> >> Please review this fix for JDK-8030221. The fix makes sure that, if the >> specified host class for a VM anonymous is another anonymous class, that >> the actual host class is the specified host class's host class. For >> example, if named class N is the host class for anonymous class A1, and >> A1 is the specified host class for anonymous class A2, then the actual >> host class for A2 will be named class N.
>> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ > > So basically instead of storing an anonymous class as host and later > walking through to find the first non-anonymous "host" class, you now > store the first non-anonymous class as the host and so don't need to > walk later. Ok. > > src/share/vm/runtime/reflection.cpp > > I find this assertion: > > assert(!host_class->is_instance_klass() || > !InstanceKlass::cast(host_class)->is_anonymous(), > "host_class should not be anonymous"); > > somewhat hard to understand compared to the more direct: > > assert( !(host_class->is_instance_klass() && > InstanceKlass::cast(host_class)->is_anonymous()), > "host_class should not be anonymous"); > > Thanks, > David > >> The fix was tested with the JCK Lang and VM tests, the hotpot, and >> java/lang, java/util and other JTreg tests, the NSK quick tests, and >> with the RBT runtime nightly tests. >> >> Thanks, Harold From harold.seigel at oracle.com Tue Aug 16 13:25:28 2016 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 16 Aug 2016 09:25:28 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: <57B2FE36.2030106@oracle.com> References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> <57B2FE36.2030106@oracle.com> Message-ID: <454a2af8-9451-7f2f-cbf4-d9472cb5e2fa@oracle.com> Thanks for the review Lois. I'll fix the copyright before pushing the change. Harold On 8/16/2016 7:51 AM, Lois Foltan wrote: > +1. Thanks for the test! Nit, wrong copyright in the test. > Lois > > On 8/15/2016 11:26 PM, David Holmes wrote: >> Hi Harold, >> >> On 16/08/2016 5:34 AM, harold seigel wrote: >>> Hi, >>> >>> Please review this fix for JDK-8030221. The fix makes sure that, if >>> the >>> specified host class for a VM anonymous is another anonymous class, >>> that >>> the actual host class is the specified host class's host class. 
For >>> example, if named class N is the host class for anonymous class A1, and >>> A1 is the specified host class for anonymous class A2, then the actual >>> host class for A2 will be named class N. >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 >>> >>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ >> >> So basically instead of storing an anonymous class as host and later >> walking through to find the first non-anonymous "host" class, you now >> store the first non-anonymous class as the host and so don't need to >> walk later. Ok. >> >> src/share/vm/runtime/reflection.cpp >> >> I find this assertion: >> >> assert(!host_class->is_instance_klass() || >> !InstanceKlass::cast(host_class)->is_anonymous(), >> "host_class should not be anonymous"); >> >> somewhat hard to understand compared to the more direct: >> >> assert( !(host_class->is_instance_klass() && >> InstanceKlass::cast(host_class)->is_anonymous()), >> "host_class should not be anonymous"); >> >> Thanks, >> David >> >>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>> java/lang, java/util and other JTreg tests, the NSK quick tests, and >>> with the RBT runtime nightly tests. >>> >>> Thanks, Harold > From coleen.phillimore at oracle.com Tue Aug 16 14:41:27 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 10:41:27 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/test/runtime/TransitiveOverrideCFV50/TransitiveOverrideCFV50.java.html One tiny thing in case you haven't pushed it yet. The copyright dates should just say 2016, since it's a new test. 
Also, I'm not sure if you need @compile directive since I think jtreg adds -Dignore.symbol.file for you. Thanks, Coleen On 8/12/16 3:28 PM, Karen Kinnear wrote: > Added a targeted test case for class files with different class file > versions in the > inheritance hierarchy. > > http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/ > > > thanks, > Karen > >> On Aug 12, 2016, at 1:46 PM, Coleen Phillimore >> > >> wrote: >> >> >> >> On 8/12/16 1:33 PM, Karen Kinnear wrote: >>> Coleen, >>> >>> Good catch - I will make that change. >>> Today this code is not called for arrays, but I totally appreciate >>> you looking at the bigger picture >>> and preparing for potential other uses. >>> >>> >>> Here is the updated lines: >>> KlassHandle vtklass_h = vt->klass(); >>> Klass* vtklass = vtklass_h(); >>> if (vtklass->is_instance_klass() && >>> (InstanceKlass::cast(vtklass)->major_version() >= >>> klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >>> assert(method() != NULL, "must have set method"); >>> } >>> >> This looks good. >> Thanks, >> Coleen >> >>> Thanks! >>> Karen >>> >>>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore >>>> >>> > wrote: >>>> >>>> >>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>>> >>>> >>>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>>> + KlassHandle vtklass_h = vt->klass(); >>>> + Klass* vtklass = vtklass_h(); >>>> + if (InstanceKlass::cast(vtklass)->major_version() >= >>>> klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>>> assert(method() != NULL, "must have set method"); >>>> + } >>>> >>>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I >>>> think you have to do: >>>> >>>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>>> >>>> InstanceKlass::cast makes this assertion. Otherwise, the code >>>> looks good. 
>>>> >>>> Coleen >>>> >>>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>>> Please review: >>>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>>> >>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>>> >>>>> Bug: For classfiles before class file version 51, JVMS did not >>>>> support transitive over-ride behavior. >>>>> Implementation needed to check this in three places, not just one. >>>>> Vtable size calculation is only exact >>>>> for later classfile versions. >>>>> >>>>> Also fixed vtable logging output - since the method name-and-sig >>>>> printing was changed to also print >>>>> the holder's class name, we do not need to print the holder's >>>>> class name separately - it was printing twice. >>>>> >>>>> Testing: linux-x64-slowdebug >>>>> rbt hs-nightly-runtime.js >>>>> jck vm,lang, api.java.lang >>>>> small invocation tests >>>>> >>>>> thanks, >>>>> Karen >>>> >>> >> > From coleen.phillimore at oracle.com Tue Aug 16 14:48:38 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 10:48:38 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> Message-ID: <0fc7732e-9298-9e4a-305f-1e1d4d8cf079@oracle.com> Can host class actually be an array class? It seems like more of an error if the host class is a primitive, an array, or another anonymous class. Then you can pass it as InstanceKlass as host_class everywhere and not have these casts. Coleen On 8/15/16 3:34 PM, harold seigel wrote: > Hi, > > Please review this fix for JDK-8030221. The fix makes sure that, if > the specified host class for a VM anonymous is another anonymous > class, that the actual host class is the specified host class's host > class.
For example, if named class N is the host class for anonymous > class A1, and A1 is the specified host class for anonymous class A2, > then the actual host class for A2 will be named class N. > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ > > The fix was tested with the JCK Lang and VM tests, the hotpot, and > java/lang, java/util and other JTreg tests, the NSK quick tests, and > with the RBT runtime nightly tests. > > Thanks, Harold From coleen.phillimore at oracle.com Tue Aug 16 14:52:02 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 10:52:02 -0400 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> Message-ID: I think this looks good. Coleen On 8/15/16 1:57 PM, dean.long at oracle.com wrote: > Thanks Fred. > > Still waiting for a Reviewer. > > dl > > > On 8/15/16 6:26 AM, Frederic Parain wrote: >> Thank you, >> >> Looks good to me. >> >> Fred >> >> On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: >>> Sure: >>> >>> http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ >>> >>> dl >>> >>> >>> On 8/12/16 6:46 AM, Frederic Parain wrote: >>>> Dean, >>>> >>>> In file macroAssembler_x86.cpp, could it be possible to >>>> get rid of the clear_pc argument? It seems completely >>>> useless now. >>>> >>>> Fred >>>> >>>> >>>> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>>>> Ping. >>>>> >>>>> dl >>>>> >>>>> >>>>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>>>> >>>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>>>> >>>>>> Sorry, this issue is Confidential. 
The problem is similar to >>>>>> 8029441, >>>>>> where we suspend a thread and use >>>>>> pd_get_top_frame_for_profiling() to >>>>>> get the top frame for stack walking. The problem is "last Java >>>>>> frame" >>>>>> anchor frames on x86. In lots of places we do not store >>>>>> last_Java_pc. >>>>>> This is OK in the synchronous stack walk case done by the current >>>>>> thread. But in the asynchronous case, there are small windows where >>>>>> it's not always safe to get PC from sp[-1]. >>>>>> >>>>>> The solution is not to treat x86 anchor frames as "always walkable". >>>>>> Instead, we follow the example of sparc and make them walking by >>>>>> filling in last_Java_pc when it's safe. >>>>>> >>>>>> I went for the minimal fix, resetting clear_pc to true in >>>>>> reset_last_Java_frame() but not changing the API and all the >>>>>> callers. >>>>>> I can fix this if reviewers feel strongly about it. >>>>>> >>>>>> dl >>>>>> >>>>> >>> > From aph at redhat.com Tue Aug 16 15:43:20 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 16 Aug 2016 16:43:20 +0100 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> Message-ID: <57B33498.3040107@redhat.com> On 15/08/16 19:41, dean.long at oracle.com wrote: > Hi Andrew. Unfortunately, the test and the code that is calling > JavaThread::pd_get_top_frame() are all closed. Does aarch64 have an > tests that exercise JavaThread::pd_get_top_frame()? To test, you need > to suspend a thread, check if the thread state is _thread_in_Java, get > the context, call pd_get_top_frame(), then try to walk the stack. Okay. 
The patch looks reasonable and works well: it's cleaner and easier to understand than what we had before. That's enough for me. Thanks, Andrew. From aph at redhat.com Tue Aug 16 15:54:12 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 16 Aug 2016 16:54:12 +0100 Subject: RFR: AArch64: follow-up the fix for 8161598 In-Reply-To: <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> Message-ID: <57B33724.60500@redhat.com> All pretty straightforward: http://cr.openjdk.java.net/~aph/8164113/ OK? Andrew. From harold.seigel at oracle.com Tue Aug 16 15:54:43 2016 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 16 Aug 2016 11:54:43 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: <0fc7732e-9298-9e4a-305f-1e1d4d8cf079@oracle.com> References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> <0fc7732e-9298-9e4a-305f-1e1d4d8cf079@oracle.com> Message-ID: <5945f635-cbd6-692d-d323-c1656e8cf06a@oracle.com> Hi Coleen, Thanks for the comments. In a different fix, for bug JDK-8058575 , I plan to require the host class to be an instance class and change the type of _host_klass to instanceKlass. Thanks, Harold On 8/16/2016 10:48 AM, Coleen Phillimore wrote: > > Can host class actually be an array class? > > It seems like more error if the host class is a primitive, array, or > if it is another anonymous class. > > Then you can pass it as InstanceKlass as host_class everwhere and not > have these casts. > > Coleen > > On 8/15/16 3:34 PM, harold seigel wrote: >> Hi, >> >> Please review this fix for JDK-8030221. The fix makes sure that, if >> the specified host class for a VM anonymous is another anonymous >> class, that the actual host class is the specified host class's host >> class. 
For example, if named class N is the host class for anonymous >> class A1, and A1 is the specified host class for anonymous class A2, >> then the actual host class for A2 will be named class N. >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ >> >> The fix was tested with the JCK Lang and VM tests, the hotpot, and >> java/lang, java/util and other JTreg tests, the NSK quick tests, and >> with the RBT runtime nightly tests. >> >> Thanks, Harold > From dean.long at oracle.com Tue Aug 16 16:00:57 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 16 Aug 2016 09:00:57 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> Message-ID: <2b8eb97e-3e8d-31ac-7507-af7489855f43@oracle.com> Thanks Coleen. dl On 8/16/16 7:52 AM, Coleen Phillimore wrote: > > I think this looks good. > > Coleen > > > On 8/15/16 1:57 PM, dean.long at oracle.com wrote: >> Thanks Fred. >> >> Still waiting for a Reviewer. >> >> dl >> >> >> On 8/15/16 6:26 AM, Frederic Parain wrote: >>> Thank you, >>> >>> Looks good to me. >>> >>> Fred >>> >>> On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: >>>> Sure: >>>> >>>> http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ >>>> >>>> dl >>>> >>>> >>>> On 8/12/16 6:46 AM, Frederic Parain wrote: >>>>> Dean, >>>>> >>>>> In file macroAssembler_x86.cpp, could it be possible to >>>>> get rid of the clear_pc argument? It seems completely >>>>> useless now. >>>>> >>>>> Fred >>>>> >>>>> >>>>> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>>>>> Ping. 
>>>>>> >>>>>> dl >>>>>> >>>>>> >>>>>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>>>>> >>>>>>> Sorry, this issue is Confidential. The problem is similar to >>>>>>> 8029441, >>>>>>> where we suspend a thread and use >>>>>>> pd_get_top_frame_for_profiling() to >>>>>>> get the top frame for stack walking. The problem is "last Java >>>>>>> frame" >>>>>>> anchor frames on x86. In lots of places we do not store >>>>>>> last_Java_pc. >>>>>>> This is OK in the synchronous stack walk case done by the current >>>>>>> thread. But in the asynchronous case, there are small windows >>>>>>> where >>>>>>> it's not always safe to get PC from sp[-1]. >>>>>>> >>>>>>> The solution is not to treat x86 anchor frames as "always >>>>>>> walkable". >>>>>>> Instead, we follow the example of sparc and make them walking by >>>>>>> filling in last_Java_pc when it's safe. >>>>>>> >>>>>>> I went for the minimal fix, resetting clear_pc to true in >>>>>>> reset_last_Java_frame() but not changing the API and all the >>>>>>> callers. >>>>>>> I can fix this if reviewers feel strongly about it. >>>>>>> >>>>>>> dl >>>>>>> >>>>>> >>>> >> > From dean.long at oracle.com Tue Aug 16 16:02:26 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 16 Aug 2016 09:02:26 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <57B33498.3040107@redhat.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> <57B33498.3040107@redhat.com> Message-ID: <96e613a7-6cd8-9a6a-0458-7c45cc6ea7c4@oracle.com> On 8/16/16 8:43 AM, Andrew Haley wrote: > On 15/08/16 19:41, dean.long at oracle.com wrote: >> Hi Andrew. 
Unfortunately, the test and the code that is calling >> JavaThread::pd_get_top_frame() are all closed. Does aarch64 have an >> tests that exercise JavaThread::pd_get_top_frame()? To test, you need >> to suspend a thread, check if the thread state is _thread_in_Java, get >> the context, call pd_get_top_frame(), then try to walk the stack. > Okay. The patch looks reasonable and works well: it's cleaner and > easier to understand than what we had before. That's enough for me. > > Thanks, > > Andrew. > Thanks Andrew. dl From dean.long at oracle.com Tue Aug 16 16:16:18 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 16 Aug 2016 09:16:18 -0700 Subject: RFR: AArch64: follow-up the fix for 8161598 In-Reply-To: <57B33724.60500@redhat.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> <57B33724.60500@redhat.com> Message-ID: <3bacebce-1fc8-d3e5-41a4-86565e4ee44b@oracle.com> Looks good. Do you want to update the "Always walkable" comment in javaFrameAnchor_aarch64.hpp? dl On 8/16/16 8:54 AM, Andrew Haley wrote: > All pretty straightforward: > > http://cr.openjdk.java.net/~aph/8164113/ > > OK? > > Andrew. > From aph at redhat.com Tue Aug 16 17:03:19 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 16 Aug 2016 18:03:19 +0100 Subject: RFR: AArch64: follow-up the fix for 8161598 In-Reply-To: <3bacebce-1fc8-d3e5-41a4-86565e4ee44b@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> <57B33724.60500@redhat.com> <3bacebce-1fc8-d3e5-41a4-86565e4ee44b@oracle.com> Message-ID: <57B34757.1060806@redhat.com> On 16/08/16 17:16, dean.long at oracle.com wrote: > Looks good. Do you want to update the "Always walkable" comment in > javaFrameAnchor_aarch64.hpp? Duh. 
:-) OK, done. Thanks, Andrew. From coleen.phillimore at oracle.com Tue Aug 16 18:52:19 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 14:52:19 -0400 Subject: RFR: AArch64: follow-up the fix for 8161598 In-Reply-To: <57B34757.1060806@redhat.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <57B2039B.9050307@redhat.com> <92074f8f-3a3c-b8af-2f3f-f5f47bdc6a18@oracle.com> <57B33724.60500@redhat.com> <3bacebce-1fc8-d3e5-41a4-86565e4ee44b@oracle.com> <57B34757.1060806@redhat.com> Message-ID: <75bf40a6-3694-d3d9-5cae-ad63cb7c989a@oracle.com> Looks good to me also. Coleen On 8/16/16 1:03 PM, Andrew Haley wrote: > On 16/08/16 17:16, dean.long at oracle.com wrote: >> Looks good. Do you want to update the "Always walkable" comment in >> javaFrameAnchor_aarch64.hpp? > Duh. :-) > > OK, done. > > Thanks, > > Andrew. > > From coleen.phillimore at oracle.com Tue Aug 16 18:53:03 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 14:53:03 -0400 Subject: RFR(S) 8030221: Checking for anonymous class should check for NULL as well as potential nesting In-Reply-To: <5945f635-cbd6-692d-d323-c1656e8cf06a@oracle.com> References: <312c9a5f-90f5-c5d1-f1b4-7f65f98016bd@oracle.com> <0fc7732e-9298-9e4a-305f-1e1d4d8cf079@oracle.com> <5945f635-cbd6-692d-d323-c1656e8cf06a@oracle.com> Message-ID: <38b8ea87-6f99-ca24-262d-5e4a1e688e68@oracle.com> On 8/16/16 11:54 AM, harold seigel wrote: > Hi Coleen, > > Thanks for the comments. In a different fix, for bug JDK-8058575 > , I plan to require > the host class to be an instance class and change the type of > _host_klass to instanceKlass. Okay, Thanks! Coleen > > Thanks, Harold > > > On 8/16/2016 10:48 AM, Coleen Phillimore wrote: >> >> Can host class actually be an array class? >> >> It seems like more error if the host class is a primitive, array, or >> if it is another anonymous class. 
>> >> Then you can pass it as InstanceKlass as host_class everwhere and not >> have these casts. >> >> Coleen >> >> On 8/15/16 3:34 PM, harold seigel wrote: >>> Hi, >>> >>> Please review this fix for JDK-8030221. The fix makes sure that, if >>> the specified host class for a VM anonymous is another anonymous >>> class, that the actual host class is the specified host class's host >>> class. For example, if named class N is the host class for >>> anonymous class A1, and A1 is the specified host class for anonymous >>> class A2, then the actual host class for A2 will be named class N. >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8030221 >>> >>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8030221/ >>> >>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>> java/lang, java/util and other JTreg tests, the NSK quick tests, and >>> with the RBT runtime nightly tests. >>> >>> Thanks, Harold >> > From rachel.protacio at oracle.com Tue Aug 16 19:08:03 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Tue, 16 Aug 2016 15:08:03 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <57B2FD3B.7090305@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <57B2FD3B.7090305@oracle.com> Message-ID: <467482c1-09f4-a6d7-1196-f06cbd4ef879@oracle.com> Thanks for the review, Lois! As discussed offline, this change won't need an added test. Rachel On 8/16/2016 7:47 AM, Lois Foltan wrote: > Hi Rachel, > > I think this looks good. > > You might consider adding a nested anonymous class test for this. I > believe Harold has developed one for JDK-8030221. I think it would be > good to make sure that the host_klass parameter is indeed NULL for a > nested anonymous class and that the VM does not go forward into the > CFLH code in that scenario. 
> > Thanks, > Lois > > On 8/15/2016 4:48 PM, Rachel Protacio wrote: >> Hello, >> >> Please review this change, which makes sure class file load hooks are >> not called for VM anonymous classes. See >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >> for justification. >> >> Passes JPRT and RBT. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >> >> Thank you! >> Rachel > From rachel.protacio at oracle.com Tue Aug 16 20:21:25 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Tue, 16 Aug 2016 16:21:25 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent Message-ID: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> Hi, Bug summary: fuzzing a class file so that the class name "SomeClass" is instead "LSomeClass;" passed unnoticed through the VM because it was not format checked by default and the L; were stripped off before lookup. This fix makes sure that all class names loaded by the app class loader are format checked by default. The Verifier::relax_verify_for() function that was previously used for both format checking (setting _relax_verify) and reflection (as an access check) has been renamed to relax_access_for() specifically for its use in reflection.cpp. A relax_format_check_for() function has been added to classFileParser.cpp to address the format checking, only "relaxing" the check if loaded by the boot loader or platform class loader. This fix adds a jtreg test, and the change passes JCK vm tests and WLS tests, in addition to JPRT and RBT hotspot_all and non-colo tests. A compatibility request has been approved for this change. Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ Thanks! 
Rachel From leelamohan.venati at gmail.com Tue Aug 16 20:57:52 2016 From: leelamohan.venati at gmail.com (Leela Mohan) Date: Tue, 16 Aug 2016 13:57:52 -0700 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: <716a0ec1-b6d9-0c18-4a55-ffb889dcc633@oracle.com> References: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> <716a0ec1-b6d9-0c18-4a55-ffb889dcc633@oracle.com> Message-ID: Your thoughts on this will be helpful. Thanks, Leela [ Removed "serviceability-dev at openjdk.net" mailing alias added "hotspot-dev" since i am getting failed mail delivery notification ] On Fri, Aug 12, 2016 at 4:33 PM, Daniel D. Daugherty < daniel.daugherty at oracle.com> wrote: > On 8/12/16 5:27 PM, Leela Mohan wrote: > > Hi Daniel, > > Actually, I was thinking about the case where compiler choose not to have > complete "de-opt" state. For ex: Local pruning. I can also think of other > cases which need callback events like, posting exceptions to the agent. > > JVMTI requests for examining/changing the stack frame would conservatively > de-optimize the compile methods but not all de-optimizable locations can > restore the java state user expect. > > What are the expectations for VM for these cases ? > > > I think we'll have to wait for someone more current in how the compilers > interact with JVM/TI to chime in here. I stopped actively working on > JVM/TI back in 2010 or so... :-) > > Dan > > > > Thanks, > Leela > > On Fri, Aug 12, 2016 at 4:04 PM, Daniel D. Daugherty < > daniel.daugherty at oracle.com> wrote: > >> On 8/12/16 4:21 PM, Leela Mohan wrote: >> >>> Hi experts, >>> >>> It looks like, we don't disallow setting capabilities when VM is in >>> "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, >>> ciEnv caches the JVMTI state and expects those assumptions to be true >>> during the compilation. Otherwise, dump the compiled method. 
>>> >>> However, we don't seem to do anything with the methods which were >>> compiled >>> before setting the capability. >>> >>> What is the understanding? >>> >>> Thanks, >>> Leela >>> >> >> Hi Leela, >> >> I'm guessing that you are talking about this capability: >> >> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >> ml#jvmtiCapabilities.can_generate_compiled_method_load_events >> >> and this event: >> >> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >> ml#CompiledMethodLoad >> >> >> The can_generate_compiled_method_load_events capability needs to be added >> in order to generate CompiledMethodLoad events. Capabilities are added via >> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >> ml#AddCapabilities >> which can be called from different JVM/TI phases. Different VM >> implementations >> can require that certain capabilities can only be added in certain JVM/TI >> phases. >> However, if AddCapabilities() does not return a JVM/TI error when a >> capability >> is added in a phase, e.g., the live phase, then you can safely assume that >> the capability has been added. >> >> >> In your example, it sounds like the capability is added in the live phase >> because you are seeing events generated for newly compiled methods. In >> order >> to see synthetic events for methods that were compiled before you added >> the >> capability, your agent needs to use a different function: >> >> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >> ml#GenerateEvents >> >> The documentation for Compiled Method Load has this line: >> >> > These events can be sent after their initial occurrence with >> GenerateEvents. >> >> and that sounds just like your situation. >> >> Hope this helps. 
>> >> Dan >> >> > > From coleen.phillimore at oracle.com Wed Aug 17 02:00:51 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 22:00:51 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <57B2FD3B.7090305@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <57B2FD3B.7090305@oracle.com> Message-ID: <0405d404-2f54-fac7-3e01-59588ec0824f@oracle.com> On 8/16/16 7:47 AM, Lois Foltan wrote: > Hi Rachel, > > I think this looks good. > > You might consider adding a nested anonymous class test for this. I > believe Harold has developed one for JDK-8030221. I think it would be > good to make sure that the host_klass parameter is indeed NULL for a > nested anonymous class and that the VM does not go forward into the > CFLH code in that scenario. Shouldn't the host_class be non NULL for a nested anonymous class too? thanks, Coleen > > Thanks, > Lois > > On 8/15/2016 4:48 PM, Rachel Protacio wrote: >> Hello, >> >> Please review this change, which makes sure class file load hooks are >> not called for VM anonymous classes. See >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >> for justification. >> >> Passes JPRT and RBT. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >> >> Thank you! 
>> Rachel > From coleen.phillimore at oracle.com Wed Aug 17 02:04:03 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 16 Aug 2016 22:04:03 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Message-ID: On 8/15/16 11:36 PM, David Holmes wrote: > Hi Rachel, > > On 16/08/2016 6:48 AM, Rachel Protacio wrote: >> Hello, >> >> Please review this change, which makes sure class file load hooks are >> not called for VM anonymous classes. See >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >> >> for justification. >> >> Passes JPRT and RBT. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ > > This: > > 112 // VM Anonymous classes - defined via > unsafe.DefineAnonymousClass - should not > 113 // call back to a CFLH > 114 if (host_klass == NULL) { > 115 stream = prologue(stream, > > suggests that "prologue" can only do CFLH related things. If that is > true then it would be much clearer in my opinion if prologue were > renamed to something more explicit - like check_class_file_load_hook ? > Otherwise, the host_klass should be passed in to prologue and the > anonymous class check internalized there. I agree with David here. This was sort of bothering me about your change when we talked about it before. If the prologue did more than call the CFLH then you'd have to pass host_klass down, since we know that's all it does, the name of the function should be changed. David's name looks good to me. 
> > Also I don't think you need to explain where VM anonymous classes come > from, it suffices to simply say "Skip class file load hook processing > for VM anonymous classes"; or if the prologue is renamed then simply > "Skip this processing for VM anonymous classes". :) Thanks, Coleen > > Thanks, > David > >> Thank you! >> Rachel From david.holmes at oracle.com Wed Aug 17 03:01:21 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 Aug 2016 13:01:21 +1000 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: References: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> <716a0ec1-b6d9-0c18-4a55-ffb889dcc633@oracle.com> Message-ID: <352ff2e6-c8f4-cfe3-fa67-4e63387b6f80@oracle.com> On 17/08/2016 6:57 AM, Leela Mohan wrote: > Your thoughts on this will be helpful. > > Thanks, > Leela > > [ Removed "serviceability-dev at openjdk.net" mailing alias added > "hotspot-dev" since i am getting failed mail delivery notification ] Added back - it is openjdk.java.net, not openjdk.net David ----- > On Fri, Aug 12, 2016 at 4:33 PM, Daniel D. Daugherty < > daniel.daugherty at oracle.com> wrote: > >> On 8/12/16 5:27 PM, Leela Mohan wrote: >> >> Hi Daniel, >> >> Actually, I was thinking about the case where compiler choose not to have >> complete "de-opt" state. For ex: Local pruning. I can also think of other >> cases which need callback events like, posting exceptions to the agent. >> >> JVMTI requests for examining/changing the stack frame would conservatively >> de-optimize the compile methods but not all de-optimizable locations can >> restore the java state user expect. >> >> What are the expectations for VM for these cases ? >> >> >> I think we'll have to wait for someone more current in how the compilers >> interact with JVM/TI to chime in here. I stopped actively working on >> JVM/TI back in 2010 or so... :-) >> >> Dan >> >> >> >> Thanks, >> Leela >> >> On Fri, Aug 12, 2016 at 4:04 PM, Daniel D. 
Daugherty < >> daniel.daugherty at oracle.com> wrote: >> >>> On 8/12/16 4:21 PM, Leela Mohan wrote: >>> >>>> Hi experts, >>>> >>>> It looks like, we don't disallow setting capabilities when VM is in >>>> "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, >>>> ciEnv caches the JVMTI state and expects those assumptions to be true >>>> during the compilation. Otherwise, dump the compiled method. >>>> >>>> However, we don't seem to do anything with the methods which were >>>> compiled >>>> before setting the capability. >>>> >>>> What is the understanding? >>>> >>>> Thanks, >>>> Leela >>>> >>> >>> Hi Leela, >>> >>> I'm guessing that you are talking about this capability: >>> >>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>> ml#jvmtiCapabilities.can_generate_compiled_method_load_events >>> >>> and this event: >>> >>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>> ml#CompiledMethodLoad >>> >>> >>> The can_generate_compiled_method_load_events capability needs to be added >>> in order to generate CompiledMethodLoad events. Capabilities are added via >>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>> ml#AddCapabilities >>> which can be called from different JVM/TI phases. Different VM >>> implementations >>> can require that certain capabilities can only be added in certain JVM/TI >>> phases. >>> However, if AddCapabilities() does not return a JVM/TI error when a >>> capability >>> is added in a phase, e.g., the live phase, then you can safely assume that >>> the capability has been added. >>> >>> >>> In your example, it sounds like the capability is added in the live phase >>> because you are seeing events generated for newly compiled methods. 
In >>> order >>> to see synthetic events for methods that were compiled before you added >>> the >>> capability, your agent needs to use a different function: >>> >>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>> ml#GenerateEvents >>> >>> The documentation for Compiled Method Load has this line: >>> >>>> These events can be sent after their initial occurrence with >>> GenerateEvents. >>> >>> and that sounds just like your situation. >>> >>> Hope this helps. >>> >>> Dan >>> >>> >> >> From leelamohan.venati at gmail.com Wed Aug 17 04:13:51 2016 From: leelamohan.venati at gmail.com (Leela Mohan) Date: Tue, 16 Aug 2016 21:13:51 -0700 Subject: Setting JVMTI Capabilities when VM is in "Live Phase" In-Reply-To: <352ff2e6-c8f4-cfe3-fa67-4e63387b6f80@oracle.com> References: <3d486133-a635-638c-fcc2-18230c694b2f@oracle.com> <716a0ec1-b6d9-0c18-4a55-ffb889dcc633@oracle.com> <352ff2e6-c8f4-cfe3-fa67-4e63387b6f80@oracle.com> Message-ID: Thanks for correcting it, David. On Tue, Aug 16, 2016 at 8:01 PM, David Holmes wrote: > On 17/08/2016 6:57 AM, Leela Mohan wrote: > >> Your thoughts on this will be helpful. >> >> Thanks, >> Leela >> >> [ Removed "serviceability-dev at openjdk.net" mailing alias added >> "hotspot-dev" since i am getting failed mail delivery notification ] >> > > Added back - it is openjdk.java.net, not openjdk.net > > David > ----- > > On Fri, Aug 12, 2016 at 4:33 PM, Daniel D. Daugherty < >> daniel.daugherty at oracle.com> wrote: >> >> On 8/12/16 5:27 PM, Leela Mohan wrote: >>> >>> Hi Daniel, >>> >>> Actually, I was thinking about the case where compiler choose not to >>> have >>> complete "de-opt" state. For ex: Local pruning. I can also think of other >>> cases which need callback events like, posting exceptions to the agent. >>> >>> JVMTI requests for examining/changing the stack frame would >>> conservatively >>> de-optimize the compile methods but not all de-optimizable locations can >>> restore the java state user expect. 
>>> >>> What are the expectations for VM for these cases ? >>> >>> >>> I think we'll have to wait for someone more current in how the compilers >>> interact with JVM/TI to chime in here. I stopped actively working on >>> JVM/TI back in 2010 or so... :-) >>> >>> Dan >>> >>> >>> >>> Thanks, >>> Leela >>> >>> On Fri, Aug 12, 2016 at 4:04 PM, Daniel D. Daugherty < >>> daniel.daugherty at oracle.com> wrote: >>> >>> On 8/12/16 4:21 PM, Leela Mohan wrote: >>>> >>>> Hi experts, >>>>> >>>>> It looks like, we don't disallow setting capabilities when VM is in >>>>> "JVMTI_PHASE_LIVE". And, I notice, for every new compilation of method, >>>>> ciEnv caches the JVMTI state and expects those assumptions to be true >>>>> during the compilation. Otherwise, dump the compiled method. >>>>> >>>>> However, we don't seem to do anything with the methods which were >>>>> compiled >>>>> before setting the capability. >>>>> >>>>> What is the understanding? >>>>> >>>>> Thanks, >>>>> Leela >>>>> >>>>> >>>> Hi Leela, >>>> >>>> I'm guessing that you are talking about this capability: >>>> >>>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>>> ml#jvmtiCapabilities.can_generate_compiled_method_load_events >>>> >>>> and this event: >>>> >>>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>>> ml#CompiledMethodLoad >>>> >>>> >>>> The can_generate_compiled_method_load_events capability needs to be >>>> added >>>> in order to generate CompiledMethodLoad events. Capabilities are added >>>> via >>>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>>> ml#AddCapabilities >>>> which can be called from different JVM/TI phases. Different VM >>>> implementations >>>> can require that certain capabilities can only be added in certain >>>> JVM/TI >>>> phases. >>>> However, if AddCapabilities() does not return a JVM/TI error when a >>>> capability >>>> is added in a phase, e.g., the live phase, then you can safely assume >>>> that >>>> the capability has been added. 
>>>> >>>> >>>> In your example, it sounds like the capability is added in the live >>>> phase >>>> because you are seeing events generated for newly compiled methods. In >>>> order >>>> to see synthetic events for methods that were compiled before you added >>>> the >>>> capability, your agent needs to use a different function: >>>> >>>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.ht >>>> ml#GenerateEvents >>>> >>>> The documentation for Compiled Method Load has this line: >>>> >>>> These events can be sent after their initial occurrence with >>>>> >>>> GenerateEvents. >>>> >>>> and that sounds just like your situation. >>>> >>>> Hope this helps. >>>> >>>> Dan >>>> >>>> >>>> >>> >>> From david.holmes at oracle.com Wed Aug 17 07:45:55 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 Aug 2016 17:45:55 +1000 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed Message-ID: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ This is a rare assertion failure that has proven to be unreproducible, even by directly trying to exercise the theoretical race conditions mentioned in the bug report. All I can do for now is augment the assert to print out the various values so we can at least see where things are failing, next time it happens.
Example output: # Internal Error (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), pid=21732, tid=21734 # assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed: _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) != 0 Thanks, David From dmitry.samersoff at oracle.com Wed Aug 17 07:51:40 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 17 Aug 2016 10:51:40 +0300 Subject: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools Message-ID: Everybody, Please review the changes: http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From lois.foltan at oracle.com Wed Aug 17 12:35:42 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 17 Aug 2016 08:35:42 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <0405d404-2f54-fac7-3e01-59588ec0824f@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <57B2FD3B.7090305@oracle.com> <0405d404-2f54-fac7-3e01-59588ec0824f@oracle.com> Message-ID: <57B45A1E.3030202@oracle.com> On 8/16/2016 10:00 PM, Coleen Phillimore wrote: > > > On 8/16/16 7:47 AM, Lois Foltan wrote: >> Hi Rachel, >> >> I think this looks good. >> >> You might consider adding a nested anonymous class test for this. I >> believe Harold has developed one for JDK-8030221. I think it would >> be good to make sure that the host_klass parameter is indeed NULL for >> a nested anonymous class and that the VM does not go forward into the >> CFLH code in that scenario. > > Shouldn't the host_class be non NULL for a nested anonymous class too? 
Yes it should be, it would just be nice to have a test that tests this. Lois > > thanks, > Coleen >> >> Thanks, >> Lois >> >> On 8/15/2016 4:48 PM, Rachel Protacio wrote: >>> Hello, >>> >>> Please review this change, which makes sure class file load hooks >>> are not called for VM anonymous classes. See >>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>> for justification. >>> >>> Passes JPRT and RBT. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>> >>> Thank you! >>> Rachel >> > From harold.seigel at oracle.com Wed Aug 17 12:39:10 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 17 Aug 2016 08:39:10 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <57B45A1E.3030202@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <57B2FD3B.7090305@oracle.com> <0405d404-2f54-fac7-3e01-59588ec0824f@oracle.com> <57B45A1E.3030202@oracle.com> Message-ID: >> Yes it should be, it would just be nice to have a test that tests this. A null host class will be detected in unsafe.cpp in Unsafe_DefineAnonymousClass_impl(). No test is needed because this is Unsafe. Harold On 8/17/2016 8:35 AM, Lois Foltan wrote: > > On 8/16/2016 10:00 PM, Coleen Phillimore wrote: >> >> >> On 8/16/16 7:47 AM, Lois Foltan wrote: >>> Hi Rachel, >>> >>> I think this looks good. >>> >>> You might consider adding a nested anonymous class test for this. I >>> believe Harold has developed one for JDK-8030221. I think it would >>> be good to make sure that the host_klass parameter is indeed NULL >>> for a nested anonymous class and that the VM does not go forward >>> into the CFLH code in that scenario. >> >> Shouldn't the host_class be non NULL for a nested anonymous class too? > > Yes it should be, it would just be nice to have a test that tests this. 
> Lois > >> >> thanks, >> Coleen >>> >>> Thanks, >>> Lois >>> >>> On 8/15/2016 4:48 PM, Rachel Protacio wrote: >>>> Hello, >>>> >>>> Please review this change, which makes sure class file load hooks >>>> are not called for VM anonymous classes. See >>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>> for justification. >>>> >>>> Passes JPRT and RBT. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>>> >>>> Thank you! >>>> Rachel >>> >> > From daniel.daugherty at oracle.com Wed Aug 17 12:59:02 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 17 Aug 2016 06:59:02 -0600 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> Message-ID: On 8/17/16 1:45 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 > > webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ src/share/vm/runtime/mutex.cpp If the problem is a racy clearing of one of the fields, then the assert can still fire and the extra info might still show zeroes since they are two different queries of the fields. For this diagnostic to be always accurate you need to save a copy of each field, assert() that all the copies or'ed together are == 0, and have the extra info printed from the copies. Dan > > This is a rare assertion failure that has proven to unreproducible > even by directly trying to exercise the theoretical race conditions > mentioned in the bug report. All I can do for now is augment the > assert to print out the various values so we can at least see where > things are failing, next time it happens. 
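[Editorial sketch, not part of the thread or the webrev.] Dan's point above, that the assert and the diagnostic text should both be evaluated from one saved copy of each field, could look roughly like the following. This is only an illustration under stated assumptions: `MonitorWords`, `Snapshot`, `take_snapshot`, `describe` and `assert_all_clear` are hypothetical names, and `std::atomic` words stand in for the real HotSpot fields, whose types differ.

```cpp
#include <atomic>
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the five words named in the assert above.
// The real fields live in HotSpot's Monitor class and are not std::atomic.
struct MonitorWords {
  std::atomic<std::uintptr_t> owner{0};
  std::atomic<std::uintptr_t> lock_word{0};
  std::atomic<std::uintptr_t> entry_list{0};
  std::atomic<std::uintptr_t> wait_set{0};
  std::atomic<std::uintptr_t> on_deck{0};
};

// One saved copy of each field; both the check and the message use it.
struct Snapshot {
  std::uintptr_t owner, lock_word, entry_list, wait_set, on_deck;
  std::uintptr_t combined() const {
    return owner | lock_word | entry_list | wait_set | on_deck;
  }
};

// Read every field exactly once.
Snapshot take_snapshot(const MonitorWords& m) {
  return Snapshot{m.owner.load(), m.lock_word.load(), m.entry_list.load(),
                  m.wait_set.load(), m.on_deck.load()};
}

// Format the saved values in the style of the example output in the RFR.
std::string describe(const Snapshot& s) {
  char buf[256];
  std::snprintf(buf, sizeof buf,
                "_owner(0x%016" PRIxPTR ")|_LockWord(0x%016" PRIxPTR
                ")|_EntryList(0x%016" PRIxPTR ")|_WaitSet(0x%016" PRIxPTR
                ")|_OnDeck(0x%016" PRIxPTR ") != 0",
                s.owner, s.lock_word, s.entry_list, s.wait_set, s.on_deck);
  return std::string(buf);
}

// The check and the message cannot disagree, because neither re-reads
// the live fields after the snapshot has been taken.
void assert_all_clear(const MonitorWords& m) {
  const Snapshot s = take_snapshot(m);
  if (s.combined() != 0) {
    std::fprintf(stderr, "%s\n", describe(s).c_str());
  }
  assert(s.combined() == 0);
}
```

With this shape, a field that is cleared by a racing thread after the snapshot still shows its non-zero value in the message, which is the property Dan asks for. The snapshot is not atomic across all five words, so it narrows the window rather than removing the race itself.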
> > Example output: > > # Internal Error > (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), > pid=21732, tid=21734 > # > assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) > == 0) failed: > _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) > != 0 > > Thanks, > David From coleen.phillimore at oracle.com Wed Aug 17 14:14:06 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 17 Aug 2016 10:14:06 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> Message-ID: <9cb43343-a1c5-91a2-91b7-e633af4888e5@oracle.com> Hi Rachel, I really like how you separated out relax_access_check_for and relax_format_check_for cases since they're different. This code change looks really good. Coleen On 8/16/16 4:21 PM, Rachel Protacio wrote: > Hi, > > Bug summary: fuzzing a class file so that the class name "SomeClass" > is instead "LSomeClass;" passed unnoticed through the VM because it > was not format checked by default and the L; were stripped off before > lookup. > > This fix makes sure that all class names loaded by the app class > loader are format checked by default. The Verifier::relax_verify_for() > function that was previously used for both format checking (setting > _relax_verify) and reflection (as an access check) has been renamed to > relax_access_for() specifically for its use in reflection.cpp. A > relax_format_check_for() function has been added to > classFileParser.cpp to address the format checking, only "relaxing" > the check if loaded by the boot loader or platform class loader. 
> > This fix adds a jtreg test, and the change passes JCK vm tests and WLS > tests, in addition to JPRT and RBT hotspot_all and non-colo tests. A > compatibility request has been approved for this change. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 > Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ > > Thanks! > Rachel From dmitry.dmitriev at oracle.com Wed Aug 17 14:19:50 2016 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Wed, 17 Aug 2016 17:19:50 +0300 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> Message-ID: Hi Rachel, Can comment only test. FormatCheckingTest.java file: 1) I think that @build instructions are not needed for this test as Christian wrote in review request for JDK-8157957 "ClassNotFoundException: jdk.test.lib.JDKToolFinder"(i.e. "If you run only that test in a clean jtwork folder and it passes, then there's no need for @build.") 2) Test can be run in the same vm, i.e. you can remove "othervm" from run action. Or test should be run in othervm? Thank you, Dmitry On 16.08.2016 23:21, Rachel Protacio wrote: > Hi, > > Bug summary: fuzzing a class file so that the class name "SomeClass" > is instead "LSomeClass;" passed unnoticed through the VM because it > was not format checked by default and the L; were stripped off before > lookup. > > This fix makes sure that all class names loaded by the app class > loader are format checked by default. The Verifier::relax_verify_for() > function that was previously used for both format checking (setting > _relax_verify) and reflection (as an access check) has been renamed to > relax_access_for() specifically for its use in reflection.cpp. 
A > relax_format_check_for() function has been added to > classFileParser.cpp to address the format checking, only "relaxing" > the check if loaded by the boot loader or platform class loader. > > This fix adds a jtreg test, and the change passes JCK vm tests and WLS > tests, in addition to JPRT and RBT hotspot_all and non-colo tests. A > compatibility request has been approved for this change. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 > Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ > > Thanks! > Rachel From rachel.protacio at oracle.com Wed Aug 17 15:56:14 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Wed, 17 Aug 2016 11:56:14 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Message-ID: Hi David and Coleen, Thank you for the reviews. I've updated the change as requested: http://cr.openjdk.java.net/~rprotacio/8163973.01/ Rachel On 8/16/2016 10:04 PM, Coleen Phillimore wrote: > > > On 8/15/16 11:36 PM, David Holmes wrote: >> Hi Rachel, >> >> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>> Hello, >>> >>> Please review this change, which makes sure class file load hooks are >>> not called for VM anonymous classes. See >>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>> >>> for justification. >>> >>> Passes JPRT and RBT. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >> >> This: >> >> 112 // VM Anonymous classes - defined via >> unsafe.DefineAnonymousClass - should not >> 113 // call back to a CFLH >> 114 if (host_klass == NULL) { >> 115 stream = prologue(stream, >> >> suggests that "prologue" can only do CFLH related things. 
If that is >> true then it would be much clearer in my opinion if prologue were >> renamed to something more explicit - like check_class_file_load_hook >> ? Otherwise, the host_klass should be passed in to prologue and the >> anonymous class check internalized there. > > I agree with David here. This was sort of bothering me about your > change when we talked about it before. If the prologue did more than > call the CFLH then you'd have to pass host_klass down, since we know > that's all it does, the name of the function should be changed. > David's name looks good to me. >> >> Also I don't think you need to explain where VM anonymous classes >> come from, it suffices to simply say "Skip class file load hook >> processing for VM anonymous classes"; or if the prologue is renamed >> then simply "Skip this processing for VM anonymous classes". :) > > Thanks, > Coleen >> >> Thanks, >> David >> >>> Thank you! >>> Rachel > From rachel.protacio at oracle.com Wed Aug 17 15:56:50 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Wed, 17 Aug 2016 11:56:50 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <9cb43343-a1c5-91a2-91b7-e633af4888e5@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> <9cb43343-a1c5-91a2-91b7-e633af4888e5@oracle.com> Message-ID: <20eabd79-6337-6270-2ee9-b212e3cdb10c@oracle.com> Thanks for the review, Coleen! Rachel On 8/17/2016 10:14 AM, Coleen Phillimore wrote: > > Hi Rachel, > > I really like how you separated out relax_access_check_for and > relax_format_check_for cases since they're different. > > This code change looks really good. > > Coleen > > On 8/16/16 4:21 PM, Rachel Protacio wrote: >> Hi, >> >> Bug summary: fuzzing a class file so that the class name "SomeClass" >> is instead "LSomeClass;" passed unnoticed through the VM because it >> was not format checked by default and the L; were stripped off before >> lookup. 
>> >> This fix makes sure that all class names loaded by the app class >> loader are format checked by default. The >> Verifier::relax_verify_for() function that was previously used for >> both format checking (setting _relax_verify) and reflection (as an >> access check) has been renamed to relax_access_for() specifically for >> its use in reflection.cpp. A relax_format_check_for() function has >> been added to classFileParser.cpp to address the format checking, >> only "relaxing" the check if loaded by the boot loader or platform >> class loader. >> >> This fix adds a jtreg test, and the change passes JCK vm tests and >> WLS tests, in addition to JPRT and RBT hotspot_all and non-colo >> tests. A compatibility request has been approved for this change. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ >> >> Thanks! >> Rachel > From rachel.protacio at oracle.com Wed Aug 17 16:24:42 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Wed, 17 Aug 2016 12:24:42 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> Message-ID: <5e9db2b3-a344-0eab-a8d5-0c29cbc07658@oracle.com> Thanks for the comments - I've fixed as requested: http://cr.openjdk.java.net/~rprotacio/8148854.01/ Rachel On 8/17/2016 10:19 AM, Dmitry Dmitriev wrote: > Hi Rachel, > > Can comment only test. > FormatCheckingTest.java file: > 1) I think that @build instructions are not needed for this test as > Christian wrote in review request for JDK-8157957 > "ClassNotFoundException: jdk.test.lib.JDKToolFinder"(i.e. "If you run > only that test in a clean jtwork folder and it passes, then > there's no need for @build.") > 2) Test can be run in the same vm, i.e. you can remove "othervm" from > run action. Or test should be run in othervm? 
> > Thank you, > Dmitry > > On 16.08.2016 23:21, Rachel Protacio wrote: >> Hi, >> >> Bug summary: fuzzing a class file so that the class name "SomeClass" >> is instead "LSomeClass;" passed unnoticed through the VM because it >> was not format checked by default and the L; were stripped off before >> lookup. >> >> This fix makes sure that all class names loaded by the app class >> loader are format checked by default. The >> Verifier::relax_verify_for() function that was previously used for >> both format checking (setting _relax_verify) and reflection (as an >> access check) has been renamed to relax_access_for() specifically for >> its use in reflection.cpp. A relax_format_check_for() function has >> been added to classFileParser.cpp to address the format checking, >> only "relaxing" the check if loaded by the boot loader or platform >> class loader. >> >> This fix adds a jtreg test, and the change passes JCK vm tests and >> WLS tests, in addition to JPRT and RBT hotspot_all and non-colo >> tests. A compatibility request has been approved for this change. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ >> >> Thanks! >> Rachel > From dmitry.dmitriev at oracle.com Wed Aug 17 16:35:48 2016 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Wed, 17 Aug 2016 19:35:48 +0300 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <5e9db2b3-a344-0eab-a8d5-0c29cbc07658@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> <5e9db2b3-a344-0eab-a8d5-0c29cbc07658@oracle.com> Message-ID: <862c7370-7ee9-e4a2-056e-67c22c3664bd@oracle.com> Rachel, thank you! Test looks good. 
Dmitry On 17.08.2016 19:24, Rachel Protacio wrote: > Thanks for the comments - I've fixed as requested: > http://cr.openjdk.java.net/~rprotacio/8148854.01/ > Rachel > > > On 8/17/2016 10:19 AM, Dmitry Dmitriev wrote: >> Hi Rachel, >> >> Can comment only test. >> FormatCheckingTest.java file: >> 1) I think that @build instructions are not needed for this test as >> Christian wrote in review request for JDK-8157957 >> "ClassNotFoundException: jdk.test.lib.JDKToolFinder"(i.e. "If you run >> only that test in a clean jtwork folder and it passes, then >> there's no need for @build.") >> 2) Test can be run in the same vm, i.e. you can remove "othervm" from >> run action. Or test should be run in othervm? >> >> Thank you, >> Dmitry >> >> On 16.08.2016 23:21, Rachel Protacio wrote: >>> Hi, >>> >>> Bug summary: fuzzing a class file so that the class name "SomeClass" >>> is instead "LSomeClass;" passed unnoticed through the VM because it >>> was not format checked by default and the L; were stripped off >>> before lookup. >>> >>> This fix makes sure that all class names loaded by the app class >>> loader are format checked by default. The >>> Verifier::relax_verify_for() function that was previously used for >>> both format checking (setting _relax_verify) and reflection (as an >>> access check) has been renamed to relax_access_for() specifically >>> for its use in reflection.cpp. A relax_format_check_for() function >>> has been added to classFileParser.cpp to address the format >>> checking, only "relaxing" the check if loaded by the boot loader or >>> platform class loader. >>> >>> This fix adds a jtreg test, and the change passes JCK vm tests and >>> WLS tests, in addition to JPRT and RBT hotspot_all and non-colo >>> tests. A compatibility request has been approved for this change. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 >>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ >>> >>> Thanks! 
>>> Rachel >> > From coleen.phillimore at oracle.com Wed Aug 17 21:36:58 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 17 Aug 2016 17:36:58 -0400 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint Message-ID: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Summary: do aload(0) after rewriting aload bytecodes to fast version for frequent pairs. This is more of a cleanup since RewriteFrequentPairs is turned off for breakpoint debugging, because if frequent pairs of bytecodes are rewritten, you can't set a breakpoint on the second bci of a pair because it's not executed with this optimization. Now all the platforms are the same as ppc for this. Tested with jprt since aload(0) in the interpreter is executed many times. open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8037138 See bug for how this came up on the openjdk thread. Thanks, Coleen From david.holmes at oracle.com Thu Aug 18 00:04:16 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Aug 2016 10:04:16 +1000 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> Message-ID: <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> Hi Dan, Thanks for looking at this. On 17/08/2016 10:59 PM, Daniel D. Daugherty wrote: > On 8/17/16 1:45 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ > > src/share/vm/runtime/mutex.cpp > If the problem is a racy clearing of one of the fields, then the > assert can still fire and the extra info might still show zeroes > since they are two different queries of the fields.
> > For this diagnostic to be always accurate you need to save a copy > of each field, assert() that all the copies or'ed together are == 0, > and have the extra info printed from the copies. You are right of course. Don't know what I was thinking. :( http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ Thanks, David > Dan > > >> >> This is a rare assertion failure that has proven to unreproducible >> even by directly trying to exercise the theoretical race conditions >> mentioned in the bug report. All I can do for now is augment the >> assert to print out the various values so we can at least see where >> things are failing, next time it happens. >> >> Example output: >> >> # Internal Error >> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >> pid=21732, tid=21734 >> # >> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >> == 0) failed: >> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >> != 0 >> >> Thanks, >> David > From david.holmes at oracle.com Thu Aug 18 00:56:11 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Aug 2016 10:56:11 +1000 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Message-ID: <28bb7513-28e6-b61f-6aa5-2e2b6a46716f@oracle.com> Hi Rachel, This looks good to me. Thanks, David On 18/08/2016 1:56 AM, Rachel Protacio wrote: > Hi David and Coleen, > > Thank you for the reviews. 
I've updated the change as requested: > http://cr.openjdk.java.net/~rprotacio/8163973.01/ > > Rachel > > On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >> >> >> On 8/15/16 11:36 PM, David Holmes wrote: >>> Hi Rachel, >>> >>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>> Hello, >>>> >>>> Please review this change, which makes sure class file load hooks are >>>> not called for VM anonymous classes. See >>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>> >>>> for justification. >>>> >>>> Passes JPRT and RBT. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>> >>> This: >>> >>> 112 // VM Anonymous classes - defined via >>> unsafe.DefineAnonymousClass - should not >>> 113 // call back to a CFLH >>> 114 if (host_klass == NULL) { >>> 115 stream = prologue(stream, >>> >>> suggests that "prologue" can only do CFLH related things. If that is >>> true then it would be much clearer in my opinion if prologue were >>> renamed to something more explicit - like check_class_file_load_hook >>> ? Otherwise, the host_klass should be passed in to prologue and the >>> anonymous class check internalized there. >> >> I agree with David here. This was sort of bothering me about your >> change when we talked about it before. If the prologue did more than >> call the CFLH then you'd have to pass host_klass down, since we know >> that's all it does, the name of the function should be changed. >> David's name looks good to me. >>> >>> Also I don't think you need to explain where VM anonymous classes >>> come from, it suffices to simply say "Skip class file load hook >>> processing for VM anonymous classes"; or if the prologue is renamed >>> then simply "Skip this processing for VM anonymous classes". :) >> >> Thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>>> Thank you! 
>>>> Rachel >> > From dean.long at oracle.com Thu Aug 18 01:59:57 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 17 Aug 2016 18:59:57 -0700 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint In-Reply-To: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Message-ID: Looks good. dl On 8/17/16 2:36 PM, Coleen Phillimore wrote: > Summary: do aload(0) after rewriting aload bytecodes to fast version > for frequent pairs. > > This is more of a cleanup since RewriteFrequentPairs is turned off for > breakpoint debugging, because if frequent pairs of bytecodes are > rewritten, you can't set a breakpoint on the second bci of a pair > because it's not executed with this optimization. > > Now all the platforms are the same as ppc for this. > > Tested with jprt since aload(0) in the interpreter is executed a many > times. > > open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8037138 > > > See bug for how this came up on the openjdk thread. > > Thanks, > Coleen > From david.holmes at oracle.com Thu Aug 18 02:50:13 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Aug 2016 12:50:13 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code Message-ID: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ Generally speaking release_store should be paired with load_acquire to ensure correct memory visibility and ordering in lock-free code (often the read path is what is lock-free). So based on some observations from earlier bug fixes this bug was intended to examine the use of release_store and see if we have the appropriate load_acquire as well. 
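
As a rough illustration of the pairing described above (a minimal sketch, not the HotSpot code): the writer fully initialises an object and then publishes it with a release store, and the lock-free reader picks it up with an acquire load. This sketch assumes std::atomic as a stand-in for HotSpot's release_store/load_acquire operations, and the names Entry, publish() and read_payload() are made up for the example.

```cpp
// Sketch only: std::atomic stands in for HotSpot's release_store /
// load_acquire; Entry, publish() and read_payload() are hypothetical names.
#include <atomic>
#include <cassert>

struct Entry {
  int payload = 0;  // plain (non-atomic) field, written before publication
};

// Shared slot that lock-free readers poll.
static std::atomic<Entry*> g_entry{nullptr};

// Writer side: initialise the object first, then publish the pointer with
// release semantics so the payload write cannot be reordered after the store.
void publish(Entry* e, int value) {
  assert(e != nullptr);
  e->payload = value;
  g_entry.store(e, std::memory_order_release);
}

// Reader side: the acquire load pairs with the release store above, so a
// reader that observes the pointer also observes the initialised payload.
// Returns -1 if nothing has been published yet.
int read_payload() {
  Entry* e = g_entry.load(std::memory_order_acquire);
  return e == nullptr ? -1 : e->payload;
}
```

In use, one thread would call publish() while any number of other threads call read_payload() without taking a lock. If the reader used a plain load instead, nothing would guarantee that it sees the payload written before the pointer was stored, which is the kind of gap this bug set out to look for.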
The bug report lists all of the cases that were examined - some clear cut correct, some complex correct, some fixed here and some split out into separate issues.

Here's a summary of the actual changes in the webrev:

src/share/vm/classfile/classLoader.hpp

- next() accessor needs to use load_acquire.

---

src/share/vm/classfile/verifier.cpp

- load of _verify_byte_codes_fn needs to load_acquire to pair with use of release_store
- release_store of _is_new_verify_byte_codes_fn is not needed

---

src/share/vm/oops/arrayKlass.hpp
src/share/vm/oops/instanceKlass.cpp
src/share/vm/oops/instanceKlass.hpp
src/share/vm/oops/instanceKlass.inline.hpp
src/share/vm/oops/objArrayKlass.cpp
src/share/vm/oops/typeArrayKlass.cpp

The logic for storing dimensions values was using a storeStore barrier between the lower and higher dimensions. This is converted to use a release-store setter for higher-dimension, with paired load-acquire accessor. Plus the accessed fields are declared volatile.

The methods_jmethod_ids_acquire() and its paired release_set_methods_jmethod_ids(), are moved to the .inline.hpp file where they belong.

---

src/share/vm/runtime/vmStructs.cpp

Updated declaration for _array_klasses now it is volatile.

---

Thanks,
David

From david.holmes at oracle.com Thu Aug 18 02:59:58 2016
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 18 Aug 2016 12:59:58 +1000
Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint
In-Reply-To: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com>
References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com>
Message-ID:

Looks good to me.

Thanks,
David

On 18/08/2016 7:36 AM, Coleen Phillimore wrote:
> Summary: do aload(0) after rewriting aload bytecodes to fast version for
> frequent pairs.
> > This is more of a cleanup since RewriteFrequentPairs is turned off for > breakpoint debugging, because if frequent pairs of bytecodes are > rewritten, you can't set a breakpoint on the second bci of a pair > because it's not executed with this optimization. > > Now all the platforms are the same as ppc for this. > > Tested with jprt since aload(0) in the interpreter is executed a many > times. > > open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8037138 > > > See bug for how this came up on the openjdk thread. > > Thanks, > Coleen > From coleen.phillimore at oracle.com Thu Aug 18 12:44:36 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 18 Aug 2016 08:44:36 -0400 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint In-Reply-To: References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Message-ID: Thanks Dean! Coleen On 8/17/16 9:59 PM, dean.long at oracle.com wrote: > Looks good. > > dl > > > On 8/17/16 2:36 PM, Coleen Phillimore wrote: >> Summary: do aload(0) after rewriting aload bytecodes to fast version >> for frequent pairs. >> >> This is more of a cleanup since RewriteFrequentPairs is turned off >> for breakpoint debugging, because if frequent pairs of bytecodes are >> rewritten, you can't set a breakpoint on the second bci of a pair >> because it's not executed with this optimization. >> >> Now all the platforms are the same as ppc for this. >> >> Tested with jprt since aload(0) in the interpreter is executed a many >> times. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8037138 >> >> >> See bug for how this came up on the openjdk thread. 
>> >> Thanks, >> Coleen >> > From coleen.phillimore at oracle.com Thu Aug 18 12:44:46 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 18 Aug 2016 08:44:46 -0400 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint In-Reply-To: References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Message-ID: <2a2bc45f-386b-27a2-e769-319341958a74@oracle.com> Thanks David, Coleen On 8/17/16 10:59 PM, David Holmes wrote: > Looks good to me. > > Thanks, > David > > On 18/08/2016 7:36 AM, Coleen Phillimore wrote: >> Summary: do aload(0) after rewriting aload bytecodes to fast version for >> frequent pairs. >> >> This is more of a cleanup since RewriteFrequentPairs is turned off for >> breakpoint debugging, because if frequent pairs of bytecodes are >> rewritten, you can't set a breakpoint on the second bci of a pair >> because it's not executed with this optimization. >> >> Now all the platforms are the same as ppc for this. >> >> Tested with jprt since aload(0) in the interpreter is executed a many >> times. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8037138 >> >> >> See bug for how this came up on the openjdk thread. >> >> Thanks, >> Coleen >> From daniel.daugherty at oracle.com Thu Aug 18 13:44:46 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 18 Aug 2016 07:44:46 -0600 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint In-Reply-To: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Message-ID: On 8/17/16 3:36 PM, Coleen Phillimore wrote: > Summary: do aload(0) after rewriting aload bytecodes to fast version > for frequent pairs. 
> > This is more of a cleanup since RewriteFrequentPairs is turned off for > breakpoint debugging, because if frequent pairs of bytecodes are > rewritten, you can't set a breakpoint on the second bci of a pair > because it's not executed with this optimization. > > Now all the platforms are the same as ppc for this. > > Tested with jprt since aload(0) in the interpreter is executed a many > times. > > open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev src/cpu/aarch64/vm/templateTable_aarch64.cpp No comments. src/cpu/sparc/vm/templateTable_sparc.cpp No comments. src/cpu/x86/vm/templateTable_x86.cpp No comments. Thumbs up. Just to be clear... this bad oop problem (and the loss of the tos value) only happens when -XX:+RewriteFrequentPairs is specified on the command line... Do I have this right? Dan > bug link https://bugs.openjdk.java.net/browse/JDK-8037138 > > > See bug for how this came up on the openjdk thread. > > Thanks, > Coleen > From daniel.daugherty at oracle.com Thu Aug 18 14:04:51 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 18 Aug 2016 08:04:51 -0600 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> Message-ID: On 8/17/16 6:04 PM, David Holmes wrote: > Hi Dan, > > Thanks for looking at this. No problem. I'm pretty sure I once chased the older version of this bug... :-) > > On 17/08/2016 10:59 PM, Daniel D. 
Daugherty wrote: >> On 8/17/16 1:45 AM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >> >> src/share/vm/runtime/mutex.cpp >> If the problem is a racy clearing of one of the fields, then the >> assert can still fire and the extra info might still show zeroes >> since they are two different queries of the fields. >> >> For this diagnostic to be always accurate you need to save a copy >> of each field, assert() that all the copies or'ed together are == 0, >> and have the extra info printed from the copies. > > You are right of course. Don't know what I was thinking. :( > > http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ src/share/vm/runtime/mutex.cpp No comments. Thumbs up! Dan > > Thanks, > David > >> Dan >> >> >>> >>> This is a rare assertion failure that has proven to unreproducible >>> even by directly trying to exercise the theoretical race conditions >>> mentioned in the bug report. All I can do for now is augment the >>> assert to print out the various values so we can at least see where >>> things are failing, next time it happens. 
>>> >>> Example output: >>> >>> # Internal Error >>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>> >>> pid=21732, tid=21734 >>> # >>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>> >>> == 0) failed: >>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>> >>> != 0 >>> >>> Thanks, >>> David >> From coleen.phillimore at oracle.com Thu Aug 18 14:03:46 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 18 Aug 2016 10:03:46 -0400 Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint In-Reply-To: References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com> Message-ID: <8485ec5b-c8c6-9e78-c0ba-b20f3452d769@oracle.com> On 8/18/16 9:44 AM, Daniel D. Daugherty wrote: > On 8/17/16 3:36 PM, Coleen Phillimore wrote: >> Summary: do aload(0) after rewriting aload bytecodes to fast version >> for frequent pairs. >> >> This is more of a cleanup since RewriteFrequentPairs is turned off >> for breakpoint debugging, because if frequent pairs of bytecodes are >> rewritten, you can't set a breakpoint on the second bci of a pair >> because it's not executed with this optimization. >> >> Now all the platforms are the same as ppc for this. >> >> Tested with jprt since aload(0) in the interpreter is executed a many >> times. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev > > src/cpu/aarch64/vm/templateTable_aarch64.cpp > No comments. > > src/cpu/sparc/vm/templateTable_sparc.cpp > No comments. > > src/cpu/x86/vm/templateTable_x86.cpp > No comments. > > Thumbs up. > > Just to be clear... this bad oop problem (and the loss of the > tos value) only happens when -XX:+RewriteFrequentPairs is > specified on the command line... Do I have this right? It's also tied to capabilities. 
When you update the capability

    if (avail.can_generate_breakpoint_events) {
      RewriteFrequentPairs = false;
    }

It is turned off. From what I can tell when we update the jvmti capabilities it's turned on but that could be after the interpreter is generated, but RewriteFrequentPairs is on by default. So this seems wrong too. :(

Coleen

>
>
> Dan
>
>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8037138
>>
>>
>> See bug for how this came up on the openjdk thread.
>>
>> Thanks,
>> Coleen
>>
>

From aleksey.shipilev at gmail.com Thu Aug 18 14:15:01 2016
From: aleksey.shipilev at gmail.com (Aleksey Shipilev)
Date: Thu, 18 Aug 2016 17:15:01 +0300
Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code
In-Reply-To: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com>
References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com>
Message-ID:

On 08/18/2016 05:50 AM, David Holmes wrote:
> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/

Looks good to me.

Minor nit:

 *) inline declarations have different indenting, is that our code style?

 370 Klass* array_klasses() const { return _array_klasses; }
 371 inline Klass* array_klasses_acquire() const; // load with acquire semantics
 372 void set_array_klasses(Klass* k) { _array_klasses = k; }
 373 inline void release_set_array_klasses(Klass* k); // store with release semantics

Thanks,
-Aleksey

From coleen.phillimore at oracle.com Thu Aug 18 14:36:50 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Thu, 18 Aug 2016 10:36:50 -0400
Subject: RFR (xs) 8037138: x86: problem with JVMTI breakpoint
In-Reply-To: References: <65a840dd-3a57-b99c-41b5-5da672b2039a@oracle.com>
Message-ID:

On 8/18/16 9:44 AM, Daniel D. Daugherty wrote:
> On 8/17/16 3:36 PM, Coleen Phillimore wrote:
>> Summary: do aload(0) after rewriting aload bytecodes to fast version
>> for frequent pairs.
>>
>> This is more of a cleanup since RewriteFrequentPairs is turned off
>> for breakpoint debugging, because if frequent pairs of bytecodes are
>> rewritten, you can't set a breakpoint on the second bci of a pair
>> because it's not executed with this optimization.
>>
>> Now all the platforms are the same as ppc for this.
>>
>> Tested with jprt since aload(0) in the interpreter is executed a many
>> times.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8037138.01/webrev
>
> src/cpu/aarch64/vm/templateTable_aarch64.cpp
> No comments.
>
> src/cpu/sparc/vm/templateTable_sparc.cpp
> No comments.
>
> src/cpu/x86/vm/templateTable_x86.cpp
> No comments.
>
> Thumbs up.
>
> Just to be clear... this bad oop problem (and the loss of the
> tos value) only happens when -XX:+RewriteFrequentPairs is
> specified on the command line... Do I have this right?

From the code, can_generate_breakpoint_events *looks like* a capability
that you can only set in the OnLoad phase, although I don't see in the
spec which capabilities you can set when.

http://cr.openjdk.java.net/~coleenp/jvmti.html#AddCapabilities

The flag -XX:+RewriteBytecodes is on by default in most cases, but when
you set the can_generate_breakpoint_events capability, the vm code does:

  if (avail.can_generate_breakpoint_events) {
    RewriteFrequentPairs = false;
  }

The OnLoad phase SetCapabilities is called before the Interpreter is
generated. So this isn't wrong, it's just obscure.

Coleen

>
>
> Dan
>
>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8037138
>>
>>
>> See bug for how this came up on the openjdk thread.
>> >> Thanks, >> Coleen >> > From rachel.protacio at oracle.com Thu Aug 18 14:44:39 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Thu, 18 Aug 2016 10:44:39 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <28bb7513-28e6-b61f-6aa5-2e2b6a46716f@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> <28bb7513-28e6-b61f-6aa5-2e2b6a46716f@oracle.com> Message-ID: <2736b470-6f45-7a3b-030b-6f94fe2a2039@oracle.com> Thank you, David! Rachel On 8/17/2016 8:56 PM, David Holmes wrote: > Hi Rachel, > > This looks good to me. > > Thanks, > David > > On 18/08/2016 1:56 AM, Rachel Protacio wrote: >> Hi David and Coleen, >> >> Thank you for the reviews. I've updated the change as requested: >> http://cr.openjdk.java.net/~rprotacio/8163973.01/ >> >> Rachel >> >> On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >>> >>> >>> On 8/15/16 11:36 PM, David Holmes wrote: >>>> Hi Rachel, >>>> >>>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>>> Hello, >>>>> >>>>> Please review this change, which makes sure class file load hooks are >>>>> not called for VM anonymous classes. See >>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>>> >>>>> >>>>> for justification. >>>>> >>>>> Passes JPRT and RBT. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>>> >>>> This: >>>> >>>> 112 // VM Anonymous classes - defined via >>>> unsafe.DefineAnonymousClass - should not >>>> 113 // call back to a CFLH >>>> 114 if (host_klass == NULL) { >>>> 115 stream = prologue(stream, >>>> >>>> suggests that "prologue" can only do CFLH related things. If that is >>>> true then it would be much clearer in my opinion if prologue were >>>> renamed to something more explicit - like check_class_file_load_hook >>>> ? 
Otherwise, the host_klass should be passed in to prologue and the >>>> anonymous class check internalized there. >>> >>> I agree with David here. This was sort of bothering me about your >>> change when we talked about it before. If the prologue did more than >>> call the CFLH then you'd have to pass host_klass down, since we know >>> that's all it does, the name of the function should be changed. >>> David's name looks good to me. >>>> >>>> Also I don't think you need to explain where VM anonymous classes >>>> come from, it suffices to simply say "Skip class file load hook >>>> processing for VM anonymous classes"; or if the prologue is renamed >>>> then simply "Skip this processing for VM anonymous classes". :) >>> >>> Thanks, >>> Coleen >>>> >>>> Thanks, >>>> David >>>> >>>>> Thank you! >>>>> Rachel >>> >> From aph at redhat.com Thu Aug 18 15:47:44 2016 From: aph at redhat.com (Andrew Haley) Date: Thu, 18 Aug 2016 16:47:44 +0100 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <03fbcd1d-0ad2-51d0-979d-dc1186ad1c08@redhat.com> On 02/08/16 21:31, Chris Plummer wrote: > Hello, > > Please review the following: > > webrev: > http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ > > Bugs fixed: > > JDK-8133749: os::current_frame() is not returning the proper frame on > ARM and solaris-x64 > https://bugs.openjdk.java.net/browse/JDK-8133749 > > JDK-8133747: NMT includes an extra stack frame due to assumption NMT is > making on tail calls being used > https://bugs.openjdk.java.net/browse/JDK-8133747 > > JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes > NativeCallStack::NativeCallStack() frame in backtrace > https://bugs.openjdk.java.net/browse/JDK-8133740 I don't seem to get any test failures on AArch64. I take it that the above patch is in jdk9/hs for other architectures; the test case certainly is. Andrew. 
From dmitry.samersoff at oracle.com Thu Aug 18 16:18:10 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 18 Aug 2016 19:18:10 +0300 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: Chris, 1. (general) intptr_t* _get_previous_fp() I'm not sure we should rely on the fact that _get_previous_fp is inlined. AFAIK, gcc doesn't inline function if it has inline assembly. So it might be better to mark it explicitly by __attribute__ ((noinline)) 2. os_solaris_x86.cpp:299 Should we check os::is_first_C_frame(&myframe) before asking for caller frame ? -Dmitry On 2016-08-02 23:31, Chris Plummer wrote: > Hello, > > Please review the following: > > webrev: > http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ > > > Bugs fixed: > > JDK-8133749: os::current_frame() is not returning the proper frame on > ARM and solaris-x64 > https://bugs.openjdk.java.net/browse/JDK-8133749 > > JDK-8133747: NMT includes an extra stack frame due to assumption NMT is > making on tail calls being used > https://bugs.openjdk.java.net/browse/JDK-8133747 > > JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes > NativeCallStack::NativeCallStack() frame in backtrace > https://bugs.openjdk.java.net/browse/JDK-8133740 > > The above bugs all result in the NMT detail stack traces including extra > frames in the stack traces. Certain frames are suppose to be skipped, > but sometimes are not. The frames that show up are: > > NativeCallStack::NativeCallStack > os::get_native_stack > > These are both methods used to generate the stack trace, and therefore > should not be included it. However, under some (most) circumstances, > they were. > > Also, there was no test to make sure that any NMT detail output is > generated, or that it is correct. I've added one with this webrev. 
Of > the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27 > initially passed this new test. They were the product and fastdebug > builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug > builds for solaris-x64, windows-x86, and windows-x64. All the rest > failed. They now all pass with my fixes in place. > > Here's a summary of the changes: > > src/os/posix/vm/os_posix.cpp > src/os/windows/vm/os_windows.cpp > > JDK-8133747 fixes: There was some frame skipping logic here which was > sort of correct, but was misplace. There are no extra frames being added > in os::get_native_stack() due to lack of inlining or lack of a tail > call, so no need for toSkip++ here. The logic has been moved to > NativeCallStack::NativeCallStack, which is where the tail call is > (sometimes) made, and also corrected (see nativeCallStack.cpp below). > > src/share/vm/utilities/nativeCallStack.cpp > > JDK-8133747 fixes: The frame skipping logic that was moved here assumed > that NativeCallStack::NativeCallStack would not appear in the call stack > (due to a tail call be using to call os::get_native_stack) except in > slow debug builds. However, some platforms also don't use a tail call > even when optimized. From what I can tell that is the case for 32-bit > platforms and for windows. > > src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp > src/os_cpu/windows_x86/vm/os_windows_x86.cpp > src/os_cpu/linux_x86/vm/os_linux_x86.cpp > > JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip > one extra frame > > src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp > > JDK-8133749 fixes: os:current_frame() was not consistent with other > platforms and needs to skip one more frame. This means it returns the > frame for the caller's caller. So when called by os:get_native_stack(), > it returns the frame for whoever called os::get_native_stack(). Although > not intuitive, this is what os:get_native_stack() expects. 
Probably a > method rename and/or a behavior change is justified here, but I would > prefer to do that with a followup CR if anyone has a good suggestion on > what to do. > > test/runtime/NMT/CheckForProperDetailStackTrace.java > > This is the new NTM detail test. It checks for frames that shouldn't be > present and validates at least one stack trace is what is expected. > > I verified that the above test now passes on all supported platforms, > and also did a full jprt "-testset hotpot" run. I plan on doing some RBT > testing with NMT detail enabled before committing. > > Regarding the community contributed ports that Oracle does not support, > I didn't make any changes there, but it looks like some of these bugs do > exist. Notably: > > -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done > to the > os_linux_x86.cp should also be applied here. > -linux-ppc: Hard to say for sure since the implementation of > os::current_frame is > different than others, but it looks to me like it suffers from both > JDK-8133749 > and JDK-8133740. > -aix-ppc: Looks to be the same implementation as linux-ppc, so would > need the > same changes. > > These ports may also be suffering from JDK-8133747, but that fix is in > shared code (nativeCallStack.cpp). My changes there will need some > tweaking for these ports they don't use a tail call to call > os::get_native_stack(). > > If the maintainers of these ports could send me some NMT detail output, > I can advise better on what changes are needed. Then you can implement > and test them, and then send them back to me and I'll include them with > my changes. What I need is the following command run on product and > slowdebug builds. Initially run without any of my changes applied. 
If > needed I may followup with a request that they be run with the changes > applied: > > bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail > -XX:+PrintNMTStatistics -version > > thanks, > > Chris > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Thu Aug 18 16:25:05 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 18 Aug 2016 19:25:05 +0300 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <2b5776a7-8588-94ed-7716-41ef9dd35d18@oracle.com> PS: (hit send too early, sorry!) 3. nativeCallStack.cpp 43 #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64)) Is it OS X specific behavior of clang specific one? Should we change define to __APPLE__ or __clang__, not BSD ? -Dmitry On 2016-08-02 23:31, Chris Plummer wrote: > Hello, > > Please review the following: > > webrev: > http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ > > > Bugs fixed: > > JDK-8133749: os::current_frame() is not returning the proper frame on > ARM and solaris-x64 > https://bugs.openjdk.java.net/browse/JDK-8133749 > > JDK-8133747: NMT includes an extra stack frame due to assumption NMT is > making on tail calls being used > https://bugs.openjdk.java.net/browse/JDK-8133747 > > JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes > NativeCallStack::NativeCallStack() frame in backtrace > https://bugs.openjdk.java.net/browse/JDK-8133740 > > The above bugs all result in the NMT detail stack traces including extra > frames in the stack traces. Certain frames are suppose to be skipped, > but sometimes are not. The frames that show up are: > > NativeCallStack::NativeCallStack > os::get_native_stack > > These are both methods used to generate the stack trace, and therefore > should not be included it. 
However, under some (most) circumstances, > they were. > > Also, there was no test to make sure that any NMT detail output is > generated, or that it is correct. I've added one with this webrev. Of > the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27 > initially passed this new test. They were the product and fastdebug > builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug > builds for solaris-x64, windows-x86, and windows-x64. All the rest > failed. They now all pass with my fixes in place. > > Here's a summary of the changes: > > src/os/posix/vm/os_posix.cpp > src/os/windows/vm/os_windows.cpp > > JDK-8133747 fixes: There was some frame skipping logic here which was > sort of correct, but was misplace. There are no extra frames being added > in os::get_native_stack() due to lack of inlining or lack of a tail > call, so no need for toSkip++ here. The logic has been moved to > NativeCallStack::NativeCallStack, which is where the tail call is > (sometimes) made, and also corrected (see nativeCallStack.cpp below). > > src/share/vm/utilities/nativeCallStack.cpp > > JDK-8133747 fixes: The frame skipping logic that was moved here assumed > that NativeCallStack::NativeCallStack would not appear in the call stack > (due to a tail call be using to call os::get_native_stack) except in > slow debug builds. However, some platforms also don't use a tail call > even when optimized. From what I can tell that is the case for 32-bit > platforms and for windows. > > src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp > src/os_cpu/windows_x86/vm/os_windows_x86.cpp > src/os_cpu/linux_x86/vm/os_linux_x86.cpp > > JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip > one extra frame > > src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp > > JDK-8133749 fixes: os:current_frame() was not consistent with other > platforms and needs to skip one more frame. This means it returns the > frame for the caller's caller. 
So when called by os:get_native_stack(), > it returns the frame for whoever called os::get_native_stack(). Although > not intuitive, this is what os:get_native_stack() expects. Probably a > method rename and/or a behavior change is justified here, but I would > prefer to do that with a followup CR if anyone has a good suggestion on > what to do. > > test/runtime/NMT/CheckForProperDetailStackTrace.java > > This is the new NTM detail test. It checks for frames that shouldn't be > present and validates at least one stack trace is what is expected. > > I verified that the above test now passes on all supported platforms, > and also did a full jprt "-testset hotpot" run. I plan on doing some RBT > testing with NMT detail enabled before committing. > > Regarding the community contributed ports that Oracle does not support, > I didn't make any changes there, but it looks like some of these bugs do > exist. Notably: > > -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done > to the > os_linux_x86.cp should also be applied here. > -linux-ppc: Hard to say for sure since the implementation of > os::current_frame is > different than others, but it looks to me like it suffers from both > JDK-8133749 > and JDK-8133740. > -aix-ppc: Looks to be the same implementation as linux-ppc, so would > need the > same changes. > > These ports may also be suffering from JDK-8133747, but that fix is in > shared code (nativeCallStack.cpp). My changes there will need some > tweaking for these ports they don't use a tail call to call > os::get_native_stack(). > > If the maintainers of these ports could send me some NMT detail output, > I can advise better on what changes are needed. Then you can implement > and test them, and then send them back to me and I'll include them with > my changes. What I need is the following command run on product and > slowdebug builds. Initially run without any of my changes applied. 
If > needed I may followup with a request that they be run with the changes > applied: > > bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail > -XX:+PrintNMTStatistics -version > > thanks, > > Chris > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From dean.long at oracle.com Thu Aug 18 18:26:41 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 18 Aug 2016 11:26:41 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <9f44e59e-02b9-c4c6-cc36-a5883e24e5fa@oracle.com> For gcc, would it be useful to use __builtin_frame_address()? https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html dl On 8/18/16 9:18 AM, Dmitry Samersoff wrote: > Chris, > > 1. (general) intptr_t* _get_previous_fp() > > I'm not sure we should rely on the fact that _get_previous_fp is > inlined. AFAIK, gcc doesn't inline function if it has inline assembly. > > So it might be better to mark it explicitly by > __attribute__ ((noinline)) > > 2. os_solaris_x86.cpp:299 > > Should we check os::is_first_C_frame(&myframe) before asking for caller > frame ? 
> > -Dmitry > > > On 2016-08-02 23:31, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >> making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes >> NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 >> >> The above bugs all result in the NMT detail stack traces including extra >> frames in the stack traces. Certain frames are suppose to be skipped, >> but sometimes are not. The frames that show up are: >> >> NativeCallStack::NativeCallStack >> os::get_native_stack >> >> These are both methods used to generate the stack trace, and therefore >> should not be included it. However, under some (most) circumstances, >> they were. >> >> Also, there was no test to make sure that any NMT detail output is >> generated, or that it is correct. I've added one with this webrev. Of >> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27 >> initially passed this new test. They were the product and fastdebug >> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >> builds for solaris-x64, windows-x86, and windows-x64. All the rest >> failed. They now all pass with my fixes in place. >> >> Here's a summary of the changes: >> >> src/os/posix/vm/os_posix.cpp >> src/os/windows/vm/os_windows.cpp >> >> JDK-8133747 fixes: There was some frame skipping logic here which was >> sort of correct, but was misplace. 
There are no extra frames being added >> in os::get_native_stack() due to lack of inlining or lack of a tail >> call, so no need for toSkip++ here. The logic has been moved to >> NativeCallStack::NativeCallStack, which is where the tail call is >> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >> >> src/share/vm/utilities/nativeCallStack.cpp >> >> JDK-8133747 fixes: The frame skipping logic that was moved here assumed >> that NativeCallStack::NativeCallStack would not appear in the call stack >> (due to a tail call be using to call os::get_native_stack) except in >> slow debug builds. However, some platforms also don't use a tail call >> even when optimized. From what I can tell that is the case for 32-bit >> platforms and for windows. >> >> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >> >> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip >> one extra frame >> >> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >> >> JDK-8133749 fixes: os:current_frame() was not consistent with other >> platforms and needs to skip one more frame. This means it returns the >> frame for the caller's caller. So when called by os:get_native_stack(), >> it returns the frame for whoever called os::get_native_stack(). Although >> not intuitive, this is what os:get_native_stack() expects. Probably a >> method rename and/or a behavior change is justified here, but I would >> prefer to do that with a followup CR if anyone has a good suggestion on >> what to do. >> >> test/runtime/NMT/CheckForProperDetailStackTrace.java >> >> This is the new NTM detail test. It checks for frames that shouldn't be >> present and validates at least one stack trace is what is expected. >> >> I verified that the above test now passes on all supported platforms, >> and also did a full jprt "-testset hotpot" run. 
I plan on doing some RBT >> testing with NMT detail enabled before committing. >> >> Regarding the community contributed ports that Oracle does not support, >> I didn't make any changes there, but it looks like some of these bugs do >> exist. Notably: >> >> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done >> to the >> os_linux_x86.cp should also be applied here. >> -linux-ppc: Hard to say for sure since the implementation of >> os::current_frame is >> different than others, but it looks to me like it suffers from both >> JDK-8133749 >> and JDK-8133740. >> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >> need the >> same changes. >> >> These ports may also be suffering from JDK-8133747, but that fix is in >> shared code (nativeCallStack.cpp). My changes there will need some >> tweaking for these ports they don't use a tail call to call >> os::get_native_stack(). >> >> If the maintainers of these ports could send me some NMT detail output, >> I can advise better on what changes are needed. Then you can implement >> and test them, and then send them back to me and I'll include them with >> my changes. What I need is the following command run on product and >> slowdebug builds. Initially run without any of my changes applied. If >> needed I may followup with a request that they be run with the changes >> applied: >> >> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail >> -XX:+PrintNMTStatistics -version >> >> thanks, >> >> Chris >> > From chris.plummer at oracle.com Thu Aug 18 19:42:55 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 18 Aug 2016 12:42:55 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: References: Message-ID: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> Hi Dmitry, These changes were already pushed last week. 
Chris was just responding to a request I had to see if any of the open
ports also needed any of the fixes that were done for other platforms.

On 8/18/16 9:18 AM, Dmitry Samersoff wrote:
> Chris,
>
> 1. (general) intptr_t* _get_previous_fp()
>
> I'm not sure we should rely on the fact that _get_previous_fp is
> inlined.

It would be nice to not rely on gcc for inlining (or not inlining), but
unfortunately the main theme of these bugs is fixing cases where we are
assuming, and the assumption is wrong. The following CR was filed to
address in general the fragility of relying on what compilers are doing
for inlining:

JDK-8163899 NMT frame skipping code is fragile

> AFAIK, gcc doesn't inline function if it has inline assembly.

That wasn't my observation. If it were true, the test wouldn't be
passing on both debug and non-debug builds.

>
> So it might be better to mark it explicitly by
> __attribute__ ((noinline))

I think what actually would be best is to get rid of _get_previous_fp()
and move its logic into os::current_frame(). That's pointed out in
another related CR I filed:

JDK-8163900 os::current_frame has a misleading name

>
> 2. os_solaris_x86.cpp:299
>
> Should we check os::is_first_C_frame(&myframe) before asking for caller
> frame ?

The other platforms don't do this. At best maybe they all should do so
as an assert to make sure the logic isn't flawed. If the logic is
flawed, it will crash, so the assert would just catch the reason
earlier.
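To make the assert idea concrete, here is a rough toy model, not HotSpot
code. ToyFrame, toy_is_first_frame and toy_skip_frames are made-up names
for illustration; the real check would presumably go through
os::is_first_C_frame() on a real frame object.

```cpp
#include <cassert>

// Hypothetical stand-in for a native stack frame; this is NOT HotSpot's
// frame class, just enough structure to show the check.
struct ToyFrame {
  const char*     name;
  const ToyFrame* caller;  // nullptr marks the outermost ("first") C frame
};

// Toy analogue of os::is_first_C_frame(): there is no caller to walk to.
inline bool toy_is_first_frame(const ToyFrame* f) {
  return f == nullptr || f->caller == nullptr;
}

// Walk 'skip' callers up from 'start'. The assert documents the assumption
// that enough frames exist; without it a wrong skip count would only show
// up later, when a bad frame is used.
inline const ToyFrame* toy_skip_frames(const ToyFrame* start, int skip) {
  const ToyFrame* f = start;
  for (int i = 0; i < skip; i++) {
    assert(!toy_is_first_frame(f) && "ran out of frames while skipping");
    f = f->caller;
  }
  return f;
}
```

In a product build the assert compiles away, so this only changes where a
flawed skip count is reported in debug builds, which matches the "catch the
reason earlier" point above.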
thanks, Chris > > -Dmitry > > > On 2016-08-02 23:31, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >> making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes >> NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 >> >> The above bugs all result in the NMT detail stack traces including extra >> frames in the stack traces. Certain frames are suppose to be skipped, >> but sometimes are not. The frames that show up are: >> >> NativeCallStack::NativeCallStack >> os::get_native_stack >> >> These are both methods used to generate the stack trace, and therefore >> should not be included it. However, under some (most) circumstances, >> they were. >> >> Also, there was no test to make sure that any NMT detail output is >> generated, or that it is correct. I've added one with this webrev. Of >> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27 >> initially passed this new test. They were the product and fastdebug >> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >> builds for solaris-x64, windows-x86, and windows-x64. All the rest >> failed. They now all pass with my fixes in place. >> >> Here's a summary of the changes: >> >> src/os/posix/vm/os_posix.cpp >> src/os/windows/vm/os_windows.cpp >> >> JDK-8133747 fixes: There was some frame skipping logic here which was >> sort of correct, but was misplace. 
There are no extra frames being added >> in os::get_native_stack() due to lack of inlining or lack of a tail >> call, so no need for toSkip++ here. The logic has been moved to >> NativeCallStack::NativeCallStack, which is where the tail call is >> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >> >> src/share/vm/utilities/nativeCallStack.cpp >> >> JDK-8133747 fixes: The frame skipping logic that was moved here assumed >> that NativeCallStack::NativeCallStack would not appear in the call stack >> (due to a tail call be using to call os::get_native_stack) except in >> slow debug builds. However, some platforms also don't use a tail call >> even when optimized. From what I can tell that is the case for 32-bit >> platforms and for windows. >> >> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >> >> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip >> one extra frame >> >> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >> >> JDK-8133749 fixes: os:current_frame() was not consistent with other >> platforms and needs to skip one more frame. This means it returns the >> frame for the caller's caller. So when called by os:get_native_stack(), >> it returns the frame for whoever called os::get_native_stack(). Although >> not intuitive, this is what os:get_native_stack() expects. Probably a >> method rename and/or a behavior change is justified here, but I would >> prefer to do that with a followup CR if anyone has a good suggestion on >> what to do. >> >> test/runtime/NMT/CheckForProperDetailStackTrace.java >> >> This is the new NTM detail test. It checks for frames that shouldn't be >> present and validates at least one stack trace is what is expected. >> >> I verified that the above test now passes on all supported platforms, >> and also did a full jprt "-testset hotpot" run. 
I plan on doing some RBT >> testing with NMT detail enabled before committing. >> >> Regarding the community contributed ports that Oracle does not support, >> I didn't make any changes there, but it looks like some of these bugs do >> exist. Notably: >> >> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done >> to the >> os_linux_x86.cp should also be applied here. >> -linux-ppc: Hard to say for sure since the implementation of >> os::current_frame is >> different than others, but it looks to me like it suffers from both >> JDK-8133749 >> and JDK-8133740. >> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >> need the >> same changes. >> >> These ports may also be suffering from JDK-8133747, but that fix is in >> shared code (nativeCallStack.cpp). My changes there will need some >> tweaking for these ports they don't use a tail call to call >> os::get_native_stack(). >> >> If the maintainers of these ports could send me some NMT detail output, >> I can advise better on what changes are needed. Then you can implement >> and test them, and then send them back to me and I'll include them with >> my changes. What I need is the following command run on product and >> slowdebug builds. Initially run without any of my changes applied. 
>> If needed I may followup with a request that they be run with the
>> changes applied:
>>
>> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail
>> -XX:+PrintNMTStatistics -version
>>
>> thanks,
>>
>> Chris
>>
>

From chris.plummer at oracle.com Thu Aug 18 19:49:16 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 18 Aug 2016 12:49:16 -0700
Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup
In-Reply-To: <2b5776a7-8588-94ed-7716-41ef9dd35d18@oracle.com>
References: <2b5776a7-8588-94ed-7716-41ef9dd35d18@oracle.com>
Message-ID: <034f4cae-d164-2d7c-cdfc-e20a06e8a1e1@oracle.com>

Hi Dmitry,

On 8/18/16 9:25 AM, Dmitry Samersoff wrote:
> PS: (hit send too early, sorry!)
>
> 3. nativeCallStack.cpp
>
> 43 #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64))
>
> Is it OS X specific behavior of clang specific one? Should we change
> define to __APPLE__ or __clang__, not BSD ?

I know for the compilers we use for officially released builds, the code
works with my changes. If changes are needed for other compilers, then
the code will need to be updated for them. As of now I don't know which
compiler we use for Mac OS X, and which others, if any, are also
supported. I just know when submitting builds with JPRT, the code is
correct for Mac OS X.

Yes, these assumptions on what compilers are doing are ugly, but we
already had these assumptions before any of my changes. I've just fixed
the assumptions that were incorrect, and added a test so we quickly find
incorrect assumptions in the future.
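One way to keep such compiler assumptions auditable is to state them in a
single small function that a test can exercise. The sketch below is a
simplified illustration only: BuildTraits and both helper functions are
hypothetical names, not HotSpot macros or flags, and in the real code the
two adjustments live in different files rather than in one place.

```cpp
#include <cassert>

// Hypothetical description of one build configuration; the field names are
// illustrative and do not correspond to real HotSpot macros or flags.
struct BuildTraits {
  bool constructor_tail_calls;  // the constructor reaches the stack walker via a tail call
  bool helper_inlined;          // the frame-pointer helper is inlined into its caller
};

// Number of bookkeeping frames that would appear in the trace, and so must
// be dropped, in addition to what the caller asked to skip.
constexpr int extra_frames_to_skip(BuildTraits t) {
  return (t.constructor_tail_calls ? 0 : 1) + (t.helper_inlined ? 0 : 1);
}

// Total skip count handed to the (hypothetical) stack walker.
constexpr int total_frames_to_skip(int requested, BuildTraits t) {
  return requested + extra_frames_to_skip(t);
}
```

With the assumptions gathered like this, a wrong guess about one compiler
shows up as one wrong table entry rather than as an off-by-one scattered
across platform files.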
thanks, Chris > > -Dmitry > > On 2016-08-02 23:31, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >> making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes >> NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 >> >> The above bugs all result in the NMT detail stack traces including extra >> frames in the stack traces. Certain frames are suppose to be skipped, >> but sometimes are not. The frames that show up are: >> >> NativeCallStack::NativeCallStack >> os::get_native_stack >> >> These are both methods used to generate the stack trace, and therefore >> should not be included it. However, under some (most) circumstances, >> they were. >> >> Also, there was no test to make sure that any NMT detail output is >> generated, or that it is correct. I've added one with this webrev. Of >> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27 >> initially passed this new test. They were the product and fastdebug >> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >> builds for solaris-x64, windows-x86, and windows-x64. All the rest >> failed. They now all pass with my fixes in place. >> >> Here's a summary of the changes: >> >> src/os/posix/vm/os_posix.cpp >> src/os/windows/vm/os_windows.cpp >> >> JDK-8133747 fixes: There was some frame skipping logic here which was >> sort of correct, but was misplace. 
There are no extra frames being added >> in os::get_native_stack() due to lack of inlining or lack of a tail >> call, so no need for toSkip++ here. The logic has been moved to >> NativeCallStack::NativeCallStack, which is where the tail call is >> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >> >> src/share/vm/utilities/nativeCallStack.cpp >> >> JDK-8133747 fixes: The frame skipping logic that was moved here assumed >> that NativeCallStack::NativeCallStack would not appear in the call stack >> (due to a tail call be using to call os::get_native_stack) except in >> slow debug builds. However, some platforms also don't use a tail call >> even when optimized. From what I can tell that is the case for 32-bit >> platforms and for windows. >> >> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >> >> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip >> one extra frame >> >> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >> >> JDK-8133749 fixes: os:current_frame() was not consistent with other >> platforms and needs to skip one more frame. This means it returns the >> frame for the caller's caller. So when called by os:get_native_stack(), >> it returns the frame for whoever called os::get_native_stack(). Although >> not intuitive, this is what os:get_native_stack() expects. Probably a >> method rename and/or a behavior change is justified here, but I would >> prefer to do that with a followup CR if anyone has a good suggestion on >> what to do. >> >> test/runtime/NMT/CheckForProperDetailStackTrace.java >> >> This is the new NTM detail test. It checks for frames that shouldn't be >> present and validates at least one stack trace is what is expected. >> >> I verified that the above test now passes on all supported platforms, >> and also did a full jprt "-testset hotpot" run. 
I plan on doing some RBT >> testing with NMT detail enabled before committing. >> >> Regarding the community contributed ports that Oracle does not support, >> I didn't make any changes there, but it looks like some of these bugs do >> exist. Notably: >> >> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done >> to the >> os_linux_x86.cp should also be applied here. >> -linux-ppc: Hard to say for sure since the implementation of >> os::current_frame is >> different than others, but it looks to me like it suffers from both >> JDK-8133749 >> and JDK-8133740. >> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >> need the >> same changes. >> >> These ports may also be suffering from JDK-8133747, but that fix is in >> shared code (nativeCallStack.cpp). My changes there will need some >> tweaking for these ports they don't use a tail call to call >> os::get_native_stack(). >> >> If the maintainers of these ports could send me some NMT detail output, >> I can advise better on what changes are needed. Then you can implement >> and test them, and then send them back to me and I'll include them with >> my changes. What I need is the following command run on product and >> slowdebug builds. Initially run without any of my changes applied. If >> needed I may followup with a request that they be run with the changes >> applied: >> >> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail >> -XX:+PrintNMTStatistics -version >> >> thanks, >> >> Chris >> > From chris.plummer at oracle.com Thu Aug 18 19:50:50 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 18 Aug 2016 12:50:50 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> References: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> Message-ID: On 8/18/16 12:42 PM, Chris Plummer wrote: > Hi Dmitry, > > These changes were already pushed last week. 
Chris was just responding > to a request I had to see if any of the open ports also needed any of > the fixes that were done for other platforms. Sorry, I meant Andrew, not Chris. Chris > > On 8/18/16 9:18 AM, Dmitry Samersoff wrote: >> Chris, >> >> 1. (general) intptr_t* _get_previous_fp() >> >> I'm not sure we should rely on the fact that _get_previous_fp is >> inlined. > It would be nice to not rely on gcc for inlining (or not inlining), > but unfortunately the main theme of these bugs is fixing cases were we > are assuming, and the assumption is wrong. The following CR was filed > to address in general the fragility of relying on what compilers are > doing for inlining: > > JDK-8163899 NMT frame skipping code is fragile >> AFAIK, gcc doesn't inline function if it has inline assembly. > > That wasn't my observation. If it were true, the test wouldn't be > passing on both debug and non-debug builds. > >> >> So it might be better to mark it explicitly by >> __attribute__ ((noinline)) > I think what actually would be best is to get rid of > _get_previous_fp() and move it's logic into os:current_frame(). That's > pointed out in another related CR I filed: > > JDK-8163900 os::current_frame has a misleading name >> >> 2. os_solaris_x86.cpp:299 >> >> Should we check os::is_first_C_frame(&myframe) before asking for caller >> frame ? > The other platforms don't do this. At best maybe they all should do so > as an assert to make sure the logic isn't flawed. If the logic is > flawed, it will crash, so the assert would just catch the reason earlier. 
> > thanks, > > Chris >> >> -Dmitry >> >> >> On 2016-08-02 23:31, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> webrev: >>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>> >>> >>> >>> Bugs fixed: >>> >>> JDK-8133749: os::current_frame() is not returning the proper frame on >>> ARM and solaris-x64 >>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>> >>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >>> making on tail calls being used >>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>> >>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>> includes >>> NativeCallStack::NativeCallStack() frame in backtrace >>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>> >>> The above bugs all result in the NMT detail stack traces including >>> extra >>> frames in the stack traces. Certain frames are suppose to be skipped, >>> but sometimes are not. The frames that show up are: >>> >>> NativeCallStack::NativeCallStack >>> os::get_native_stack >>> >>> These are both methods used to generate the stack trace, and therefore >>> should not be included it. However, under some (most) circumstances, >>> they were. >>> >>> Also, there was no test to make sure that any NMT detail output is >>> generated, or that it is correct. I've added one with this webrev. Of >>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of >>> the 27 >>> initially passed this new test. They were the product and fastdebug >>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>> failed. They now all pass with my fixes in place. >>> >>> Here's a summary of the changes: >>> >>> src/os/posix/vm/os_posix.cpp >>> src/os/windows/vm/os_windows.cpp >>> >>> JDK-8133747 fixes: There was some frame skipping logic here which was >>> sort of correct, but was misplace. 
There are no extra frames being >>> added >>> in os::get_native_stack() due to lack of inlining or lack of a tail >>> call, so no need for toSkip++ here. The logic has been moved to >>> NativeCallStack::NativeCallStack, which is where the tail call is >>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>> >>> src/share/vm/utilities/nativeCallStack.cpp >>> >>> JDK-8133747 fixes: The frame skipping logic that was moved here assumed >>> that NativeCallStack::NativeCallStack would not appear in the call >>> stack >>> (due to a tail call be using to call os::get_native_stack) except in >>> slow debug builds. However, some platforms also don't use a tail call >>> even when optimized. From what I can tell that is the case for 32-bit >>> platforms and for windows. >>> >>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>> >>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>> skip >>> one extra frame >>> >>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>> >>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>> platforms and needs to skip one more frame. This means it returns the >>> frame for the caller's caller. So when called by os:get_native_stack(), >>> it returns the frame for whoever called os::get_native_stack(). >>> Although >>> not intuitive, this is what os:get_native_stack() expects. Probably a >>> method rename and/or a behavior change is justified here, but I would >>> prefer to do that with a followup CR if anyone has a good suggestion on >>> what to do. >>> >>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>> >>> This is the new NTM detail test. It checks for frames that shouldn't be >>> present and validates at least one stack trace is what is expected. >>> >>> I verified that the above test now passes on all supported platforms, >>> and also did a full jprt "-testset hotpot" run. 
I plan on doing some >>> RBT >>> testing with NMT detail enabled before committing. >>> >>> Regarding the community contributed ports that Oracle does not support, >>> I didn't make any changes there, but it looks like some of these >>> bugs do >>> exist. Notably: >>> >>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>> done >>> to the >>> os_linux_x86.cp should also be applied here. >>> -linux-ppc: Hard to say for sure since the implementation of >>> os::current_frame is >>> different than others, but it looks to me like it suffers from both >>> JDK-8133749 >>> and JDK-8133740. >>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>> need the >>> same changes. >>> >>> These ports may also be suffering from JDK-8133747, but that fix is in >>> shared code (nativeCallStack.cpp). My changes there will need some >>> tweaking for these ports they don't use a tail call to call >>> os::get_native_stack(). >>> >>> If the maintainers of these ports could send me some NMT detail output, >>> I can advise better on what changes are needed. Then you can implement >>> and test them, and then send them back to me and I'll include them with >>> my changes. What I need is the following command run on product and >>> slowdebug builds. Initially run without any of my changes applied. 
If >>> needed I may followup with a request that they be run with the changes >>> applied: >>> >>> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail >>> -XX:+PrintNMTStatistics -version >>> >>> thanks, >>> >>> Chris >>> >> > From chris.plummer at oracle.com Thu Aug 18 20:02:57 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 18 Aug 2016 13:02:57 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <03fbcd1d-0ad2-51d0-979d-dc1186ad1c08@redhat.com> References: <03fbcd1d-0ad2-51d0-979d-dc1186ad1c08@redhat.com> Message-ID: On 8/18/16 8:47 AM, Andrew Haley wrote: > On 02/08/16 21:31, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> webrev: >> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >> >> Bugs fixed: >> >> JDK-8133749: os::current_frame() is not returning the proper frame on >> ARM and solaris-x64 >> https://bugs.openjdk.java.net/browse/JDK-8133749 >> >> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >> making on tail calls being used >> https://bugs.openjdk.java.net/browse/JDK-8133747 >> >> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes >> NativeCallStack::NativeCallStack() frame in backtrace >> https://bugs.openjdk.java.net/browse/JDK-8133740 > I don't seem to get any test failures on AArch64. I take it that the > above patch is in jdk9/hs for other architectures; the test case > certainly is. > > Andrew. > Hi Andrew, Did you try the new test with a slowdebug build? It looked to me like it would fail on aarch64. Yes, this patch is only meant for jdk9. I'm not sure what it would take to backport to 8.
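For anyone checking a port by hand, the idea behind the test can be approximated by scanning the NMT detail output for the two frames that belong to the stack-capture machinery itself. This is only a sketch of that idea, not the implementation of CheckForProperDetailStackTrace.java (which is a Java test); `unexpected_frames` is a hypothetical helper and the exact formatting of the output lines is not assumed here, only that the frame names appear as substrings.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the names of capture-machinery frames that
// appear anywhere in the given NMT detail output. An empty result means
// none of the frames that should have been skipped leaked into the traces.
std::vector<std::string> unexpected_frames(const std::string& nmt_output) {
  // The two frames named in the review request as showing up by mistake.
  static const char* const kCaptureFrames[] = {
    "NativeCallStack::NativeCallStack",
    "os::get_native_stack",
  };
  std::vector<std::string> found;
  for (const char* name : kCaptureFrames) {
    if (nmt_output.find(name) != std::string::npos) {
      found.push_back(name);
    }
  }
  return found;
}
```

Feeding it the output of the bin/java command quoted earlier in the thread, from both product and slowdebug builds, should give an empty list on a port where the frame skipping is correct.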
thanks, Chris From chris.plummer at oracle.com Thu Aug 18 23:53:18 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 18 Aug 2016 16:53:18 -0700 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <9f44e59e-02b9-c4c6-cc36-a5883e24e5fa@oracle.com> References: <9f44e59e-02b9-c4c6-cc36-a5883e24e5fa@oracle.com> Message-ID: That might be a better choice than trying to use inline assembly to grab fp. I'll add a comment to JDK-8163900. thanks, Chris On 8/18/16 11:26 AM, dean.long at oracle.com wrote: > For gcc, would it be useful to use __builtin_frame_address()? > > https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html > > dl > > > On 8/18/16 9:18 AM, Dmitry Samersoff wrote: >> Chris, >> >> 1. (general) intptr_t* _get_previous_fp() >> >> I'm not sure we should rely on the fact that _get_previous_fp is >> inlined. AFAIK, gcc doesn't inline function if it has inline assembly. >> >> So it might be better to mark it explicitly by >> __attribute__ ((noinline)) >> >> 2. os_solaris_x86.cpp:299 >> >> Should we check os::is_first_C_frame(&myframe) before asking for caller >> frame ? 
>> >> -Dmitry >> >> >> On 2016-08-02 23:31, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> webrev: >>> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/ >>> >>> >>> >>> Bugs fixed: >>> >>> JDK-8133749: os::current_frame() is not returning the proper frame on >>> ARM and solaris-x64 >>> https://bugs.openjdk.java.net/browse/JDK-8133749 >>> >>> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is >>> making on tail calls being used >>> https://bugs.openjdk.java.net/browse/JDK-8133747 >>> >>> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds >>> includes >>> NativeCallStack::NativeCallStack() frame in backtrace >>> https://bugs.openjdk.java.net/browse/JDK-8133740 >>> >>> The above bugs all result in the NMT detail stack traces including >>> extra >>> frames in the stack traces. Certain frames are suppose to be skipped, >>> but sometimes are not. The frames that show up are: >>> >>> NativeCallStack::NativeCallStack >>> os::get_native_stack >>> >>> These are both methods used to generate the stack trace, and therefore >>> should not be included it. However, under some (most) circumstances, >>> they were. >>> >>> Also, there was no test to make sure that any NMT detail output is >>> generated, or that it is correct. I've added one with this webrev. Of >>> the 27 possible builds (9 platforms * 3 build flavors), only 9 of >>> the 27 >>> initially passed this new test. They were the product and fastdebug >>> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug >>> builds for solaris-x64, windows-x86, and windows-x64. All the rest >>> failed. They now all pass with my fixes in place. >>> >>> Here's a summary of the changes: >>> >>> src/os/posix/vm/os_posix.cpp >>> src/os/windows/vm/os_windows.cpp >>> >>> JDK-8133747 fixes: There was some frame skipping logic here which was >>> sort of correct, but was misplace. 
There are no extra frames being >>> added >>> in os::get_native_stack() due to lack of inlining or lack of a tail >>> call, so no need for toSkip++ here. The logic has been moved to >>> NativeCallStack::NativeCallStack, which is where the tail call is >>> (sometimes) made, and also corrected (see nativeCallStack.cpp below). >>> >>> src/share/vm/utilities/nativeCallStack.cpp >>> >>> JDK-8133747 fixes: The frame skipping logic that was moved here assumed >>> that NativeCallStack::NativeCallStack would not appear in the call >>> stack >>> (due to a tail call be using to call os::get_native_stack) except in >>> slow debug builds. However, some platforms also don't use a tail call >>> even when optimized. From what I can tell that is the case for 32-bit >>> platforms and for windows. >>> >>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp >>> src/os_cpu/windows_x86/vm/os_windows_x86.cpp >>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp >>> >>> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to >>> skip >>> one extra frame >>> >>> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp >>> >>> JDK-8133749 fixes: os:current_frame() was not consistent with other >>> platforms and needs to skip one more frame. This means it returns the >>> frame for the caller's caller. So when called by os:get_native_stack(), >>> it returns the frame for whoever called os::get_native_stack(). >>> Although >>> not intuitive, this is what os:get_native_stack() expects. Probably a >>> method rename and/or a behavior change is justified here, but I would >>> prefer to do that with a followup CR if anyone has a good suggestion on >>> what to do. >>> >>> test/runtime/NMT/CheckForProperDetailStackTrace.java >>> >>> This is the new NTM detail test. It checks for frames that shouldn't be >>> present and validates at least one stack trace is what is expected. >>> >>> I verified that the above test now passes on all supported platforms, >>> and also did a full jprt "-testset hotpot" run. 
I plan on doing some >>> RBT >>> testing with NMT detail enabled before committing. >>> >>> Regarding the community contributed ports that Oracle does not support, >>> I didn't make any changes there, but it looks like some of these >>> bugs do >>> exist. Notably: >>> >>> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes >>> done >>> to the >>> os_linux_x86.cp should also be applied here. >>> -linux-ppc: Hard to say for sure since the implementation of >>> os::current_frame is >>> different than others, but it looks to me like it suffers from both >>> JDK-8133749 >>> and JDK-8133740. >>> -aix-ppc: Looks to be the same implementation as linux-ppc, so would >>> need the >>> same changes. >>> >>> These ports may also be suffering from JDK-8133747, but that fix is in >>> shared code (nativeCallStack.cpp). My changes there will need some >>> tweaking for these ports they don't use a tail call to call >>> os::get_native_stack(). >>> >>> If the maintainers of these ports could send me some NMT detail output, >>> I can advise better on what changes are needed. Then you can implement >>> and test them, and then send them back to me and I'll include them with >>> my changes. What I need is the following command run on product and >>> slowdebug builds. Initially run without any of my changes applied. 
If >>> needed I may followup with a request that they be run with the changes >>> applied: >>> >>> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail >>> -XX:+PrintNMTStatistics -version >>> >>> thanks, >>> >>> Chris >>> >> > From david.holmes at oracle.com Fri Aug 19 00:33:57 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 10:33:57 +1000 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> Message-ID: <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> Thanks Dan! Still need a second reviewer please. It's really simple :) David On 19/08/2016 12:04 AM, Daniel D. Daugherty wrote: > On 8/17/16 6:04 PM, David Holmes wrote: >> Hi Dan, >> >> Thanks for looking at this. > > No problem. I'm pretty sure I once chased the older version > of this bug... :-) > > >> >> On 17/08/2016 10:59 PM, Daniel D. Daugherty wrote: >>> On 8/17/16 1:45 AM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >>> >>> src/share/vm/runtime/mutex.cpp >>> If the problem is a racy clearing of one of the fields, then the >>> assert can still fire and the extra info might still show zeroes >>> since they are two different queries of the fields. >>> >>> For this diagnostic to be always accurate you need to save a copy >>> of each field, assert() that all the copies or'ed together are == 0, >>> and have the extra info printed from the copies. >> >> You are right of course. Don't know what I was thinking. :( >> >> http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ > > src/share/vm/runtime/mutex.cpp > No comments. > > Thumbs up! 
> > Dan > > > >> >> Thanks, >> David >> >>> Dan >>> >>> >>>> >>>> This is a rare assertion failure that has proven to unreproducible >>>> even by directly trying to exercise the theoretical race conditions >>>> mentioned in the bug report. All I can do for now is augment the >>>> assert to print out the various values so we can at least see where >>>> things are failing, next time it happens. >>>> >>>> Example output: >>>> >>>> # Internal Error >>>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>>> >>>> pid=21732, tid=21734 >>>> # >>>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>>> >>>> == 0) failed: >>>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>>> >>>> != 0 >>>> >>>> Thanks, >>>> David >>> > From daniel.daugherty at oracle.com Fri Aug 19 00:40:45 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 18 Aug 2016 18:40:45 -0600 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: <2b8eb97e-3e8d-31ac-7507-af7489855f43@oracle.com> References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <2b8eb97e-3e8d-31ac-7507-af7489855f43@oracle.com> Message-ID: This is very nicely done and a very good find! I think this will make AsyncGetCallTrace() on X86/X64 more stable. Dan On 8/16/16 10:00 AM, dean.long at oracle.com wrote: > Thanks Coleen. > > dl > > > On 8/16/16 7:52 AM, Coleen Phillimore wrote: >> >> I think this looks good. >> >> Coleen >> >> >> On 8/15/16 1:57 PM, dean.long at oracle.com wrote: >>> Thanks Fred. >>> >>> Still waiting for a Reviewer. >>> >>> dl >>> >>> >>> On 8/15/16 6:26 AM, Frederic Parain wrote: >>>> Thank you, >>>> >>>> Looks good to me. 
>>>> >>>> Fred >>>> >>>> On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: >>>>> Sure: >>>>> >>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ >>>>> >>>>> dl >>>>> >>>>> >>>>> On 8/12/16 6:46 AM, Frederic Parain wrote: >>>>>> Dean, >>>>>> >>>>>> In file macroAssembler_x86.cpp, could it be possible to >>>>>> get rid of the clear_pc argument? It seems completely >>>>>> useless now. >>>>>> >>>>>> Fred >>>>>> >>>>>> >>>>>> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>>>>>> Ping. >>>>>>> >>>>>>> dl >>>>>>> >>>>>>> >>>>>>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>>>>>> >>>>>>>> Sorry, this issue is Confidential. The problem is similar to >>>>>>>> 8029441, >>>>>>>> where we suspend a thread and use >>>>>>>> pd_get_top_frame_for_profiling() to >>>>>>>> get the top frame for stack walking. The problem is "last Java >>>>>>>> frame" >>>>>>>> anchor frames on x86. In lots of places we do not store >>>>>>>> last_Java_pc. >>>>>>>> This is OK in the synchronous stack walk case done by the current >>>>>>>> thread. But in the asynchronous case, there are small windows >>>>>>>> where >>>>>>>> it's not always safe to get PC from sp[-1]. >>>>>>>> >>>>>>>> The solution is not to treat x86 anchor frames as "always >>>>>>>> walkable". >>>>>>>> Instead, we follow the example of sparc and make them walking by >>>>>>>> filling in last_Java_pc when it's safe. >>>>>>>> >>>>>>>> I went for the minimal fix, resetting clear_pc to true in >>>>>>>> reset_last_Java_frame() but not changing the API and all the >>>>>>>> callers. >>>>>>>> I can fix this if reviewers feel strongly about it. >>>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>> >>>>> >>> >> > > From daniel.daugherty at oracle.com Fri Aug 19 00:44:03 2016 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 18 Aug 2016 18:44:03 -0600 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> Message-ID: <7d6e9399-829f-a435-a052-5dbca8b97b6e@oracle.com> I propose this falls under the HotSpot trivial change rule... :-) Dan On 8/18/16 6:33 PM, David Holmes wrote: > Thanks Dan! > > Still need a second reviewer please. It's really simple :) > > David > > On 19/08/2016 12:04 AM, Daniel D. Daugherty wrote: >> On 8/17/16 6:04 PM, David Holmes wrote: >>> Hi Dan, >>> >>> Thanks for looking at this. >> >> No problem. I'm pretty sure I once chased the older version >> of this bug... :-) >> >> >>> >>> On 17/08/2016 10:59 PM, Daniel D. Daugherty wrote: >>>> On 8/17/16 1:45 AM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >>>> >>>> src/share/vm/runtime/mutex.cpp >>>> If the problem is a racy clearing of one of the fields, then the >>>> assert can still fire and the extra info might still show zeroes >>>> since they are two different queries of the fields. >>>> >>>> For this diagnostic to be always accurate you need to save a copy >>>> of each field, assert() that all the copies or'ed together are >>>> == 0, >>>> and have the extra info printed from the copies. >>> >>> You are right of course. Don't know what I was thinking. :( >>> >>> http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ >> >> src/share/vm/runtime/mutex.cpp >> No comments. >> >> Thumbs up! 
>> >> Dan >> >> >> >>> >>> Thanks, >>> David >>> >>>> Dan >>>> >>>> >>>>> >>>>> This is a rare assertion failure that has proven to unreproducible >>>>> even by directly trying to exercise the theoretical race conditions >>>>> mentioned in the bug report. All I can do for now is augment the >>>>> assert to print out the various values so we can at least see where >>>>> things are failing, next time it happens. >>>>> >>>>> Example output: >>>>> >>>>> # Internal Error >>>>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>>>> >>>>> >>>>> pid=21732, tid=21734 >>>>> # >>>>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>>>> >>>>> >>>>> == 0) failed: >>>>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>>>> >>>>> >>>>> != 0 >>>>> >>>>> Thanks, >>>>> David >>>> >> From david.holmes at oracle.com Fri Aug 19 01:09:46 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 11:09:46 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> Message-ID: <890aae8f-0cd7-43d2-3e34-f6f0131a8ceb@oracle.com> On 19/08/2016 12:15 AM, Aleksey Shipilev wrote: > On 08/18/2016 05:50 AM, David Holmes wrote: >> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ > > Looks good to me. Minor nit: Thanks for the review Aleksey! > *) inline declarations have different indenting, is that our code style? No it's my dumb emacs that insists on tabbing to 4 instead of 2 when I'm not looking :) Indent fixed there and elsewhere. webrev updated in place for others. 
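For readers following the release_store / load_acquire pairing this webrev is about, here is a minimal sketch of the pattern. It uses std::atomic in place of HotSpot's OrderAccess primitives, which is a substitution made purely so the example stands alone; `ArrayKlassSketch`, its accessors and the stand-in `Klass` struct are hypothetical and only echo the naming style of the quoted accessors.

```cpp
#include <atomic>

// Hypothetical stand-in for the real Klass; it only carries one field so
// there is something for a reader to observe through the pointer.
struct Klass {
  int dimension;
};

class ArrayKlassSketch {
  // Field published by one thread and read lock-free by others.
  std::atomic<Klass*> _higher_dimension{nullptr};

 public:
  // Load with acquire semantics. Pairs with the release store below: a
  // reader that sees the non-null pointer also sees every write the
  // publishing thread made before storing it.
  Klass* higher_dimension_acquire() const {
    return _higher_dimension.load(std::memory_order_acquire);
  }

  // Store with release semantics. All initialization of *k done before
  // this call is ordered before the pointer becomes visible.
  void release_set_higher_dimension(Klass* k) {
    _higher_dimension.store(k, std::memory_order_release);
  }
};
```

A release store on its own only orders the writer's side; without the matching acquire load a reader could in principle see the pointer before the object's fields, which is the kind of gap the bug was opened to look for.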
Thanks, David > 370 Klass* array_klasses() const { return _array_klasses; } > 371 inline Klass* array_klasses_acquire() const; // load with > acquire semantics > 372 void set_array_klasses(Klass* k) { _array_klasses = k; } > 373 inline void release_set_array_klasses(Klass* k); // store with > release semantics > > Thanks, > -Aleksey > From david.holmes at oracle.com Fri Aug 19 01:34:19 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 11:34:19 +1000 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <7d6e9399-829f-a435-a052-5dbca8b97b6e@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> <7d6e9399-829f-a435-a052-5dbca8b97b6e@oracle.com> Message-ID: On 19/08/2016 10:44 AM, Daniel D. Daugherty wrote: > I propose this falls under the HotSpot trivial change rule... :-) I like that proposal! Done. :) Thanks, David > Dan > > On 8/18/16 6:33 PM, David Holmes wrote: >> Thanks Dan! >> >> Still need a second reviewer please. It's really simple :) >> >> David >> >> On 19/08/2016 12:04 AM, Daniel D. Daugherty wrote: >>> On 8/17/16 6:04 PM, David Holmes wrote: >>>> Hi Dan, >>>> >>>> Thanks for looking at this. >>> >>> No problem. I'm pretty sure I once chased the older version >>> of this bug... :-) >>> >>> >>>> >>>> On 17/08/2016 10:59 PM, Daniel D. 
Daugherty wrote: >>>>> On 8/17/16 1:45 AM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >>>>> >>>>> src/share/vm/runtime/mutex.cpp >>>>> If the problem is a racy clearing of one of the fields, then the >>>>> assert can still fire and the extra info might still show zeroes >>>>> since they are two different queries of the fields. >>>>> >>>>> For this diagnostic to be always accurate you need to save a copy >>>>> of each field, assert() that all the copies or'ed together are >>>>> == 0, >>>>> and have the extra info printed from the copies. >>>> >>>> You are right of course. Don't know what I was thinking. :( >>>> >>>> http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ >>> >>> src/share/vm/runtime/mutex.cpp >>> No comments. >>> >>> Thumbs up! >>> >>> Dan >>> >>> >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> This is a rare assertion failure that has proven to unreproducible >>>>>> even by directly trying to exercise the theoretical race conditions >>>>>> mentioned in the bug report. All I can do for now is augment the >>>>>> assert to print out the various values so we can at least see where >>>>>> things are failing, next time it happens. >>>>>> >>>>>> Example output: >>>>>> >>>>>> # Internal Error >>>>>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>>>>> >>>>>> >>>>>> pid=21732, tid=21734 >>>>>> # >>>>>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>>>>> >>>>>> >>>>>> == 0) failed: >>>>>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>>>>> >>>>>> >>>>>> != 0 >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>> > From daniel.daugherty at oracle.com Fri Aug 19 02:02:11 2016 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 18 Aug 2016 20:02:11 -0600 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> Message-ID: <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> On 8/17/16 8:50 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 > > webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ src/share/vm/classfile/classLoader.hpp No comments. src/share/vm/classfile/verifier.cpp No comments. src/share/vm/oops/arrayKlass.hpp No comments. src/share/vm/oops/instanceKlass.cpp No comments. src/share/vm/oops/instanceKlass.hpp No comments. src/share/vm/oops/instanceKlass.inline.hpp No comments. src/share/vm/oops/objArrayKlass.cpp No comments. src/share/vm/oops/typeArrayKlass.cpp No comments. src/share/vm/runtime/vmStructs.cpp No comments. Thumbs up. Dan > > > Generally speaking release_store should be paired with load_acquire to > ensure correct memory visibility and ordering in lock-free code (often > the read path is what is lock-free). So based on some observations > from earlier bug fixes this bug was intended to examine the use of > release_store and see if we have the appropriate load_acquire as well. > The bug report lists all of the cases that were examined - some clear > cut correct, some complex correct, some fixed here and some split out > into separate issues. > > Here's a summary of the actual changes in the webrev: > > src/share/vm/classfile/classLoader.hpp > > - next() accessor needs to use load_acquire. 
> > --- > > src/share/vm/classfile/verifier.cpp > > - load of _verify_byte_codes_fn needs to load_acquire to pair with use > of release_store > - release_store of _is_new_verify_byte_codes_fn is not needed > > --- > > src/share/vm/oops/arrayKlass.hpp > src/share/vm/oops/instanceKlass.cpp > src/share/vm/oops/instanceKlass.hpp > src/share/vm/oops/instanceKlass.inline.hpp > src/share/vm/oops/objArrayKlass.cpp > src/share/vm/oops/typeArrayKlass.cpp > > The logic for storing dimensions values was using a storeStore barrier > between the lower and higher dimensions. This is converted to use a > release-store setter for higher-dimension, with paired load-acquire > accessor. Plus the accessed fields are declared volatile. > > The methods_jmethod_ids_acquire() and its paired > release_set_methods_jmethod_ids(), are moved to the .inline.hpp file > where they belong. > > --- > > src/share/vm/runtime/vmStructs.cpp > > Updated declaration for _array_klasses now it is volatile. > > --- > > Thanks, > David From david.holmes at oracle.com Fri Aug 19 02:05:30 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 12:05:30 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> Message-ID: Thanks Dan! David On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: > On 8/17/16 8:50 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ > > src/share/vm/classfile/classLoader.hpp > No comments. > > src/share/vm/classfile/verifier.cpp > No comments. > > src/share/vm/oops/arrayKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.cpp > No comments. > > src/share/vm/oops/instanceKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.inline.hpp > No comments. 
> > src/share/vm/oops/objArrayKlass.cpp > No comments. > > src/share/vm/oops/typeArrayKlass.cpp > No comments. > > src/share/vm/runtime/vmStructs.cpp > No comments. > > Thumbs up. > > Dan > > >> >> >> Generally speaking release_store should be paired with load_acquire to >> ensure correct memory visibility and ordering in lock-free code (often >> the read path is what is lock-free). So based on some observations >> from earlier bug fixes this bug was intended to examine the use of >> release_store and see if we have the appropriate load_acquire as well. >> The bug report lists all of the cases that were examined - some clear >> cut correct, some complex correct, some fixed here and some split out >> into separate issues. >> >> Here's a summary of the actual changes in the webrev: >> >> src/share/vm/classfile/classLoader.hpp >> >> - next() accessor needs to use load_acquire. >> >> --- >> >> src/share/vm/classfile/verifier.cpp >> >> - load of _verify_byte_codes_fn needs to load_acquire to pair with use >> of release_store >> - release_store of _is_new_verify_byte_codes_fn is not needed >> >> --- >> >> src/share/vm/oops/arrayKlass.hpp >> src/share/vm/oops/instanceKlass.cpp >> src/share/vm/oops/instanceKlass.hpp >> src/share/vm/oops/instanceKlass.inline.hpp >> src/share/vm/oops/objArrayKlass.cpp >> src/share/vm/oops/typeArrayKlass.cpp >> >> The logic for storing dimensions values was using a storeStore barrier >> between the lower and higher dimensions. This is converted to use a >> release-store setter for higher-dimension, with paired load-acquire >> accessor. Plus the accessed fields are declared volatile. >> >> The methods_jmethod_ids_acquire() and its paired >> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >> where they belong. >> >> --- >> >> src/share/vm/runtime/vmStructs.cpp >> >> Updated declaration for _array_klasses now it is volatile. 
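The release/acquire pairing summarised above can be illustrated outside HotSpot with standard C++11 atomics. This is only a sketch under stated assumptions: `ToyInstanceKlass`, the toy `Klass`, `publish` and `read_dimension` are hypothetical, and `std::atomic` stands in for HotSpot's own release_store/load_acquire primitives. Only the accessor names `array_klasses_acquire` and `release_set_array_klasses` are taken from the thread.

```cpp
#include <atomic>
#include <cassert>

// Toy stand-in for a HotSpot Klass; it only carries a dimension for the demo.
struct Klass {
  int dimension;
};

// Hypothetical holder class, not the real InstanceKlass.
class ToyInstanceKlass {
  std::atomic<Klass*> _array_klasses{nullptr};

 public:
  // Load with acquire semantics: pairs with release_set_array_klasses().
  Klass* array_klasses_acquire() const {
    return _array_klasses.load(std::memory_order_acquire);
  }

  // Store with release semantics: everything written to *k before this call
  // is visible to a thread that later observes k via array_klasses_acquire().
  void release_set_array_klasses(Klass* k) {
    _array_klasses.store(k, std::memory_order_release);
  }
};

// Writer path: initialise the object fully, then publish the pointer.
inline void publish(ToyInstanceKlass& holder, Klass* k, int dim) {
  k->dimension = dim;
  holder.release_set_array_klasses(k);
}

// Lock-free reader path: returns -1 if nothing has been published yet.
inline int read_dimension(const ToyInstanceKlass& holder) {
  Klass* k = holder.array_klasses_acquire();
  return k == nullptr ? -1 : k->dimension;
}
```

In real use the writer and the reader would run on different threads; the point of the pairing is that a reader which sees the non-null pointer is also guaranteed to see the initialised `dimension`.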
>> >> --- >> >> Thanks, >> David > From gerald.thornbrugh at oracle.com Fri Aug 19 02:13:05 2016 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Thu, 18 Aug 2016 20:13:05 -0600 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> Message-ID: <57B66B31.7060309@oracle.com> Hi David, Your changes looks good. Jerry > Thanks Dan! > > Still need a second reviewer please. It's really simple :) > > David > > On 19/08/2016 12:04 AM, Daniel D. Daugherty wrote: >> On 8/17/16 6:04 PM, David Holmes wrote: >>> Hi Dan, >>> >>> Thanks for looking at this. >> >> No problem. I'm pretty sure I once chased the older version >> of this bug... :-) >> >> >>> >>> On 17/08/2016 10:59 PM, Daniel D. Daugherty wrote: >>>> On 8/17/16 1:45 AM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >>>> >>>> src/share/vm/runtime/mutex.cpp >>>> If the problem is a racy clearing of one of the fields, then the >>>> assert can still fire and the extra info might still show zeroes >>>> since they are two different queries of the fields. >>>> >>>> For this diagnostic to be always accurate you need to save a copy >>>> of each field, assert() that all the copies or'ed together are >>>> == 0, >>>> and have the extra info printed from the copies. >>> >>> You are right of course. Don't know what I was thinking. :( >>> >>> http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ >> >> src/share/vm/runtime/mutex.cpp >> No comments. >> >> Thumbs up! 
>> >> Dan >> >> >> >>> >>> Thanks, >>> David >>> >>>> Dan >>>> >>>> >>>>> >>>>> This is a rare assertion failure that has proven to unreproducible >>>>> even by directly trying to exercise the theoretical race conditions >>>>> mentioned in the bug report. All I can do for now is augment the >>>>> assert to print out the various values so we can at least see where >>>>> things are failing, next time it happens. >>>>> >>>>> Example output: >>>>> >>>>> # Internal Error >>>>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>>>> >>>>> >>>>> pid=21732, tid=21734 >>>>> # >>>>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>>>> >>>>> >>>>> == 0) failed: >>>>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>>>> >>>>> >>>>> != 0 >>>>> >>>>> Thanks, >>>>> David >>>> >> From david.holmes at oracle.com Fri Aug 19 03:08:51 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 13:08:51 +1000 Subject: (XS) RFR: 8152849: share/vm/runtime/mutex.cpp:1161 assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) == 0) failed In-Reply-To: <57B66B31.7060309@oracle.com> References: <89c00180-0990-7a09-dd82-9ae69810c636@oracle.com> <425f88af-b248-e786-ac60-8f9c9d7d1a13@oracle.com> <827caf69-ded4-34a4-5458-e331dbcbae38@oracle.com> <57B66B31.7060309@oracle.com> Message-ID: <8efa2ae7-3168-f54c-e549-4fafbc9e377a@oracle.com> Thanks Jerry - afraid I already pushed however :) David On 19/08/2016 12:13 PM, Gerald Thornbrugh wrote: > Hi David, > > Your changes looks good. > > Jerry >> Thanks Dan! >> >> Still need a second reviewer please. It's really simple :) >> >> David >> >> On 19/08/2016 12:04 AM, Daniel D. 
Daugherty wrote: >>> On 8/17/16 6:04 PM, David Holmes wrote: >>>> Hi Dan, >>>> >>>> Thanks for looking at this. >>> >>> No problem. I'm pretty sure I once chased the older version >>> of this bug... :-) >>> >>> >>>> >>>> On 17/08/2016 10:59 PM, Daniel D. Daugherty wrote: >>>>> On 8/17/16 1:45 AM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152849 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8152849/webrev/ >>>>> >>>>> src/share/vm/runtime/mutex.cpp >>>>> If the problem is a racy clearing of one of the fields, then the >>>>> assert can still fire and the extra info might still show zeroes >>>>> since they are two different queries of the fields. >>>>> >>>>> For this diagnostic to be always accurate you need to save a copy >>>>> of each field, assert() that all the copies or'ed together are >>>>> == 0, >>>>> and have the extra info printed from the copies. >>>> >>>> You are right of course. Don't know what I was thinking. :( >>>> >>>> http://cr.openjdk.java.net/~dholmes/8152849/webrev.v2/ >>> >>> src/share/vm/runtime/mutex.cpp >>> No comments. >>> >>> Thumbs up! >>> >>> Dan >>> >>> >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> This is a rare assertion failure that has proven to unreproducible >>>>>> even by directly trying to exercise the theoretical race conditions >>>>>> mentioned in the bug report. All I can do for now is augment the >>>>>> assert to print out the various values so we can at least see where >>>>>> things are failing, next time it happens. 
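The snapshot approach Dan asks for in his review (copy each field once, test the copies, print the copies) might look roughly like the following. It is a sketch, not the code in the webrev: `ToyMonitor` and `check_all_clear` are hypothetical names, the fields are simplified to plain `uintptr_t` values, and the message layout merely imitates the one quoted in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the monitor fields named in the assert.
struct ToyMonitor {
  volatile uintptr_t _owner;
  volatile uintptr_t _LockWord;
  volatile uintptr_t _EntryList;
  volatile uintptr_t _WaitSet;
  volatile uintptr_t _OnDeck;
};

// Reads every field exactly once, then both tests and formats the copies,
// so the printed values cannot disagree with the condition that was checked.
// Returns an empty string when all fields are zero.
inline std::string check_all_clear(const ToyMonitor& m) {
  const uintptr_t owner     = m._owner;
  const uintptr_t lockword  = m._LockWord;
  const uintptr_t entrylist = m._EntryList;
  const uintptr_t waitset   = m._WaitSet;
  const uintptr_t ondeck    = m._OnDeck;

  if ((owner | lockword | entrylist | waitset | ondeck) == 0) {
    return std::string();
  }

  char buf[256];
  std::snprintf(buf, sizeof(buf),
                "_owner(0x%016llx)|_LockWord(0x%016llx)|_EntryList(0x%016llx)"
                "|_WaitSet(0x%016llx)|_OnDeck(0x%016llx) != 0",
                (unsigned long long)owner, (unsigned long long)lockword,
                (unsigned long long)entrylist, (unsigned long long)waitset,
                (unsigned long long)ondeck);
  return std::string(buf);
}
```

If the fields were re-read for the message instead, a racy clear between the test and the print could yield a failure report showing all zeroes, which is exactly the problem raised in the review.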
>>>>>> >>>>>> Example output: >>>>>> >>>>>> # Internal Error >>>>>> (/scratch/dh198349/jdk9-hs/hotspot/src/share/vm/runtime/mutex.cpp:1157), >>>>>> >>>>>> >>>>>> pid=21732, tid=21734 >>>>>> # >>>>>> assert(((uintptr_t(_owner))|(uintptr_t(_LockWord.FullWord))|(uintptr_t(_EntryList))|(uintptr_t(_WaitSet))|(uintptr_t(_OnDeck))) >>>>>> >>>>>> >>>>>> == 0) failed: >>>>>> _owner(0x0000000000000000)|_LockWord(0x0000000000000000)|_EntryList(0x0000000000000000)|_WaitSet(0x0000000000000000)|_OnDeck(0x00000000deaddead) >>>>>> >>>>>> >>>>>> != 0 >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>> > From dean.long at oracle.com Fri Aug 19 04:29:03 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 18 Aug 2016 21:29:03 -0700 Subject: RFR(S) 8161598,,Kitchensink fails: assert(nm->insts_contains(original_pc)) failed: original PC must be in nmethod/CompiledMethod In-Reply-To: References: <290ba061-6716-9496-1a97-8ba41f35dda3@oracle.com> <5891da53-22ff-953b-f6b8-a979a7338c24@oracle.com> <2b8eb97e-3e8d-31ac-7507-af7489855f43@oracle.com> Message-ID: <235d9c37-1e4e-5e41-7fe2-709b5405cfdf@oracle.com> Thanks Dan. dl On 8/18/16 5:40 PM, Daniel D. Daugherty wrote: > This is very nicely done and a very good find! > I think this will make AsyncGetCallTrace() on X86/X64 > more stable. > > Dan > > On 8/16/16 10:00 AM, dean.long at oracle.com wrote: >> Thanks Coleen. >> >> dl >> >> >> On 8/16/16 7:52 AM, Coleen Phillimore wrote: >>> >>> I think this looks good. >>> >>> Coleen >>> >>> >>> On 8/15/16 1:57 PM, dean.long at oracle.com wrote: >>>> Thanks Fred. >>>> >>>> Still waiting for a Reviewer. >>>> >>>> dl >>>> >>>> >>>> On 8/15/16 6:26 AM, Frederic Parain wrote: >>>>> Thank you, >>>>> >>>>> Looks good to me. 
>>>>> >>>>> Fred >>>>> >>>>> On 08/12/2016 04:19 PM, dean.long at oracle.com wrote: >>>>>> Sure: >>>>>> >>>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev.1/ >>>>>> >>>>>> dl >>>>>> >>>>>> >>>>>> On 8/12/16 6:46 AM, Frederic Parain wrote: >>>>>>> Dean, >>>>>>> >>>>>>> In file macroAssembler_x86.cpp, could it be possible to >>>>>>> get rid of the clear_pc argument? It seems completely >>>>>>> useless now. >>>>>>> >>>>>>> Fred >>>>>>> >>>>>>> >>>>>>> On 08/09/2016 01:39 PM, dean.long at oracle.com wrote: >>>>>>>> Ping. >>>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>>> >>>>>>>> On 8/4/16 3:28 PM, dean.long at oracle.com wrote: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161598 >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dlong/8161598/webrev/ >>>>>>>>> >>>>>>>>> Sorry, this issue is Confidential. The problem is similar to >>>>>>>>> 8029441, >>>>>>>>> where we suspend a thread and use >>>>>>>>> pd_get_top_frame_for_profiling() to >>>>>>>>> get the top frame for stack walking. The problem is "last >>>>>>>>> Java frame" >>>>>>>>> anchor frames on x86. In lots of places we do not store >>>>>>>>> last_Java_pc. >>>>>>>>> This is OK in the synchronous stack walk case done by the current >>>>>>>>> thread. But in the asynchronous case, there are small windows >>>>>>>>> where >>>>>>>>> it's not always safe to get PC from sp[-1]. >>>>>>>>> >>>>>>>>> The solution is not to treat x86 anchor frames as "always >>>>>>>>> walkable". >>>>>>>>> Instead, we follow the example of sparc and make them walking by >>>>>>>>> filling in last_Java_pc when it's safe. >>>>>>>>> >>>>>>>>> I went for the minimal fix, resetting clear_pc to true in >>>>>>>>> reset_last_Java_frame() but not changing the API and all the >>>>>>>>> callers. >>>>>>>>> I can fix this if reviewers feel strongly about it. 
>>>>>>>>> >>>>>>>>> dl >>>>>>>>> >>>>>>>> >>>>>> >>>> >>> >> >> > From vladimir.x.ivanov at oracle.com Fri Aug 19 08:49:35 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 19 Aug 2016 11:49:35 +0300 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Message-ID: <98703d95-2872-c43e-7555-7c461deeecad@oracle.com> Looks good! Best regards, Vladimir Ivanov On 8/17/16 6:56 PM, Rachel Protacio wrote: > Hi David and Coleen, > > Thank you for the reviews. I've updated the change as requested: > http://cr.openjdk.java.net/~rprotacio/8163973.01/ > > Rachel > > On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >> >> >> On 8/15/16 11:36 PM, David Holmes wrote: >>> Hi Rachel, >>> >>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>> Hello, >>>> >>>> Please review this change, which makes sure class file load hooks are >>>> not called for VM anonymous classes. See >>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>> >>>> for justification. >>>> >>>> Passes JPRT and RBT. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>> >>> This: >>> >>> 112 // VM Anonymous classes - defined via >>> unsafe.DefineAnonymousClass - should not >>> 113 // call back to a CFLH >>> 114 if (host_klass == NULL) { >>> 115 stream = prologue(stream, >>> >>> suggests that "prologue" can only do CFLH related things. If that is >>> true then it would be much clearer in my opinion if prologue were >>> renamed to something more explicit - like check_class_file_load_hook >>> ? Otherwise, the host_klass should be passed in to prologue and the >>> anonymous class check internalized there. >> >> I agree with David here. This was sort of bothering me about your >> change when we talked about it before. 
If the prologue did more than >> call the CFLH then you'd have to pass host_klass down, since we know >> that's all it does, the name of the function should be changed. >> David's name looks good to me. >>> >>> Also I don't think you need to explain where VM anonymous classes >>> come from, it suffices to simply say "Skip class file load hook >>> processing for VM anonymous classes"; or if the prologue is renamed >>> then simply "Skip this processing for VM anonymous classes". :) >> >> Thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>>> Thank you! >>>> Rachel >> > From aph at redhat.com Fri Aug 19 08:53:14 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 19 Aug 2016 09:53:14 +0100 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> References: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> Message-ID: <6bfa841a-7fe7-a605-b91e-aed3e7c3a9d6@redhat.com> On 18/08/16 20:42, Chris Plummer wrote: >> AFAIK, gcc doesn't inline function if it has inline assembly. > That wasn't my observation. It isn't true. Inline asm functions are a very important feature of GCC. Andrew. From dmitry.samersoff at oracle.com Fri Aug 19 10:53:24 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 19 Aug 2016 13:53:24 +0300 Subject: RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup In-Reply-To: <6bfa841a-7fe7-a605-b91e-aed3e7c3a9d6@redhat.com> References: <1a4ef573-3924-3d2b-73c5-4829b7e32b9c@oracle.com> <6bfa841a-7fe7-a605-b91e-aed3e7c3a9d6@redhat.com> Message-ID: <0aafa593-5f13-b60c-d8f4-4699f92c9faf@oracle.com> Andrew, > It isn't true. Inline asm functions are a very important feature of > GCC. I was wrong (verified). Sorry! -Dmitry On 2016-08-19 11:53, Andrew Haley wrote: > On 18/08/16 20:42, Chris Plummer wrote: >>> AFAIK, gcc doesn't inline function if it has inline assembly. >> That wasn't my observation. > > It isn't true. 
Inline asm functions are a very important feature of GCC. > > Andrew. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.simms at oracle.com Fri Aug 19 11:20:02 2016 From: david.simms at oracle.com (David Simms) Date: Fri, 19 Aug 2016 13:20:02 +0200 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame Message-ID: Greetings, JDK-8043224 Added warnings when using -Xcheck:jni when native code using the JNI API fails to check for exceptions. Returning to a Java frame should implicitly clear the need for checking for exceptions. Current JVM code does not, leading to a fair amount of false warnings from JDK core libraries. Bug: https://bugs.openjdk.java.net/browse/JDK-8164086 Webrev: http://cr.openjdk.java.net/~dsimms/8164086/webrev0/ Testing: added a new jtreg test including some JNI use cases which should and should not produce warnings, passes on all platforms. /David Simms From david.holmes at oracle.com Fri Aug 19 12:29:12 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Aug 2016 22:29:12 +1000 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: References: Message-ID: Hi David, On 19/08/2016 9:20 PM, David Simms wrote: > Greetings, > > JDK-8043224 Added warnings when using -Xcheck:jni when native code using > the JNI API fails to check for exceptions. Returning to a Java frame > should implicitly clear the need for checking for exceptions. Current > JVM code does not, leading to a fair amount of false warnings from JDK > core libraries. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8164086 > > Webrev: http://cr.openjdk.java.net/~dsimms/8164086/webrev0/ The changes in the native wrapper seem okay though I'm not an expert on the machine specific encodings. 
I'm a little surprised there are not more things that need changing though. Does the JIT use those wrappers too? Can we transition from Java to VM to native and then back - and if so might we need to clear the pending exception check? (I'm not sure if from in the VM a native call could actually be a JNI call, or will only be a direct native call?). Did you intend to leave in the changes to jdk/src/java.base/share/native/libjli/java.c? It looks like debug/test code to me. The test I'm finding a bit hard to follow but don't you need to check for pending exceptions here: 29 static jmethodID get_method_id(JNIEnv *env, jclass clz, jstring jname, jstring jsig) { 30 jmethodID mid; 31 const char *name, *sig; 32 name = (*env)->GetStringUTFChars(env, jname, NULL); 33 sig = (*env)->GetStringUTFChars(env, jsig, NULL); 34 mid = (*env)->GetMethodID(env, clz, name, sig); to avoid triggering the warning? Thanks, David H. (PS. calling it a night) > Testing: added a new jtreg test including some JNI use cases which > should and should not produce warnings, passes on all platforms. > > /David Simms > From rachel.protacio at oracle.com Fri Aug 19 13:57:02 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 19 Aug 2016 09:57:02 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <98703d95-2872-c43e-7555-7c461deeecad@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> <98703d95-2872-c43e-7555-7c461deeecad@oracle.com> Message-ID: <7e2ddc3c-f575-ff4f-e957-2ee2fc26b899@oracle.com> Thank you, Vladimir! Rachel On 8/19/2016 4:49 AM, Vladimir Ivanov wrote: > Looks good! > > Best regards, > Vladimir Ivanov > > On 8/17/16 6:56 PM, Rachel Protacio wrote: >> Hi David and Coleen, >> >> Thank you for the reviews. 
I've updated the change as requested: >> http://cr.openjdk.java.net/~rprotacio/8163973.01/ >> >> Rachel >> >> On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >>> >>> >>> On 8/15/16 11:36 PM, David Holmes wrote: >>>> Hi Rachel, >>>> >>>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>>> Hello, >>>>> >>>>> Please review this change, which makes sure class file load hooks are >>>>> not called for VM anonymous classes. See >>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>>> >>>>> >>>>> for justification. >>>>> >>>>> Passes JPRT and RBT. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>>> >>>> This: >>>> >>>> 112 // VM Anonymous classes - defined via >>>> unsafe.DefineAnonymousClass - should not >>>> 113 // call back to a CFLH >>>> 114 if (host_klass == NULL) { >>>> 115 stream = prologue(stream, >>>> >>>> suggests that "prologue" can only do CFLH related things. If that is >>>> true then it would be much clearer in my opinion if prologue were >>>> renamed to something more explicit - like check_class_file_load_hook >>>> ? Otherwise, the host_klass should be passed in to prologue and the >>>> anonymous class check internalized there. >>> >>> I agree with David here. This was sort of bothering me about your >>> change when we talked about it before. If the prologue did more than >>> call the CFLH then you'd have to pass host_klass down, since we know >>> that's all it does, the name of the function should be changed. >>> David's name looks good to me. >>>> >>>> Also I don't think you need to explain where VM anonymous classes >>>> come from, it suffices to simply say "Skip class file load hook >>>> processing for VM anonymous classes"; or if the prologue is renamed >>>> then simply "Skip this processing for VM anonymous classes". :) >>> >>> Thanks, >>> Coleen >>>> >>>> Thanks, >>>> David >>>> >>>>> Thank you! 
>>>>> Rachel >>> >> From karen.kinnear at oracle.com Fri Aug 19 15:56:15 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 19 Aug 2016 11:56:15 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> Message-ID: <80E4F62B-C946-4EEB-A67C-1A7E7E0A62CA@oracle.com> Rachel - Looks good. Thank you for changing the name. David - I had asked Rachel to explain where VM anonymous classes came from since I do not see the term "VM anonymous" class anywhere. No problem taking that comment out. I'll ask Harold to add it to the Unsafe.DefineAnonymousClass code as part of a separate bug fix. thanks, Karen > On Aug 17, 2016, at 11:56 AM, Rachel Protacio wrote: > > Hi David and Coleen, > > Thank you for the reviews. I've updated the change as requested: http://cr.openjdk.java.net/~rprotacio/8163973.01/ > > Rachel > > On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >> >> >> On 8/15/16 11:36 PM, David Holmes wrote: >>> Hi Rachel, >>> >>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>> Hello, >>>> >>>> Please review this change, which makes sure class file load hooks are >>>> not called for VM anonymous classes. See >>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>> for justification. >>>> >>>> Passes JPRT and RBT. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>> >>> This: >>> >>> 112 // VM Anonymous classes - defined via unsafe.DefineAnonymousClass - should not >>> 113 // call back to a CFLH >>> 114 if (host_klass == NULL) { >>> 115 stream = prologue(stream, >>> >>> suggests that "prologue" can only do CFLH related things. If that is true then it would be much clearer in my opinion if prologue were renamed to something more explicit - like check_class_file_load_hook ?
Otherwise, the host_klass should be passed in to prologue and the anonymous class check internalized there. >> >> I agree with David here. This was sort of bothering me about your change when we talked about it before. If the prologue did more than call the CFLH then you'd have to pass host_klass down, since we know that's all it does, the name of the function should be changed. David's name looks good to me. >>> >>> Also I don't think you need to explain where VM anonymous classes come from, it suffices to simply say "Skip class file load hook processing for VM anonymous classes"; or if the prologue is renamed then simply "Skip this processing for VM anonymous classes". :) >> >> Thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>>> Thank you! >>>> Rachel >> > From karen.kinnear at oracle.com Fri Aug 19 16:04:26 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 19 Aug 2016 12:04:26 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: <57B300EE.3060607@oracle.com> References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> <57B300EE.3060607@oracle.com> Message-ID: Thank you Lois for the quick review and suggestions. thanks, Karen > On Aug 16, 2016, at 8:02 AM, Lois Foltan wrote: > > Looks good Karen. Thank you for the mixed class file version test! > Lois > > On 8/12/2016 3:28 PM, Karen Kinnear wrote: >> Added a targeted test case for class files with different class file versions in the >> inheritance hierarchy. >> >> http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/ >> >> thanks, >> Karen >> >>> On Aug 12, 2016, at 1:46 PM, Coleen Phillimore wrote: >>> >>> >>> >>> On 8/12/16 1:33 PM, Karen Kinnear wrote: >>>> Coleen, >>>> >>>> Good catch - I will make that change. 
>>>> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >>>> and preparing for potential other uses. >>>> >>>> >>>> Here is the updated lines: >>>> KlassHandle vtklass_h = vt->klass(); >>>> Klass* vtklass = vtklass_h(); >>>> if (vtklass->is_instance_klass() && >>>> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >>>> assert(method() != NULL, "must have set method"); >>>> } >>>> >>> This looks good. >>> Thanks, >>> Coleen >>> >>>> Thanks! >>>> Karen >>>> >>>>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore > wrote: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>>>> >>>>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>>>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>>>> + KlassHandle vtklass_h = vt->klass(); >>>>> + Klass* vtklass = vtklass_h(); >>>>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>>>> assert(method() != NULL, "must have set method"); >>>>> + } >>>>> >>>>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>>>> >>>>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>>>> >>>>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. >>>>> >>>>> Coleen >>>>> >>>>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>>>> Please review: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>>>> >>>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>>>> >>>>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>>>> for later classfile versions. 
>>>>>> >>>>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice. >>>>>> >>>>>> Testing: linux-x64-slowdebug >>>>>> rbt hs-nightly-runtime.js >>>>>> jck vm,lang, api.java.lang >>>>>> small invocation tests >>>>>> >>>>>> thanks, >>>>>> Karen > From karen.kinnear at oracle.com Fri Aug 19 16:07:32 2016 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 19 Aug 2016 12:07:32 -0400 Subject: RFR: JDK-8163808 fix vtable assertion and logging for older classfiles In-Reply-To: References: <6BDB7CCF-7BAA-4CF5-A398-9EAEAE245C6C@oracle.com> <987cd1ba-f9f9-7bc5-1e5c-51099d1ff263@oracle.com> <1FBD1B76-5629-4090-8674-0432BD4B57A2@oracle.com> Message-ID: <53412AFA-61E1-4A25-8B5A-1C9E6765E9E9@oracle.com> Good timing. Fixed the copyright. And you are right that I did not need the compiler directive - with jigsaw we don't need the -Dignore.symbol.file anymore because we handle this with module boundaries. jtreg added an -XaddExports:java.base/jdk.internal.org.objectweb.asm=ALL_UNNAMED for me. thanks, Karen > On Aug 16, 2016, at 10:41 AM, Coleen Phillimore wrote: > > > http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/test/runtime/TransitiveOverrideCFV50/TransitiveOverrideCFV50.java.html > > One tiny thing in case you haven't pushed it yet. The copyright dates should just say 2016, since it's a new test. > > Also, I'm not sure if you need @compile directive since I think jtreg adds -Dignore.symbol.file for you. > > Thanks, > Coleen > > On 8/12/16 3:28 PM, Karen Kinnear wrote: >> Added a targeted test case for class files with different class file versions in the >> inheritance hierarchy.
>> >> http://cr.openjdk.java.net/~acorn/8163808.hs.1/webrev/ >> >> thanks, >> Karen >> >>> On Aug 12, 2016, at 1:46 PM, Coleen Phillimore > wrote: >>> >>> >>> >>> On 8/12/16 1:33 PM, Karen Kinnear wrote: >>>> Coleen, >>>> >>>> Good catch - I will make that change. >>>> Today this code is not called for arrays, but I totally appreciate you looking at the bigger picture >>>> and preparing for potential other uses. >>>> >>>> >>>> Here is the updated lines: >>>> KlassHandle vtklass_h = vt->klass(); >>>> Klass* vtklass = vtklass_h(); >>>> if (vtklass->is_instance_klass() && >>>> (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION)) { >>>> assert(method() != NULL, "must have set method"); >>>> } >>>> >>> This looks good. >>> Thanks, >>> Coleen >>> >>>> Thanks! >>>> Karen >>>> >>>>> On Aug 12, 2016, at 10:47 AM, Coleen Phillimore > wrote: >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev/src/share/vm/oops/klassVtable.cpp.udiff.html >>>>> >>>>> void vtableEntry::verify(klassVtable* vt, outputStream* st) { >>>>> NOT_PRODUCT(FlagSetting fs(IgnoreLockingAssertions, true)); >>>>> + KlassHandle vtklass_h = vt->klass(); >>>>> + Klass* vtklass = vtklass_h(); >>>>> + if (InstanceKlass::cast(vtklass)->major_version() >= klassVtable::VTABLE_TRANSITIVE_OVERRIDE_VERSION) { >>>>> assert(method() != NULL, "must have set method"); >>>>> + } >>>>> >>>>> I might be wrong but the vtable->klass() can be an ArrayKlass, so I think you have to do: >>>>> >>>>> if (vtklass->oop_is_instance() && InstanceKlass::cast(vtklass) ...) >>>>> >>>>> InstanceKlass::cast makes this assertion. Otherwise, the code looks good. 
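Coleen's point, that the klass behind a vtable may be an array klass and so must be tested before the asserting cast, can be sketched with a toy hierarchy. Everything here is hypothetical (`ToyKlass`, `ToyArrayKlass`, `ToyInstanceKlass`, `must_have_method`), and the threshold of 51 is an assumption taken from the "before class file version 51" wording in the bug description rather than from the HotSpot constant itself.

```cpp
#include <cassert>

// Toy hierarchy; the names echo the thread but these are not HotSpot types.
struct ToyKlass {
  virtual ~ToyKlass() {}
  virtual bool is_instance_klass() const { return false; }
};

struct ToyArrayKlass : ToyKlass {};

struct ToyInstanceKlass : ToyKlass {
  int _major_version;
  explicit ToyInstanceKlass(int v) : _major_version(v) {}
  bool is_instance_klass() const override { return true; }
  int major_version() const { return _major_version; }

  // Like the cast discussed above, this asserts when handed a non-instance klass.
  static const ToyInstanceKlass* cast(const ToyKlass* k) {
    assert(k->is_instance_klass() && "cast to ToyInstanceKlass");
    return static_cast<const ToyInstanceKlass*>(k);
  }
};

// Assumed threshold for this sketch only.
const int kTransitiveOverrideVersion = 51;

// True only when a vtable entry of this klass is required to have a method.
// The short-circuit && keeps the asserting cast away from array klasses.
inline bool must_have_method(const ToyKlass* k) {
  return k->is_instance_klass() &&
         ToyInstanceKlass::cast(k)->major_version() >= kTransitiveOverrideVersion;
}
```

Without the `is_instance_klass()` test, passing the array klass would trip the assertion inside `cast` in a debug build.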
>>>>> >>>>> Coleen >>>>> >>>>> On 8/11/16 5:07 PM, Karen Kinnear wrote: >>>>>> Please review: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8163808 >>>>>> >>>>>> http://cr.openjdk.java.net/~acorn/8163808.hs/webrev >>>>>> >>>>>> Bug: For classfiles before class file version 51, JVMS did not support transitive over-ride behavior. >>>>>> Implementation needed to check this in three places, not just one. Vtable size calculation is only exact >>>>>> for later classfile versions. >>>>>> >>>>>> Also fixed vtable logging output - since the method name-and-sig printing was changed to also print >>>>>> the holder's class name, we do not need to print the holder's class name separately - it was printing twice. >>>>>> >>>>>> Testing: linux-x64-slowdebug >>>>>> rbt hs-nightly-runtime.js >>>>>> jck vm,lang, api.java.lang >>>>>> small invocation tests >>>>>> >>>>>> thanks, >>>>>> Karen >>>>> >>>> >>> >> > From rachel.protacio at oracle.com Fri Aug 19 16:09:36 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 19 Aug 2016 12:09:36 -0400 Subject: RFR(XS): 8163973: VM Anonymous classes should not call Class File Load Hooks In-Reply-To: <80E4F62B-C946-4EEB-A67C-1A7E7E0A62CA@oracle.com> References: <2bd17dcc-ca9f-0ff3-1c9f-21d962df0b4a@oracle.com> <3374d2e1-0742-1f31-2662-146b9c956cba@oracle.com> <80E4F62B-C946-4EEB-A67C-1A7E7E0A62CA@oracle.com> Message-ID: <8e38fcda-5af8-f29a-c459-dd5ec6d4f680@oracle.com> Thanks, Karen! Rachel On 8/19/2016 11:56 AM, Karen Kinnear wrote: > Rachel - > > Looks good. Thank you for changing the name. > > David - I had asked Rachel to explain where VM anonymous classes came from since I do not see the term > "VM anonymous" class anywhere. No problem taking that comment out. I'll ask Harold to add it to the > Unsafe.DefineAnonymousClass code as part of a separate bug fix. > > thanks, > Karen > >> On Aug 17, 2016, at 11:56 AM, Rachel Protacio wrote: >> >> Hi David and Coleen, >> >> Thank you for the reviews.
I've updated the change as requested: http://cr.openjdk.java.net/~rprotacio/8163973.01/ >> >> Rachel >> >> On 8/16/2016 10:04 PM, Coleen Phillimore wrote: >>> >>> On 8/15/16 11:36 PM, David Holmes wrote: >>>> Hi Rachel, >>>> >>>> On 16/08/2016 6:48 AM, Rachel Protacio wrote: >>>>> Hello, >>>>> >>>>> Please review this change, which makes sure class file load hooks are >>>>> not called for VM anonymous classes. See >>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038353.html >>>>> for justification. >>>>> >>>>> Passes JPRT and RBT. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8163973 >>>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8163973.00/ >>>> This: >>>> >>>> 112 // VM Anonymous classes - defined via unsafe.DefineAnonymousClass - should not >>>> 113 // call back to a CFLH >>>> 114 if (host_klass == NULL) { >>>> 115 stream = prologue(stream, >>>> >>>> suggests that "prologue" can only do CFLH related things. If that is true then it would be much clearer in my opinion if prologue were renamed to something more explicit - like check_class_file_load_hook ? Otherwise, the host_klass should be passed in to prologue and the anonymous class check internalized there. >>> I agree with David here. This was sort of bothering me about your change when we talked about it before. If the prologue did more than call the CFLH then you'd have to pass host_klass down, since we know that's all it does, the name of the function should be changed. David's name looks good to me. >>>> Also I don't think you need to explain where VM anonymous classes come from, it suffices to simply say "Skip class file load hook processing for VM anonymous classes"; or if the prologue is renamed then simply "Skip this processing for VM anonymous classes". :) >>> Thanks, >>> Coleen >>>> Thanks, >>>> David >>>> >>>>> Thank you! 
>>>>> Rachel From david.holmes at oracle.com Mon Aug 22 04:16:05 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Aug 2016 14:16:05 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> Message-ID: I went to push this and realized I hadn't hg add'ed the new src/share/vm/oops/arrayKlass.inline.hpp which is also missing from the webrev (but now updated in place). Thanks, David On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: > On 8/17/16 8:50 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ > > src/share/vm/classfile/classLoader.hpp > No comments. > > src/share/vm/classfile/verifier.cpp > No comments. > > src/share/vm/oops/arrayKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.cpp > No comments. > > src/share/vm/oops/instanceKlass.hpp > No comments. > > src/share/vm/oops/instanceKlass.inline.hpp > No comments. > > src/share/vm/oops/objArrayKlass.cpp > No comments. > > src/share/vm/oops/typeArrayKlass.cpp > No comments. > > src/share/vm/runtime/vmStructs.cpp > No comments. > > Thumbs up. > > Dan > > >> >> >> Generally speaking release_store should be paired with load_acquire to >> ensure correct memory visibility and ordering in lock-free code (often >> the read path is what is lock-free). So based on some observations >> from earlier bug fixes this bug was intended to examine the use of >> release_store and see if we have the appropriate load_acquire as well. >> The bug report lists all of the cases that were examined - some clear >> cut correct, some complex correct, some fixed here and some split out >> into separate issues. 
>> >> Here's a summary of the actual changes in the webrev: >> >> src/share/vm/classfile/classLoader.hpp >> >> - next() accessor needs to use load_acquire. >> >> --- >> >> src/share/vm/classfile/verifier.cpp >> >> - load of _verify_byte_codes_fn needs to load_acquire to pair with use >> of release_store >> - release_store of _is_new_verify_byte_codes_fn is not needed >> >> --- >> >> src/share/vm/oops/arrayKlass.hpp >> src/share/vm/oops/instanceKlass.cpp >> src/share/vm/oops/instanceKlass.hpp >> src/share/vm/oops/instanceKlass.inline.hpp >> src/share/vm/oops/objArrayKlass.cpp >> src/share/vm/oops/typeArrayKlass.cpp >> >> The logic for storing dimensions values was using a storeStore barrier >> between the lower and higher dimensions. This is converted to use a >> release-store setter for higher-dimension, with paired load-acquire >> accessor. Plus the accessed fields are declared volatile. >> >> The methods_jmethod_ids_acquire() and its paired >> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >> where they belong. >> >> --- >> >> src/share/vm/runtime/vmStructs.cpp >> >> Updated declaration for _array_klasses now it is volatile. >> >> --- >> >> Thanks, >> David > From david.holmes at oracle.com Mon Aug 22 07:54:03 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Aug 2016 17:54:03 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ An earlier code review noticed that the default shared implementation of Atomic::cmpxchg(jbyte*) was missing the required post-memory-barrier in case of an initial failure: while (cur_as_bytes[offset] == comparand) { jint res = cmpxchg(new_val, dest_int, cur, order); ... } if we never enter the while loop we don't do a cmpxchg and so we have no memory barrier. The simple fix is to invert things and use a do { } while () loop. 
That way we always execute at least one real cmpxchg and so get the required memory barrier. For that to work we also have to preload comparand into the initial jint value we expect to see - as Kim pointed out. Additionally Kim updated the code to get rid of C-style casts and direct pointer arithmetic. I kept some of the intermediate locals for improved readability. Testing: The only platform that potentially uses the shared implementation is solaris_sparc (the others all have specialized asm variants). However as I don't have a sparc system to readily test on I had to tweak an atomic_linux_x86.hpp to use this on linux-x86. Then it turns out that we don't have any existing callers of the cmpxchg(byte) variant, because at the Java-level in Unsafe it has also been written in terms of cmpxchg(int) [this will probably be fixed so that intrinsics can be used]. So I had to modify Unsafe.java and unsafe.cpp to add in a variant that uses Atomic::cmpxchg(jbyte*), and then I modified the VarHandles code to use those new variants and then ran the VarHandles tests. Those tests passed with both the updated shared variant and the linux-x86 specific version. Thanks, David From aph at redhat.com Mon Aug 22 08:25:15 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 22 Aug 2016 09:25:15 +0100 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: Message-ID: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> On 22/08/16 08:54, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 > > webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ The casting away volatile is confusing and ugly. Is the problem that we don't have volatile cmpxchg() ? Or that we don't have a pointer_delta() which accepts a volatile? Andrew. 
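For readers following the 8157904 discussion: the do { } while () shape described in the RFR can be sketched in portable C++ roughly as below. This is only an illustration and is not the code in the webrev. It assumes std::atomic in place of HotSpot's Atomic::cmpxchg, and the name cmpxchg_byte and its parameters are hypothetical. The two points it tries to show are that the expected word is preloaded with the comparand, and that at least one real word-sized compare-and-exchange (and so its memory ordering) is always executed, even when the byte does not match.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not HotSpot code): compare-and-exchange a single byte
// of a 32-bit word using only a word-sized CAS. Returns the byte value that
// was observed; the exchange succeeded iff the result equals comparand.
inline uint8_t cmpxchg_byte(std::atomic<uint32_t>& word, size_t byte_index,
                            uint8_t comparand, uint8_t new_val) {
  assert(byte_index < sizeof(uint32_t));
  // The byte is selected by value (shift and mask), so this sketch does not
  // depend on the endianness of the machine.
  const unsigned shift = static_cast<unsigned>(byte_index) * 8;
  const uint32_t mask = uint32_t{0xFF} << shift;

  // Preload the expected word with the comparand in the target byte, so the
  // first CAS is a genuine attempt rather than a pre-check.
  uint32_t cur = word.load(std::memory_order_relaxed);
  uint32_t expected = (cur & ~mask) | (uint32_t{comparand} << shift);

  uint8_t observed;
  do {
    const uint32_t desired = (expected & ~mask) | (uint32_t{new_val} << shift);
    // Always executed at least once, so the ordering of the CAS applies
    // even if the target byte turns out not to match.
    if (word.compare_exchange_strong(expected, desired,
                                     std::memory_order_seq_cst)) {
      return comparand;
    }
    // On failure, expected now holds the word actually found in memory.
    observed = static_cast<uint8_t>((expected & mask) >> shift);
    // Retry only if the target byte matched and a neighbouring byte differed.
  } while (observed == comparand);
  return observed;
}
```

With a while-loop that tests the byte first, a mismatching byte would return without ever issuing the CAS, which is the missing-fence case the bug describes.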
From david.holmes at oracle.com Mon Aug 22 10:59:17 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Aug 2016 20:59:17 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> Message-ID: <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> On 22/08/2016 6:25 PM, Andrew Haley wrote: > On 22/08/16 08:54, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 >> >> webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ > > The casting away volatile is confusing and ugly. Is the problem > that we don't have volatile cmpxchg() ? Or that we don't have a > pointer_delta() which accepts a volatile? The latter - pointer_delta takes const args. David > Andrew. > > From zgu at redhat.com Mon Aug 22 13:13:10 2016 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 22 Aug 2016 09:13:10 -0400 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> Message-ID: Hi David, The changes look good to me. Just a minor comment: I saw you made InstanceKlass::_array_klasses pointer "volatile", but not some other places. I know it probably has no effect, but should we make all these pointers "volatile" just for consistency? Thanks, -Zhengyu On 08/22/2016 12:16 AM, David Holmes wrote: > I went to push this and realized I hadn't hg add'ed the new > > src/share/vm/oops/arrayKlass.inline.hpp > > which is also missing from the webrev (but now updated in place). > > Thanks, > David > > On 19/08/2016 12:02 PM, Daniel D.
Daugherty wrote: >> On 8/17/16 8:50 PM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ >> >> src/share/vm/classfile/classLoader.hpp >> No comments. >> >> src/share/vm/classfile/verifier.cpp >> No comments. >> >> src/share/vm/oops/arrayKlass.hpp >> No comments. >> >> src/share/vm/oops/instanceKlass.cpp >> No comments. >> >> src/share/vm/oops/instanceKlass.hpp >> No comments. >> >> src/share/vm/oops/instanceKlass.inline.hpp >> No comments. >> >> src/share/vm/oops/objArrayKlass.cpp >> No comments. >> >> src/share/vm/oops/typeArrayKlass.cpp >> No comments. >> >> src/share/vm/runtime/vmStructs.cpp >> No comments. >> >> Thumbs up. >> >> Dan >> >> >>> >>> >>> Generally speaking release_store should be paired with load_acquire to >>> ensure correct memory visibility and ordering in lock-free code (often >>> the read path is what is lock-free). So based on some observations >>> from earlier bug fixes this bug was intended to examine the use of >>> release_store and see if we have the appropriate load_acquire as well. >>> The bug report lists all of the cases that were examined - some clear >>> cut correct, some complex correct, some fixed here and some split out >>> into separate issues. >>> >>> Here's a summary of the actual changes in the webrev: >>> >>> src/share/vm/classfile/classLoader.hpp >>> >>> - next() accessor needs to use load_acquire. 
>>> >>> --- >>> >>> src/share/vm/classfile/verifier.cpp >>> >>> - load of _verify_byte_codes_fn needs to load_acquire to pair with use >>> of release_store >>> - release_store of _is_new_verify_byte_codes_fn is not needed >>> >>> --- >>> >>> src/share/vm/oops/arrayKlass.hpp >>> src/share/vm/oops/instanceKlass.cpp >>> src/share/vm/oops/instanceKlass.hpp >>> src/share/vm/oops/instanceKlass.inline.hpp >>> src/share/vm/oops/objArrayKlass.cpp >>> src/share/vm/oops/typeArrayKlass.cpp >>> >>> The logic for storing dimensions values was using a storeStore barrier >>> between the lower and higher dimensions. This is converted to use a >>> release-store setter for higher-dimension, with paired load-acquire >>> accessor. Plus the accessed fields are declared volatile. >>> >>> The methods_jmethod_ids_acquire() and its paired >>> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >>> where they belong. >>> >>> --- >>> >>> src/share/vm/runtime/vmStructs.cpp >>> >>> Updated declaration for _array_klasses now it is volatile. >>> >>> --- >>> >>> Thanks, >>> David >> From volker.simonis at gmail.com Mon Aug 22 14:27:38 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 22 Aug 2016 16:27:38 +0200 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> Message-ID: Hi, I don't particularly like the const_casts as well. Why not change pointer_delta to accept pointers to volatiles as well: pointer_delta(const volatile void* left, const volatile void* right, Notice that "const volatile void*" means a pointer to a value which might change unexpectedly but which can not be changed by the program itself. As the function doesn't really dereferences the pointers (i.e. 
reads the values pointed to by the pointers) but only computes the delta of the pointers themselves, this shouldn't do any harm regarding the optimization possibilities. Because there are standard conversion from "*" to "volatile *", the new version will still work for callers which use non-volatile arguments. Regards, Volker On Mon, Aug 22, 2016 at 12:59 PM, David Holmes wrote: > On 22/08/2016 6:25 PM, Andrew Haley wrote: >> >> On 22/08/16 08:54, David Holmes wrote: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ >> >> >> The casting away volatile is confusing and ugly. Is the problem >> that we don't have volatile cmpxchg() ? Or that we don't have a >> pointer_delta() which accepts a volatile? > > > The latter - pointer_delta takes const args. > > David > >> Andrew. >> >> > From christian.tornqvist at oracle.com Mon Aug 22 19:03:15 2016 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Mon, 22 Aug 2016 15:03:15 -0400 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM Message-ID: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Hi everyone, Please review this change that adds a set of tests for the Minimal variant of the JVM. The Minimal JVM is a subset that excludes some functionality, the tests here are intended to test verify that trying to use this excluded functionality doesn't lead to any unexpected errors/crashes. Verified by running the tests on Linux ARMv7 and Linux x86. 
Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ Bug (unfortunately not visible): https://bugs.openjdk.java.net/browse/JDK-8155964 Thanks, Christian From chris.plummer at oracle.com Mon Aug 22 21:14:33 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 22 Aug 2016 14:14:33 -0700 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <10e801d1fca7$d8710ac0$89532040$@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: Hi Christian, Overall it looks good. Thanks for adding these tests. Just a few questions: Why does Instrumentation.java not have @requires minimal? Why does JMX.java have: 28 * @run main/othervm -minimal JMX Have you tested with a jre that only has minimalvm and java.base? thanks, Chris On 8/22/16 12:03 PM, Christian Tornqvist wrote: > Hi everyone, > > > > Please review this change that adds a set of tests for the Minimal variant > of the JVM. The Minimal JVM is a subset that excludes some functionality, > the tests here are intended to test verify that trying to use this excluded > functionality doesn't lead to any unexpected errors/crashes. > > > > Verified by running the tests on Linux ARMv7 and Linux x86. > > > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ > > > > Bug (unfortunately not visible): > > https://bugs.openjdk.java.net/browse/JDK-8155964 > > > > Thanks, > > Christian > > > From david.holmes at oracle.com Mon Aug 22 22:09:07 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 08:09:07 +1000 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <10e801d1fca7$d8710ac0$89532040$@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: Hi Christian, Quick feedback ... On 23/08/2016 5:03 AM, Christian Tornqvist wrote: > Hi everyone, > > Please review this change that adds a set of tests for the Minimal variant > of the JVM. 
The Minimal JVM is a subset that excludes some functionality, > the tests here are intended to test verify that trying to use this excluded > functionality doesn't lead to any unexpected errors/crashes. Note (for everyone) it was only a requirement that the primary options for managing unsupported features give a meaningful error message. For example UseG1GC should report it is not available, but selecting an option that only works for G1 will report whatever is reported when that option is called without G1 being selected (ie it wont report "this option only works with G1 and G1 is not available in the minimal VM"). And some failure modes may vary depending on which modules are available. > Verified by running the tests on Linux ARMv7 and Linux x86. > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ Initial comment only. I'm not sure that @requires vm.flavor == "minimal" is the right way to do this. When you are going to launch a secondary VM with -minimal it suffices that the JRE/JDK under test has the minimal VM available, it isn't required that the main test VM run in the minimal VM. Granted jtreg can't tell you that (I don't think). There are features of the test library that use API's and VM capabilities that may not exist in the minimal VM (ie RuntimeMXBeans). More later ... Thanks, David > > > Bug (unfortunately not visible): > > https://bugs.openjdk.java.net/browse/JDK-8155964 > > > > Thanks, > > Christian > > > From chris.plummer at oracle.com Mon Aug 22 23:08:07 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 22 Aug 2016 16:08:07 -0700 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: <798ed613-5a5e-c4dd-6d8e-e4bb60673cc8@oracle.com> On 8/22/16 3:09 PM, David Holmes wrote: > Hi Christian, > > Quick feedback ... 
> > On 23/08/2016 5:03 AM, Christian Tornqvist wrote: >> Hi everyone, >> >> Please review this change that adds a set of tests for the Minimal >> variant >> of the JVM. The Minimal JVM is a subset that excludes some >> functionality, >> the tests here are intended to test verify that trying to use this >> excluded >> functionality doesn't lead to any unexpected errors/crashes. > > Note (for everyone) it was only a requirement that the primary options > for managing unsupported features give a meaningful error message. For > example UseG1GC should report it is not available, but selecting an > option that only works for G1 will report whatever is reported when > that option is called without G1 being selected (ie it wont report > "this option only works with G1 and G1 is not available in the minimal > VM"). And some failure modes may vary depending on which modules are > available. > >> Verified by running the tests on Linux ARMv7 and Linux x86. >> >> Webrev: >> >> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ > > Initial comment only. I'm not sure that @requires vm.flavor == > "minimal" is the right way to do this. When you are going to launch a > secondary VM with -minimal it suffices that the JRE/JDK under test has > the minimal VM available, it isn't required that the main test VM run > in the minimal VM. Granted jtreg can't tell you that (I don't think). > There are features of the test library that use API's and VM > capabilities that may not exist in the minimal VM (ie RuntimeMXBeans). We need to make sure the tests are not run on platforms that don't support the minimalVM, because they will fail. Using @requires vm.flavor == "minimal" guarantees that the minimalVM is supported, and that you will get it when the secondary VM is launched with -minimal. Chris > > More later ... 
> > Thanks, > David > >> >> >> Bug (unfortunately not visible): >> >> https://bugs.openjdk.java.net/browse/JDK-8155964 >> >> >> >> Thanks, >> >> Christian >> >> >> From george.triantafillou at oracle.com Tue Aug 23 00:03:31 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 22 Aug 2016 20:03:31 -0400 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <10e801d1fca7$d8710ac0$89532040$@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: <231098d6-dc72-fd61-d192-b85d7089e7b4@oracle.com> Hi Christian, Your changes look good. -George On 8/22/2016 3:03 PM, Christian Tornqvist wrote: > Hi everyone, > > > > Please review this change that adds a set of tests for the Minimal variant > of the JVM. The Minimal JVM is a subset that excludes some functionality, > the tests here are intended to test verify that trying to use this excluded > functionality doesn't lead to any unexpected errors/crashes. > > > > Verified by running the tests on Linux ARMv7 and Linux x86. > > > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ > > > > Bug (unfortunately not visible): > > https://bugs.openjdk.java.net/browse/JDK-8155964 > > > > Thanks, > > Christian > > > From david.holmes at oracle.com Tue Aug 23 00:57:42 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 10:57:42 +1000 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <798ed613-5a5e-c4dd-6d8e-e4bb60673cc8@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> <798ed613-5a5e-c4dd-6d8e-e4bb60673cc8@oracle.com> Message-ID: On 23/08/2016 9:08 AM, Chris Plummer wrote: > On 8/22/16 3:09 PM, David Holmes wrote: >> Hi Christian, >> >> Quick feedback ... >> >> On 23/08/2016 5:03 AM, Christian Tornqvist wrote: >>> Hi everyone, >>> >>> Please review this change that adds a set of tests for the Minimal >>> variant >>> of the JVM. 
The Minimal JVM is a subset that excludes some >>> functionality, >>> the tests here are intended to test verify that trying to use this >>> excluded >>> functionality doesn't lead to any unexpected errors/crashes. >> >> Note (for everyone) it was only a requirement that the primary options >> for managing unsupported features give a meaningful error message. For >> example UseG1GC should report it is not available, but selecting an >> option that only works for G1 will report whatever is reported when >> that option is called without G1 being selected (ie it wont report >> "this option only works with G1 and G1 is not available in the minimal >> VM"). And some failure modes may vary depending on which modules are >> available. >> >>> Verified by running the tests on Linux ARMv7 and Linux x86. >>> >>> Webrev: >>> >>> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ >> >> Initial comment only. I'm not sure that @requires vm.flavor == >> "minimal" is the right way to do this. When you are going to launch a >> secondary VM with -minimal it suffices that the JRE/JDK under test has >> the minimal VM available, it isn't required that the main test VM run >> in the minimal VM. Granted jtreg can't tell you that (I don't think). >> There are features of the test library that use API's and VM >> capabilities that may not exist in the minimal VM (ie RuntimeMXBeans). > We need to make sure the tests are not run on platforms that don't > support the minimalVM, because they will fail. Using @requires vm.flavor > == "minimal" guarantees that the minimalVM is supported, and that you > will get it when the secondary VM is launched with -minimal. Understood. We should be okay as long as -minimal is not aliased (I just discovered -client is aliased to -server on linux x86!). Was also concerned about use of MXBeans to get runtime info - but maybe that only affects execution on "compact profiles" not minimalVM per-se. Cheers, David > Chris >> >> More later ... 
>> >> Thanks, >> David >> >>> >>> >>> Bug (unfortunately not visible): >>> >>> https://bugs.openjdk.java.net/browse/JDK-8155964 >>> >>> >>> >>> Thanks, >>> >>> Christian >>> >>> >>> > From david.holmes at oracle.com Tue Aug 23 02:08:47 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 12:08:47 +1000 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: Hi Christian, Okay more detailed review - everything looks fine. A couple of minor comments. test/runtime/MinimalVM/Instrumentation.java No @requires -minimal ? Should not need: 29 * jdk.jartool/sun.tools.jar 30 * @run main RedefineClassHelper as the jar file doesn't actually need to exist --- Only other comment is regarding what seems to be missing: - no check of -Xrun:jdwp (though I couldn't figure out the syntax to trigger the "Debugging agents are not supported ..." message - no check of various -XX flags (eg UseG1GC and other GC flags, ProfileInterpreter) Thanks, David On 23/08/2016 8:09 AM, David Holmes wrote: > Hi Christian, > > Quick feedback ... > > On 23/08/2016 5:03 AM, Christian Tornqvist wrote: >> Hi everyone, >> >> Please review this change that adds a set of tests for the Minimal >> variant >> of the JVM. The Minimal JVM is a subset that excludes some functionality, >> the tests here are intended to test verify that trying to use this >> excluded >> functionality doesn't lead to any unexpected errors/crashes. > > Note (for everyone) it was only a requirement that the primary options > for managing unsupported features give a meaningful error message. For > example UseG1GC should report it is not available, but selecting an > option that only works for G1 will report whatever is reported when that > option is called without G1 being selected (ie it wont report "this > option only works with G1 and G1 is not available in the minimal VM"). 
> And some failure modes may vary depending on which modules are available. > >> Verified by running the tests on Linux ARMv7 and Linux x86. >> >> Webrev: >> >> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ > > Initial comment only. I'm not sure that @requires vm.flavor == "minimal" > is the right way to do this. When you are going to launch a secondary VM > with -minimal it suffices that the JRE/JDK under test has the minimal VM > available, it isn't required that the main test VM run in the minimal > VM. Granted jtreg can't tell you that (I don't think). There are > features of the test library that use API's and VM capabilities that may > not exist in the minimal VM (ie RuntimeMXBeans). > > More later ... > > Thanks, > David > >> >> >> Bug (unfortunately not visible): >> >> https://bugs.openjdk.java.net/browse/JDK-8155964 >> >> >> >> Thanks, >> >> Christian >> >> >> From david.holmes at oracle.com Tue Aug 23 07:48:34 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 17:48:34 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> Message-ID: <31c7aab6-2082-db8d-6598-e137ec4dbaf0@oracle.com> Hi Zhengyu, On 22/08/2016 11:13 PM, Zhengyu Gu wrote: > Hi David, > > The changes look good to me. Thanks for the review! > Just a minor comment: > > I saw you made InstanceKlass::_array_klasses pointer "volatile", but not > some other places. I know that probably it has not effort, but should we > make all these pointers "volatile" just for consistency? Yes. I missed _methods_jmethod_ids in instanceKLass.hpp, and _next in classLoader.hpp - now fixed. Webrev updating in place. 
Thanks, David > Thanks, > > -Zhengyu > > > > On 08/22/2016 12:16 AM, David Holmes wrote: >> I went to push this and realized I hadn't hg add'ed the new >> >> src/share/vm/oops/arrayKlass.inline.hpp >> >> which is also missing from the webrev (but now updated in place). >> >> Thanks, >> David >> >> On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: >>> On 8/17/16 8:50 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ >>> >>> src/share/vm/classfile/classLoader.hpp >>> No comments. >>> >>> src/share/vm/classfile/verifier.cpp >>> No comments. >>> >>> src/share/vm/oops/arrayKlass.hpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.cpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.hpp >>> No comments. >>> >>> src/share/vm/oops/instanceKlass.inline.hpp >>> No comments. >>> >>> src/share/vm/oops/objArrayKlass.cpp >>> No comments. >>> >>> src/share/vm/oops/typeArrayKlass.cpp >>> No comments. >>> >>> src/share/vm/runtime/vmStructs.cpp >>> No comments. >>> >>> Thumbs up. >>> >>> Dan >>> >>> >>>> >>>> >>>> Generally speaking release_store should be paired with load_acquire to >>>> ensure correct memory visibility and ordering in lock-free code (often >>>> the read path is what is lock-free). So based on some observations >>>> from earlier bug fixes this bug was intended to examine the use of >>>> release_store and see if we have the appropriate load_acquire as well. >>>> The bug report lists all of the cases that were examined - some clear >>>> cut correct, some complex correct, some fixed here and some split out >>>> into separate issues. >>>> >>>> Here's a summary of the actual changes in the webrev: >>>> >>>> src/share/vm/classfile/classLoader.hpp >>>> >>>> - next() accessor needs to use load_acquire. 
>>>> >>>> --- >>>> >>>> src/share/vm/classfile/verifier.cpp >>>> >>>> - load of _verify_byte_codes_fn needs to load_acquire to pair with use >>>> of release_store >>>> - release_store of _is_new_verify_byte_codes_fn is not needed >>>> >>>> --- >>>> >>>> src/share/vm/oops/arrayKlass.hpp >>>> src/share/vm/oops/instanceKlass.cpp >>>> src/share/vm/oops/instanceKlass.hpp >>>> src/share/vm/oops/instanceKlass.inline.hpp >>>> src/share/vm/oops/objArrayKlass.cpp >>>> src/share/vm/oops/typeArrayKlass.cpp >>>> >>>> The logic for storing dimensions values was using a storeStore barrier >>>> between the lower and higher dimensions. This is converted to use a >>>> release-store setter for higher-dimension, with paired load-acquire >>>> accessor. Plus the accessed fields are declared volatile. >>>> >>>> The methods_jmethod_ids_acquire() and its paired >>>> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >>>> where they belong. >>>> >>>> --- >>>> >>>> src/share/vm/runtime/vmStructs.cpp >>>> >>>> Updated declaration for _array_klasses now it is volatile. >>>> >>>> --- >>>> >>>> Thanks, >>>> David >>> > From david.holmes at oracle.com Tue Aug 23 08:55:04 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 18:55:04 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> Message-ID: <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> Hi Volker, Andrew, On 23/08/2016 12:27 AM, Volker Simonis wrote: > Hi, > > I don't particularly like the const_casts as well. I would have thought this was exactly the kind of thing const_cast was good for - avoiding the need to define multiple overloads to deal with volatile, non-volatile, const etc. 
> Why not change pointer_delta to accept pointers to volatiles as well: > > pointer_delta(const volatile void* left, const volatile void* right, I can do that. I also have to make a similar change to align_ptr_down. Now should I also change align_ptr_up for consistency (though I note they are already inconsistent in that one takes void* and one takes const void*) ? Alternative webrev at: http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ Thanks, David ----- > Notice that "const volatile void*" means a pointer to a value which > might change unexpectedly but which can not be changed by the program > itself. As the function doesn't really dereferences the pointers (i.e. > reads the values pointed to by the pointers) but only computes the > delta of the pointers themselves, this shouldn't do any harm regarding > the optimization possibilities. > > Because there are standard conversion from "*" to "volatile > *", the new version will still work for callers which use > non-volatile arguments. > > Regards, > Volker > > > On Mon, Aug 22, 2016 at 12:59 PM, David Holmes wrote: >> On 22/08/2016 6:25 PM, Andrew Haley wrote: >>> >>> On 22/08/16 08:54, David Holmes wrote: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ >>> >>> >>> The casting away volatile is confusing and ugly. Is the problem >>> that we don't have volatile cmpxchg() ? Or that we don't have a >>> pointer_delta() which accepts a volatile? >> >> >> The latter - pointer_delta takes const args. >> >> David >> >>> Andrew. 
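A minimal sketch of the suggestion discussed above, assuming a simplified stand-in for pointer_delta (the name pointer_delta_sketch and its exact shape are hypothetical, not the HotSpot source). Because "T*" converts implicitly to "const volatile void*", a single signature serves plain, const and volatile callers, and no const_cast is needed at the call site:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HotSpot's pointer_delta(); illustration only.
// The pointers are never dereferenced, only their addresses are compared,
// so accepting "const volatile void*" does not change what the function does.
inline std::size_t pointer_delta_sketch(const volatile void* left,
                                        const volatile void* right,
                                        std::size_t element_size) {
  // Pointer-to-integer conversion does not cast away any qualifiers.
  return (reinterpret_cast<std::uintptr_t>(left) -
          reinterpret_cast<std::uintptr_t>(right)) / element_size;
}
```

Callers holding "int*", "const int*" or "volatile int*" can all pass their pointers directly; the standard qualification conversion does the rest.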
>>>
>>>
>>

From ioi.lam at oracle.com Tue Aug 23 10:01:52 2016
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 23 Aug 2016 03:01:52 -0700
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
Message-ID: <57BC1F10.2000104@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8161280
http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/

Summary:

The test was loading a lot of JCK classes into the same VM. Many of the
JCK classes refer to "javasoft/sqe/javatest/Status", so the refcount (a
signed short integer) of this Symbol would run up to and past 0x7fff.

The assert was caused by a race condition: the refcount started with a
large (16-bit) positive value such as 0x7fff, one thread was decrementing
and several other threads were incrementing. The refcount would end up
being 0x8000 or slightly higher (limited by the number of concurrent
threads that are running within a small window of several instructions
in the decrementing thread, so most likely it will be 0x800?).

As a result, the decrementing thread found that the refcount was negative
after the operation, and thought that an underflow had happened.

The fix is to ignore any value that may appear in the [0x8000 - 0xbfff]
range and not flag these as underflows (since they are most likely
overflows -- overflows are already handled by making the Symbol
permanent).

Thanks
- Ioi

From leonid.mesnik at oracle.com Tue Aug 23 10:23:46 2016
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Tue, 23 Aug 2016 13:23:46 +0300
Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM
In-Reply-To:
References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> <798ed613-5a5e-c4dd-6d8e-e4bb60673cc8@oracle.com>
Message-ID: <22eaf277-d0ce-7d61-2294-a5894b090055@oracle.com>

David

On 23.08.2016 03:57, David Holmes wrote:
> On 23/08/2016 9:08 AM, Chris Plummer wrote:
>> On 8/22/16 3:09 PM, David Holmes wrote:
>>> Hi Christian,
>>>
>>> Quick feedback ...
>>> >>> On 23/08/2016 5:03 AM, Christian Tornqvist wrote: >>>> Hi everyone, >>>> >>>> Please review this change that adds a set of tests for the Minimal >>>> variant >>>> of the JVM. The Minimal JVM is a subset that excludes some >>>> functionality, >>>> the tests here are intended to test verify that trying to use this >>>> excluded >>>> functionality doesn't lead to any unexpected errors/crashes. >>> >>> Note (for everyone) it was only a requirement that the primary options >>> for managing unsupported features give a meaningful error message. For >>> example UseG1GC should report it is not available, but selecting an >>> option that only works for G1 will report whatever is reported when >>> that option is called without G1 being selected (ie it wont report >>> "this option only works with G1 and G1 is not available in the minimal >>> VM"). And some failure modes may vary depending on which modules are >>> available. >>> >>>> Verified by running the tests on Linux ARMv7 and Linux x86. >>>> >>>> Webrev: >>>> >>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ >>> >>> Initial comment only. I'm not sure that @requires vm.flavor == >>> "minimal" is the right way to do this. When you are going to launch a >>> secondary VM with -minimal it suffices that the JRE/JDK under test has >>> the minimal VM available, it isn't required that the main test VM run >>> in the minimal VM. Granted jtreg can't tell you that (I don't think). >>> There are features of the test library that use API's and VM >>> capabilities that may not exist in the minimal VM (ie RuntimeMXBeans). >> We need to make sure the tests are not run on platforms that don't >> support the minimalVM, because they will fail. Using @requires vm.flavor >> == "minimal" guarantees that the minimalVM is supported, and that you >> will get it when the secondary VM is launched with -minimal. > > Understood. 
> We should be okay as long as -minimal is not aliased (I
> just discovered -client is aliased to -server on linux x86!). Was also
> concerned about use of MXBeans to get runtime info - but maybe that
> only affects execution on "compact profiles" not minimalVM per-se.

The @requires vm.flavor == "minimal" means that the tests are executed
only when jtreg runs with the VM option '-minimal'. It is a good enough
solution, assuming that these tests are executed only with '-minimal' on
supported platforms.

It is possible to extend test/jtreg-ext/requires/VMProps.java with a new
property "isMinimalVMSupported", which could be used to run negative
minimal VM tests only for JDKs where '-minimal' is supported.

Leonid

>
> Cheers,
> David
>
>
>> Chris
>>>
>>> More later ...
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>>
>>>> Bug (unfortunately not visible):
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8155964
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Christian
>>>>
>>>>
>>>>
>>

From david.simms at oracle.com Tue Aug 23 10:24:59 2016
From: david.simms at oracle.com (David Simms)
Date: Tue, 23 Aug 2016 12:24:59 +0200
Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame
In-Reply-To:
References:
Message-ID: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com>

Reply in-line...

On 19/08/16 14:29, David Holmes wrote:
> Hi David,
>
>
> The changes in the native wrapper seem okay though I'm not an expert
> on the machine specific encodings.
>
> I'm a little surprised there are not more things that need changing
> though. Does the JIT use those wrappers too?

Yeah they do, I double-checked with Nils from the compiler group. I also
tested with -Xcomp; the test failed without the sharedRuntime fix. The
test execution time was over 10 seconds, so I removed it from the jtreg
test itself (hard-coded ProcessTools.executeTestJVM()) since it is part
of "hotspot_fast_runtime".
> Can we transition from Java to VM to native and then back - and if so
> might we need to clear the pending exception check? (I'm not sure if
> from in the VM a native call could actually be a JNI call, or will
> only be a direct native call?).

At first I thought JavaCallWrapper needs it, following all the places we
manipulate the thread's active handle block (besides manual push/pop).
But then the call helper just ends up calling the native wrapper, which
takes care of it. Not a direct native call. So I left it as-is.

>
> Did you intend to leave in the changes to
> jdk/src/java.base/share/native/libjli/java.c? It looks like debug/test
> code to me.

The launcher produces warnings (Java method invokes) that break the
jtreg test, so yeah, I thought it was best to check and print them. Some
of the existing code checks and silently returns; I followed the same
pattern where that pattern was in place.

>
> The test I'm finding a bit hard to follow but don't you need to check
> for pending exceptions here:
>
> 29 static jmethodID get_method_id(JNIEnv *env, jclass clz, jstring
> jname, jstring jsig) {
> 30 jmethodID mid;
> 31 const char *name, *sig;
> 32 name = (*env)->GetStringUTFChars(env, jname, NULL);
> 33 sig = (*env)->GetStringUTFChars(env, jsig, NULL);
> 34 mid = (*env)->GetMethodID(env, clz, name, sig);
>
> to avoid triggering the warning?
>

Those methods don't require an explicit check since their return values
denote an error condition.

Java invoke return values, by contrast, are user defined, so they do
need it (see
https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/design.html#asynchronous_exceptions).

Technically array stores need to check for AIOOBE, but given most code
handles index/bounds checks, it seemed way too pedantic (commented in
jniCheck.cpp:176).
Cheers /David Simms From christian.tornqvist at oracle.com Tue Aug 23 10:47:23 2016 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Tue, 23 Aug 2016 06:47:23 -0400 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> Message-ID: <14c001d1fd2b$bb6f2720$324d7560$@oracle.com> Hi Chris, >Why does Instrumentation.java not have @requires minimal? It should, I'll correct this. >Why does JMX.java have: > >28 * @run main/othervm -minimal JMX This was done to make sure the parent process was running minimal for the jcmd check at line 50. This isn't needed, I'll remove it. Thanks, Christian -----Original Message----- From: Chris Plummer [mailto:chris.plummer at oracle.com] Sent: Monday, August 22, 2016 5:15 PM To: Christian Tornqvist ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM Hi Christian, Overall it looks good. Thanks for adding these tests. Just a few questions: Why does Instrumentation.java not have @requires minimal? Why does JMX.java have: 28 * @run main/othervm -minimal JMX Have you tested with a jre that only has minimalvm and java.base? thanks, Chris On 8/22/16 12:03 PM, Christian Tornqvist wrote: > Hi everyone, > > > > Please review this change that adds a set of tests for the Minimal > variant of the JVM. The Minimal JVM is a subset that excludes some > functionality, the tests here are intended to test verify that trying > to use this excluded functionality doesn't lead to any unexpected errors/crashes. > > > > Verified by running the tests on Linux ARMv7 and Linux x86. 
> > > > Webrev: > > http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ > > > > Bug (unfortunately not visible): > > https://bugs.openjdk.java.net/browse/JDK-8155964 > > > > Thanks, > > Christian > > > From marcus.larsson at oracle.com Tue Aug 23 11:17:47 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Tue, 23 Aug 2016 13:17:47 +0200 Subject: RFR: 8150894: Unused -Xlog tag sequences are silently ignored. In-Reply-To: <570D02CF.7070708@oracle.com> References: <5704B0B0.4010404@oracle.com> <570D02CF.7070708@oracle.com> Message-ID: <307928f7-55b3-999d-b3e8-ffd5966457a1@oracle.com> Hi, Still looking for a Reviewer for this. (Rebased webrev in-place.) Thanks, Marcus On 04/12/2016 04:14 PM, Marcus Larsson wrote: > Ping! > > On 04/06/2016 08:46 AM, Marcus Larsson wrote: >> Hi, >> >> Please review the following patch to add a warning for when tag >> selections in -Xlog or VM.log don't match any tag sets used in the VM. >> >> Webrev: >> http://cr.openjdk.java.net/~mlarsson/8150894/webrev.00/ >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8150894 >> >> Testing: >> Internal VM tests with RBT >> >> Thanks, >> Marcus > From zgu at redhat.com Tue Aug 23 12:04:03 2016 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 23 Aug 2016 08:04:03 -0400 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <31c7aab6-2082-db8d-6598-e137ec4dbaf0@oracle.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> <31c7aab6-2082-db8d-6598-e137ec4dbaf0@oracle.com> Message-ID: <9563ba26-ac59-a1ff-09ba-e48bc6b3690b@redhat.com> Thanks. Look good to me. -Zhengyu On 08/23/2016 03:48 AM, David Holmes wrote: > Hi Zhengyu, > > On 22/08/2016 11:13 PM, Zhengyu Gu wrote: >> Hi David, >> >> The changes look good to me. > > Thanks for the review! 
> >> Just a minor comment: >> >> I saw you made InstanceKlass::_array_klasses pointer "volatile", but not >> some other places. I know that probably it has not effort, but should we >> make all these pointers "volatile" just for consistency? > > Yes. I missed _methods_jmethod_ids in instanceKLass.hpp, and _next in > classLoader.hpp - now fixed. > > Webrev updating in place. > > Thanks, > David > >> Thanks, >> >> -Zhengyu >> >> >> >> On 08/22/2016 12:16 AM, David Holmes wrote: >>> I went to push this and realized I hadn't hg add'ed the new >>> >>> src/share/vm/oops/arrayKlass.inline.hpp >>> >>> which is also missing from the webrev (but now updated in place). >>> >>> Thanks, >>> David >>> >>> On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: >>>> On 8/17/16 8:50 PM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ >>>> >>>> src/share/vm/classfile/classLoader.hpp >>>> No comments. >>>> >>>> src/share/vm/classfile/verifier.cpp >>>> No comments. >>>> >>>> src/share/vm/oops/arrayKlass.hpp >>>> No comments. >>>> >>>> src/share/vm/oops/instanceKlass.cpp >>>> No comments. >>>> >>>> src/share/vm/oops/instanceKlass.hpp >>>> No comments. >>>> >>>> src/share/vm/oops/instanceKlass.inline.hpp >>>> No comments. >>>> >>>> src/share/vm/oops/objArrayKlass.cpp >>>> No comments. >>>> >>>> src/share/vm/oops/typeArrayKlass.cpp >>>> No comments. >>>> >>>> src/share/vm/runtime/vmStructs.cpp >>>> No comments. >>>> >>>> Thumbs up. >>>> >>>> Dan >>>> >>>> >>>>> >>>>> >>>>> Generally speaking release_store should be paired with >>>>> load_acquire to >>>>> ensure correct memory visibility and ordering in lock-free code >>>>> (often >>>>> the read path is what is lock-free). So based on some observations >>>>> from earlier bug fixes this bug was intended to examine the use of >>>>> release_store and see if we have the appropriate load_acquire as >>>>> well. 
>>>>> The bug report lists all of the cases that were examined - some clear >>>>> cut correct, some complex correct, some fixed here and some split out >>>>> into separate issues. >>>>> >>>>> Here's a summary of the actual changes in the webrev: >>>>> >>>>> src/share/vm/classfile/classLoader.hpp >>>>> >>>>> - next() accessor needs to use load_acquire. >>>>> >>>>> --- >>>>> >>>>> src/share/vm/classfile/verifier.cpp >>>>> >>>>> - load of _verify_byte_codes_fn needs to load_acquire to pair with >>>>> use >>>>> of release_store >>>>> - release_store of _is_new_verify_byte_codes_fn is not needed >>>>> >>>>> --- >>>>> >>>>> src/share/vm/oops/arrayKlass.hpp >>>>> src/share/vm/oops/instanceKlass.cpp >>>>> src/share/vm/oops/instanceKlass.hpp >>>>> src/share/vm/oops/instanceKlass.inline.hpp >>>>> src/share/vm/oops/objArrayKlass.cpp >>>>> src/share/vm/oops/typeArrayKlass.cpp >>>>> >>>>> The logic for storing dimensions values was using a storeStore >>>>> barrier >>>>> between the lower and higher dimensions. This is converted to use a >>>>> release-store setter for higher-dimension, with paired load-acquire >>>>> accessor. Plus the accessed fields are declared volatile. >>>>> >>>>> The methods_jmethod_ids_acquire() and its paired >>>>> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >>>>> where they belong. >>>>> >>>>> --- >>>>> >>>>> src/share/vm/runtime/vmStructs.cpp >>>>> >>>>> Updated declaration for _array_klasses now it is volatile. 
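As a rough illustration of the release-store / load-acquire pairing described in the summary above (this is not code from the webrev): the sketch below swaps in std::atomic from the C++ standard library for HotSpot's OrderAccess primitives, and the Node class with its accessor names is hypothetical. The writer fully initializes a node before publishing it with a release-store; a lock-free reader, normally running in another thread, follows the links with acquire-loads only.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical list node; illustration only, not a HotSpot class.
class Node {
 public:
  explicit Node(int payload) : _payload(payload), _next(nullptr) {}

  int payload() const { return _payload; }

  // Reader side: the acquire-load pairs with the release-store below, so a
  // reader that observes a non-null pointer also observes the fields that
  // were written to that node before it was published.
  Node* next_acquire() const { return _next.load(std::memory_order_acquire); }

  // Writer side: all writes made before this store become visible to any
  // thread whose acquire-load returns n.
  void release_set_next(Node* n) { _next.store(n, std::memory_order_release); }

 private:
  int _payload;
  std::atomic<Node*> _next;
};

// Walks the list using only acquire-loads, as a lock-free reader would.
inline int sum_payloads(const Node* head) {
  int sum = 0;
  for (const Node* n = head; n != nullptr; n = n->next_acquire()) {
    sum += n->payload();
  }
  return sum;
}
```

Using a plain load in next_acquire() would leave the release-store unpaired, which is the kind of mismatch the review set out to find.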
>>>>> >>>>> --- >>>>> >>>>> Thanks, >>>>> David >>>> >> From david.holmes at oracle.com Tue Aug 23 12:04:51 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 22:04:51 +1000 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <14c001d1fd2b$bb6f2720$324d7560$@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> <14c001d1fd2b$bb6f2720$324d7560$@oracle.com> Message-ID: <9fabb0c8-7a0f-b693-f9a4-e9afd0c34d69@oracle.com> On 23/08/2016 8:47 PM, Christian Tornqvist wrote: > Hi Chris, > >> Why does Instrumentation.java not have @requires minimal? > It should, I'll correct this. > >> Why does JMX.java have: >> >> 28 * @run main/othervm -minimal JMX > This was done to make sure the parent process was running minimal for the jcmd check at line 50. This isn't needed, I'll remove it. Hmm I thought it was needed else the parent VM will show for the jcmd request. ?? David ----- > Thanks, > Christian > > -----Original Message----- > From: Chris Plummer [mailto:chris.plummer at oracle.com] > Sent: Monday, August 22, 2016 5:15 PM > To: Christian Tornqvist ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM > > Hi Christian, > > Overall it looks good. Thanks for adding these tests. Just a few questions: > > Why does Instrumentation.java not have @requires minimal? > > Why does JMX.java have: > > 28 * @run main/othervm -minimal JMX > > Have you tested with a jre that only has minimalvm and java.base? > > thanks, > > Chris > > On 8/22/16 12:03 PM, Christian Tornqvist wrote: >> Hi everyone, >> >> >> >> Please review this change that adds a set of tests for the Minimal >> variant of the JVM. The Minimal JVM is a subset that excludes some >> functionality, the tests here are intended to test verify that trying >> to use this excluded functionality doesn't lead to any unexpected errors/crashes. 
>> >> >> >> Verified by running the tests on Linux ARMv7 and Linux x86. >> >> >> >> Webrev: >> >> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ >> >> >> >> Bug (unfortunately not visible): >> >> https://bugs.openjdk.java.net/browse/JDK-8155964 >> >> >> >> Thanks, >> >> Christian >> >> >> > > From david.holmes at oracle.com Tue Aug 23 12:16:30 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Aug 2016 22:16:30 +1000 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> References: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> Message-ID: Hi David On 23/08/2016 8:24 PM, David Simms wrote: > > Reply in-line... > > On 19/08/16 14:29, David Holmes wrote: >> Hi David, >> >> >> The changes in the native wrapper seem okay though I'm not an expert >> on the machine specific encodings. >> >> I'm a little surprised there are not more things that need changing >> though. Does the JIT use those wrappers too? > > Yeah they do, I double checked Nils from compiler group. I also tested > with -Xcomp, test failed without sharedRuntime fix. The test execution > time was over 10 seconds, so I removed it from the jtreg test itself > (hard-coded ProcessTools.executeTestJVM()) since it is part of > "hotspot_fast_runtime". > >> Can we transition from Java to VM to native and then back - and if so >> might we need to clear the pending exception check? (I'm not sure if >> from in the VM a native call could actually be a JNI call, or will >> only be a direct native call?). > > At first I thought JavaCallWrapper needs it, following all the places we > manipulate the thread's active handle block (besides manual push/pop). > But then call helper just ends up calling the native wrapper, which > takes care of it. Not a direct native call. So I left it, as-is. That's not the case I was thinking of. 
We have ThreadToNativeFromVM and then we do native stuff - if any of that were JNI-based (perhaps it is not) then we would enable the check but not disable it again when returning from VM to Java. >> >> Did you intend to leave in the changes to >> jdk/src/java.base/share/native/libjli/java.c? It looks like debug/test >> code to me. > > The launcher produces warnings (Java method invokes) that break the > jtreg test, so yeah, thought it was best to check and print them. Some > of the existing code checks and silently returns, I followed the same > pattern where that pattern was in place. This needs to be looked at closer then and reviewed by the launcher folk (ie Kumar). >> >> The test I'm finding a bit hard to follow but don't you need to check >> for pending exceptions here: >> >> 29 static jmethodID get_method_id(JNIEnv *env, jclass clz, jstring >> jname, jstring jsig) { >> 30 jmethodID mid; >> 31 const char *name, *sig; >> 32 name = (*env)->GetStringUTFChars(env, jname, NULL); >> 33 sig = (*env)->GetStringUTFChars(env, jsig, NULL); >> 34 mid = (*env)->GetMethodID(env, clz, name, sig); >> >> to avoid triggering the warning? >> > Those methods don't require an explicit check since there return values > denote an error condition. > > Whilst Java invoke return values are user defined, so they do need > it > https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/design.html#asynchronous_exceptions). > > Technically array stores need to check for AIOOBE, but given most > code handles index/bounds checks, it seemed way too pedantic > (commented in jniCheck.cpp:176). Not following. GetStringUTFChars can post OOME so we would enable the check-flag if that happens on the first call above, then the second call would be made with the exception pending and trigger the warning. Thanks, David H. 
> Cheers > /David Simms From volker.simonis at gmail.com Tue Aug 23 12:20:02 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 23 Aug 2016 14:20:02 +0200 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> Message-ID: On Tue, Aug 23, 2016 at 10:55 AM, David Holmes wrote: > Hi Volker, Andrew, > > On 23/08/2016 12:27 AM, Volker Simonis wrote: >> >> Hi, >> >> I don't particularly like the const_casts as well. > > > I would have thought this was exactly the kind of thing const_cast was good > for - avoiding the need to define multiple overloads to deal with volatile, > non-volatile, const etc. > >> Why not change pointer_delta to accept pointers to volatiles as well: >> >> pointer_delta(const volatile void* left, const volatile void* right, > > > I can do that. I also have to make a similar change to align_ptr_down. Now > should I also change align_ptr_up for consistency (though I note they are > already inconsistent in that one takes void* and one takes const void*) ? > > Alternative webrev at: > > http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ I like it much better this way, but you don't need the old versions any more. pointer_delta() taking "const volatile void*" and align_ptr_down() taking "volatile void*" arguments are enough. The other versions can be removed. I tried to make that clear in my previous mail: Because there are standard conversion from "*" to "volatile *", the new version will still work for callers which use non-volatile arguments. And: Notice that "const volatile void*" means a pointer to a value which might change unexpectedly but which can not be changed by the program itself. As the function doesn't really dereferences the pointers (i.e. 
reads the values pointed to by the pointers) but only computes the delta of the pointers themselves, this shouldn't do any harm regarding the optimization possibilities. Thanks, Volker > > Thanks, > David > ----- > > > >> Notice that "const volatile void*" means a pointer to a value which >> might change unexpectedly but which can not be changed by the program >> itself. As the function doesn't really dereferences the pointers (i.e. >> reads the values pointed to by the pointers) but only computes the >> delta of the pointers themselves, this shouldn't do any harm regarding >> the optimization possibilities. >> >> Because there are standard conversion from "*" to "volatile >> *", the new version will still work for callers which use >> non-volatile arguments. >> >> Regards, >> Volker >> >> >> On Mon, Aug 22, 2016 at 12:59 PM, David Holmes >> wrote: >>> >>> On 22/08/2016 6:25 PM, Andrew Haley wrote: >>>> >>>> >>>> On 22/08/16 08:54, David Holmes wrote: >>>>> >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8157904 >>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8157904/webrev/ >>>> >>>> >>>> >>>> The casting away volatile is confusing and ugly. Is the problem >>>> that we don't have volatile cmpxchg() ? Or that we don't have a >>>> pointer_delta() which accepts a volatile? >>> >>> >>> >>> The latter - pointer_delta takes const args. >>> >>> David >>> >>>> Andrew. >>>> >>>> >>> > From dmitry.samersoff at oracle.com Tue Aug 23 12:34:43 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Tue, 23 Aug 2016 15:34:43 +0300 Subject: PING! 
Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools In-Reply-To: References: Message-ID: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> On 2016-08-17 10:51, Dmitry Samersoff wrote: > Everybody, > > Please review the changes: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ > > -Dmitry > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From christian.tornqvist at oracle.com Tue Aug 23 12:44:40 2016 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Tue, 23 Aug 2016 08:44:40 -0400 Subject: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM In-Reply-To: <9fabb0c8-7a0f-b693-f9a4-e9afd0c34d69@oracle.com> References: <10e801d1fca7$d8710ac0$89532040$@oracle.com> <14c001d1fd2b$bb6f2720$324d7560$@oracle.com> <9fabb0c8-7a0f-b693-f9a4-e9afd0c34d69@oracle.com> Message-ID: <158801d1fd3c$1dc1a4b0$5944ee10$@oracle.com> Since we'll pass -minimal in as an external flag, this shouldn't be needed. -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Tuesday, August 23, 2016 8:05 AM To: Christian Tornqvist ; 'Chris Plummer' ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8155964 - Create a set of tests for verifying the Minimal VM On 23/08/2016 8:47 PM, Christian Tornqvist wrote: > Hi Chris, > >> Why does Instrumentation.java not have @requires minimal? > It should, I'll correct this. > >> Why does JMX.java have: >> >> 28 * @run main/othervm -minimal JMX > This was done to make sure the parent process was running minimal for the jcmd check at line 50. This isn't needed, I'll remove it. Hmm I thought it was needed else the parent VM will show for the jcmd request. ?? 
David ----- > Thanks, > Christian > > -----Original Message----- > From: Chris Plummer [mailto:chris.plummer at oracle.com] > Sent: Monday, August 22, 2016 5:15 PM > To: Christian Tornqvist ; > hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8155964 - Create a set of tests for verifying the > Minimal VM > > Hi Christian, > > Overall it looks good. Thanks for adding these tests. Just a few questions: > > Why does Instrumentation.java not have @requires minimal? > > Why does JMX.java have: > > 28 * @run main/othervm -minimal JMX > > Have you tested with a jre that only has minimalvm and java.base? > > thanks, > > Chris > > On 8/22/16 12:03 PM, Christian Tornqvist wrote: >> Hi everyone, >> >> >> >> Please review this change that adds a set of tests for the Minimal >> variant of the JVM. The Minimal JVM is a subset that excludes some >> functionality, the tests here are intended to test verify that trying >> to use this excluded functionality doesn't lead to any unexpected errors/crashes. >> >> >> >> Verified by running the tests on Linux ARMv7 and Linux x86. >> >> >> >> Webrev: >> >> http://cr.openjdk.java.net/~ctornqvi/webrev/8155964/webrev.00/ >> >> >> >> Bug (unfortunately not visible): >> >> https://bugs.openjdk.java.net/browse/JDK-8155964 >> >> >> >> Thanks, >> >> Christian >> >> >> > > From ioi.lam at oracle.com Tue Aug 23 13:10:25 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 23 Aug 2016 06:10:25 -0700 Subject: PING! 
Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools In-Reply-To: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> Message-ID: <57BC4B41.60305@oracle.com> Hi Dmitry, Why are you adding /test/lib: - * @library /lib/testlibrary + * @library /lib/testlibrary /test/lib The only class used by jdk/test/sun/tools/jps/*.java in /test/lib is here: TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp; But TestJpsSanity.java is not use by this test -- I ran the test with your patch in a clean jtreg directory and the test passed, but I don't see TestJpsSanity.class, or any jdk.test.lib.* class. So I don't think you need to add /test/lib. - Ioi On 8/23/16 5:34 AM, Dmitry Samersoff wrote: > On 2016-08-17 10:51, Dmitry Samersoff wrote: >> Everybody, >> >> Please review the changes: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ >> >> -Dmitry >> > From robbin.ehn at oracle.com Tue Aug 23 16:12:49 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 23 Aug 2016 18:12:49 +0200 Subject: RFR: 8164208: Update tests with redefine classes UL options and tags Message-ID: Hi all, This converts TraceRedefineClasses to UL in our tests. Webrev: http://cr.openjdk.java.net/~rehn/8164208/ Bug: https://bugs.openjdk.java.net/browse/JDK-8164208 Thanks! /Robbin From robbin.ehn at oracle.com Tue Aug 23 16:29:29 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 23 Aug 2016 18:29:29 +0200 Subject: RFR: 8158628: test/java/lang/instrument/NativeMethodPrefixAgent.java: Error occurred during initialization of VM: Failed to start tracing backend. Message-ID: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> Hi all, This test should not run with jfr. 
Webrev: http://cr.openjdk.java.net/~rehn/8158628/webrev/ Bug: https://bugs.openjdk.java.net/browse/JDK-8158628 /Robbin From staffan.larsen at oracle.com Tue Aug 23 17:48:58 2016 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 23 Aug 2016 19:48:58 +0200 Subject: RFR: 8158628: test/java/lang/instrument/NativeMethodPrefixAgent.java: Error occurred during initialization of VM: Failed to start tracing backend. In-Reply-To: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> References: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> Message-ID: Looks good! Thanks, /Staffan > On 23 aug. 2016, at 18:29, Robbin Ehn wrote: > > Hi all, > > This test should not run with jfr. > > Webrev: > http://cr.openjdk.java.net/~rehn/8158628/webrev/ > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8158628 > > /Robbin From george.triantafillou at oracle.com Tue Aug 23 18:10:36 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 23 Aug 2016 14:10:36 -0400 Subject: RFR: 8164208: Update tests with redefine classes UL options and tags In-Reply-To: References: Message-ID: Hi Robbin, Looks good, (r)eviewed. -George On 8/23/2016 12:12 PM, Robbin Ehn wrote: > Hi all, > > This converts TraceRedefineClasses to UL in our tests. > > Webrev: > http://cr.openjdk.java.net/~rehn/8164208/ > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8164208 > > Thanks! > > /Robbin From george.triantafillou at oracle.com Tue Aug 23 18:11:32 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 23 Aug 2016 14:11:32 -0400 Subject: RFR: 8158628: test/java/lang/instrument/NativeMethodPrefixAgent.java: Error occurred during initialization of VM: Failed to start tracing backend. In-Reply-To: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> References: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> Message-ID: <20864a87-7dd4-e00d-4d15-fa4d99b01e0f@oracle.com> Hi Robbin, Looks good, (r)eviewed. 
-George

On 8/23/2016 12:29 PM, Robbin Ehn wrote:
> Hi all,
>
> This test should not run with jfr.
>
> Webrev:
> http://cr.openjdk.java.net/~rehn/8158628/webrev/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8158628
>
> /Robbin

From coleen.phillimore at oracle.com Tue Aug 23 18:42:31 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 23 Aug 2016 14:42:31 -0400
Subject: RFR: 8164208: Update tests with redefine classes UL options and tags
In-Reply-To:
References:
Message-ID:

Robbin,

I think this looks good. Was the output not excessive?

thanks,
Coleen

On 8/23/16 12:12 PM, Robbin Ehn wrote:
> Hi all,
>
> This converts TraceRedefineClasses to UL in our tests.
>
> Webrev:
> http://cr.openjdk.java.net/~rehn/8164208/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8164208
>
> Thanks!
>
> /Robbin

From dmitry.samersoff at oracle.com Tue Aug 23 19:02:15 2016
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Tue, 23 Aug 2016 22:02:15 +0300
Subject: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools
In-Reply-To: <57BC4B41.60305@oracle.com>
References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> <57BC4B41.60305@oracle.com>
Message-ID: <938232c7-c206-f0db-7446-78960537ad2b@oracle.com>

Ioi,

Thank you for the review.

Hmm. It looks like the change below solves the problem.

- * @build jdk.testlibrary.* JpsHelper JpsBase
+ * @build JpsHelper JpsBase

I'm running an rbt job to verify it.
-Dmitry

On 2016-08-23 16:10, Ioi Lam wrote:
> Hi Dmitry,
>
> Why are you adding /test/lib:
>
> - * @library /lib/testlibrary
> + * @library /lib/testlibrary /test/lib
>
> The only class used by jdk/test/sun/tools/jps/*.java in /test/lib is here:
>
> TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp;
>
> But TestJpsSanity.java is not use by this test -- I ran the test with
> your patch in a clean jtreg directory and the test passed, but I don't
> see TestJpsSanity.class, or any jdk.test.lib.* class.
>
> So I don't think you need to add /test/lib.
>
> - Ioi
>
> On 8/23/16 5:34 AM, Dmitry Samersoff wrote:
>> On 2016-08-17 10:51, Dmitry Samersoff wrote:
>>> Everybody,
>>>
>>> Please review the changes:
>>>
>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/
>>>
>>> -Dmitry
>>>
>>
>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From christian.tornqvist at oracle.com Tue Aug 23 19:10:20 2016
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Tue, 23 Aug 2016 15:10:20 -0400
Subject: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools
In-Reply-To: <938232c7-c206-f0db-7446-78960537ad2b@oracle.com>
References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> <57BC4B41.60305@oracle.com> <938232c7-c206-f0db-7446-78960537ad2b@oracle.com>
Message-ID: <1f8101d1fd71$feb399d0$fc1acd70$@oracle.com>

Hi Dmitry,

You don't need to explicitly build JpsHelper. I also noticed that you're
using ProcessTools and OutputAnalyzer from /lib/testlibrary; would it make
sense to change this to use the /test/lib ones and simply have:

@library /test/lib ?
Thanks, Christian -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Dmitry Samersoff Sent: Tuesday, August 23, 2016 3:02 PM To: Ioi Lam ; serviceability-dev at openjdk.java.net; hotspot-runtime-dev Subject: Re: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools Ioi, Thank you for review. Hmm. It looks like changes below solves the problem. - * @build jdk.testlibrary.* JpsHelper JpsBase + * @build JpsHelper JpsBase I'm running rbt job to verify it. -Dmitry On 2016-08-23 16:10, Ioi Lam wrote: > Hi Dmitry, > > Why are you adding /test/lib: > > - * @library /lib/testlibrary > + * @library /lib/testlibrary /test/lib > > The only class used by jdk/test/sun/tools/jps/*.java in /test/lib is here: > > TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp; > > But TestJpsSanity.java is not use by this test -- I ran the test with > your patch in a clean jtreg directory and the test passed, but I don't > see TestJpsSanity.class, or any jdk.test.lib.* class. > > So I don't think you need to add /test/lib. > > - Ioi > > On 8/23/16 5:34 AM, Dmitry Samersoff wrote: >> On 2016-08-17 10:51, Dmitry Samersoff wrote: >>> Everybody, >>> >>> Please review the changes: >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ >>> >>> -Dmitry >>> >> > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From kim.barrett at oracle.com Tue Aug 23 20:25:55 2016
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 23 Aug 2016 16:25:55 -0400
Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure
In-Reply-To: <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com>
References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com>
Message-ID:

> On Aug 23, 2016, at 4:55 AM, David Holmes wrote:
>
> Hi Volker, Andrew,
>
> On 23/08/2016 12:27 AM, Volker Simonis wrote:
>> Hi,
>>
>> I don't particularly like the const_casts as well.
>
> I would have thought this was exactly the kind of thing const_cast was good for - avoiding the need to define multiple overloads to deal with volatile, non-volatile, const etc.
>
>> Why not change pointer_delta to accept pointers to volatiles as well:
>>
>> pointer_delta(const volatile void* left, const volatile void* right,
>
> I can do that. I also have to make a similar change to align_ptr_down. Now should I also change align_ptr_up for consistency (though I note they are already inconsistent in that one takes void* and one takes const void*) ?
>
> Alternative webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/

------------------------------------------------------------------------------
src/share/vm/runtime/atomic.hpp
 155     assert(sizeof(jbyte) == 1, "assumption");

STATIC_ASSERT would be better here.

------------------------------------------------------------------------------
src/share/vm/utilities/globalDefinitions.hpp
 524 inline void* align_ptr_down(volatile void* ptr, size_t alignment) {
 525   return (void*)align_size_down((intptr_t)ptr, (intptr_t)alignment);
 526 }

I think implicitly (to the caller of align_ptr_down) casting away
volatile like this is a mistake.
I disagree with the rationale for this change; stripping off volatile
(or const) *should* be annoyingly in your face with a const_cast.

The addition of volatile to pointer_delta is not the same sort of
thing. I think that change is good, except I agree with Volker that
only the one version is needed.

------------------------------------------------------------------------------

Otherwise looks good to me.

Regarding:

  Now should I also change align_ptr_up for consistency (though I note
  they are already inconsistent in that one takes void* and one takes
  const void*) ?

I think there should be two overloads of each of these, one with const
qualified argument and result, and one without const qualification for
either. That way the result has the same const-ness as the argument.
We could double the number of overloads by similarly dealing with
volatile, but I doubt there are enough relevant callers for that to be
worthwhile; just use const_cast to deal with volatile at the call
sites. But this is all a different issue...

Another option would be to make the argument and result
const-qualified, and make callers deal with the result, but there are
probably enough call sites to make the second overload worthwhile.

From christian.tornqvist at oracle.com Tue Aug 23 21:54:27 2016
From: christian.tornqvist at oracle.com (Christian Tornqvist)
Date: Tue, 23 Aug 2016 17:54:27 -0400
Subject: RFR(S): 8163146 - Remove os::check_heap on Windows
Message-ID: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com>

Hi everyone,

Please review this small change that removes os::check_heap; it has proven
to be unreliable. The same functionality and more can be enabled in Windows
using gflags (pageheap) or by turning on heap debugging in Windbg using the
!heap extension.

Tested by running hotspot_fast_runtime test group on all platforms.
Webrev:
http://cr.openjdk.java.net/~ctornqvi/webrev/8163146/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8163146

Thanks,
Christian

From coleen.phillimore at oracle.com Tue Aug 23 23:05:16 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 23 Aug 2016 19:05:16 -0400
Subject: RFR(S): 8163146 - Remove os::check_heap on Windows
In-Reply-To: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com>
References: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com>
Message-ID: <33baf1f8-4421-a09f-1dc6-4827439d1f58@oracle.com>

This looks great.
thanks!
Coleen

On 8/23/16 5:54 PM, Christian Tornqvist wrote:
> Hi everyone,
>
>
>
> Please review this small change that removes os::check_heap, it's proven to
> be unreliable. The same functionality and more can be enabled in Windows
> using gflags (pageheap) or by turning on heap debugging in Windbg using the
> !heap extension.
>
>
>
> Tested by running hotspot_fast_runtime test group on all platforms.
>
>
>
> Webrev:
>
> http://cr.openjdk.java.net/~ctornqvi/webrev/8163146/webrev.00/
>
>
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8163146
>
>
>
> Thanks,
>
> Christian
>

From coleen.phillimore at oracle.com Tue Aug 23 23:24:37 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 23 Aug 2016 19:24:37 -0400
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <57BC1F10.2000104@oracle.com>
References: <57BC1F10.2000104@oracle.com>
Message-ID: <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com>

This doesn't make sense to me and I have to go in gdb to print out
what -16384 is. It appears that this is trying to detect that we
went below zero from zero, which is an error, but this isn't clear at
all.

It seems that

    if (_refcount >= 0) {

should be > 0 and we should assert if this is ever zero instead, and
allow anything negative to mean that this count has gone immortal.
Kim thought it should use CAS rather than atomic increment and decrement, but maybe that isn't necessary, especially since there isn't a short version of cmpxchg. thanks, Coleen On 8/23/16 6:01 AM, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8161280 > http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ > > > Summary: > > The test was loading a lot of JCK classes into the same VM. Many of > the JCK classes refer to "javasoft/sqe/javatest/Status", so the > refcount (a signed short integer) of this Symbol would run up and past > 0x7fff. > > The assert was caused by a race condition: the refcount started with a > large (16-bit) positive value such as 0x7fff, one thread is > decrementing and several other threads are incrementing. The refcount > will end up being 0x8000 or slightly higher (limited to the number of > concurrent threads that are running within a small window of several > instructions in the decrementing thread, so most likely it will be > 0x800?). > > As a result, the decrementing thread found that the refecount is > negative after the operation, and thought that an underflow had happened. > > The fix is to ignore any value that may appear in the [0x8000 - > 0xbfff] range and do not flag these as underflows (since they are most > likely overflows -- overflows are already handled by making the Symbol > permanent). > > Thanks > - Ioi > > From george.triantafillou at oracle.com Tue Aug 23 23:25:00 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 23 Aug 2016 19:25:00 -0400 Subject: RFR(S): 8163146 - Remove os::check_heap on Windows In-Reply-To: <33baf1f8-4421-a09f-1dc6-4827439d1f58@oracle.com> References: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com> <33baf1f8-4421-a09f-1dc6-4827439d1f58@oracle.com> Message-ID: <1e6f6b6f-37d5-b6a2-bc55-5940cfb23d7f@oracle.com> +1 -George On 8/23/2016 7:05 PM, Coleen Phillimore wrote: > > This looks great. > thanks! 
> Coleen
>
> On 8/23/16 5:54 PM, Christian Tornqvist wrote:
>> Hi everyone,
>>
>>
>> Please review this small change that removes os::check_heap, it's
>> proven to
>> be unreliable. The same functionality and more can be enabled in Windows
>> using gflags (pageheap) or by turning on heap debugging in Windbg
>> using the
>> !heap extension.
>>
>>
>> Tested by running hotspot_fast_runtime test group on all platforms.
>>
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~ctornqvi/webrev/8163146/webrev.00/
>>
>>
>> Bug:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8163146
>>
>>
>> Thanks,
>>
>> Christian
>>
>

From ioi.lam at oracle.com Wed Aug 24 01:03:07 2016
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 23 Aug 2016 18:03:07 -0700
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com>
References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com>
Message-ID: <57BCF24B.3000407@oracle.com>

Hi Coleen, thanks for suggesting the simplification:

    void Symbol::decrement_refcount() {
    #ifdef ASSERT
      if (_refcount == 0) {
        print();
        assert(false, "reference count underflow for symbol");
      }
    #endif
      Atomic::dec(&_refcount);
    }

There's a race condition that won't detect the underflow. E.g., refcount
is 1. Two threads come in and decrement at the same time. We will end
up with -1.

However, it's not worse than before. The old version also has a race
condition:

    refcount is 0
    thread A decrements
    thread B increments
    thread A checks for underflow

The decrementing thread will read _refcount==0 at the end, so it won't
detect the (transient) underflow.

I think the failure to detect underflow is fine, since this happens only
with concurrent access. The kinds of underflow that we are interested in
can usually be caught in single-threaded situations.
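The lost-underflow interleaving described above (refcount is 1, two threads decrement at the same time) can be replayed deterministically in a few lines. This is only a sketch: it uses `std::atomic<int16_t>` in place of HotSpot's `Atomic::dec`, and `ToySymbol`, `check()`, `dec()` and `replay_double_decrement()` are hypothetical names that split the simplified `decrement_refcount()` into its two steps so the interleaving can be scripted on one thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Symbol; std::atomic replaces HotSpot's Atomic::dec.
struct ToySymbol {
  std::atomic<int16_t> refcount;
  explicit ToySymbol(int16_t initial) : refcount(initial) {}

  // Step 1 of the simplified decrement_refcount(): the debug-only check.
  // Returns true if an underflow would be reported.
  bool check() const { return refcount.load() == 0; }

  // Step 2: the unconditional atomic decrement.
  void dec() { refcount.fetch_sub(1); }
};

// Replays "refcount is 1, two threads decrement at the same time":
// both threads run the check before either runs the decrement.
// Returns true if either check reported an underflow.
bool replay_double_decrement(ToySymbol& s) {
  bool a_reports = s.check();  // thread A sees 1
  bool b_reports = s.check();  // thread B sees 1
  s.dec();                     // thread A: 1 -> 0
  s.dec();                     // thread B: 0 -> -1, unnoticed
  return a_reports || b_reports;
}
```

Starting from `ToySymbol s(1)`, the replay leaves the count at -1 without either check firing, which is the benign miss the message accepts for concurrent access.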
Thanks - Ioi On 8/23/16 4:24 PM, Coleen Phillimore wrote: > > This doesn't make sense for me and I have to go in gdb to print out > what -16384 is. It appears that this is trying to detect that we > went below zero from zero, which is an error, but this isn't clear at > all. > > It seems that > > if (_refcount >= 0) { > > > Should be > 0 and we should assert if this is ever zero instead, and > allow anything negative to mean that this count has gone immortal. > > Kim thought it should use CAS rather than atomic increment and > decrement, but maybe that isn't necessary, especially since there > isn't a short version of cmpxchg. > > thanks, > Coleen > > On 8/23/16 6:01 AM, Ioi Lam wrote: >> https://bugs.openjdk.java.net/browse/JDK-8161280 >> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >> >> >> Summary: >> >> The test was loading a lot of JCK classes into the same VM. Many of >> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >> refcount (a signed short integer) of this Symbol would run up and >> past 0x7fff. >> >> The assert was caused by a race condition: the refcount started with >> a large (16-bit) positive value such as 0x7fff, one thread is >> decrementing and several other threads are incrementing. The refcount >> will end up being 0x8000 or slightly higher (limited to the number of >> concurrent threads that are running within a small window of several >> instructions in the decrementing thread, so most likely it will be >> 0x800?). >> >> As a result, the decrementing thread found that the refecount is >> negative after the operation, and thought that an underflow had >> happened. >> >> The fix is to ignore any value that may appear in the [0x8000 - >> 0xbfff] range and do not flag these as underflows (since they are >> most likely overflows -- overflows are already handled by making the >> Symbol permanent). 
>>
>> Thanks
>> - Ioi
>>
>>
>

From david.holmes at oracle.com Wed Aug 24 01:05:06 2016
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 24 Aug 2016 11:05:06 +1000
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <57BC1F10.2000104@oracle.com>
References: <57BC1F10.2000104@oracle.com>
Message-ID: <73f2f2eb-9f60-4a6b-5fc9-eaad4274071f@oracle.com>

Hi Ioi,

On 23/08/2016 8:01 PM, Ioi Lam wrote:
> https://bugs.openjdk.java.net/browse/JDK-8161280
> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/
>
>
> Summary:
>
> The test was loading a lot of JCK classes into the same VM. Many of the
> JCK classes refer to "javasoft/sqe/javatest/Status", so the refcount (a
> signed short integer) of this Symbol would run up and past 0x7fff.
>
> The assert was caused by a race condition: the refcount started with a
> large (16-bit) positive value such as 0x7fff, one thread is decrementing
> and several other threads are incrementing. The refcount will end up
> being 0x8000 or slightly higher (limited to the number of concurrent
> threads that are running within a small window of several instructions
> in the decrementing thread, so most likely it will be 0x800?).
>
> As a result, the decrementing thread found that the refecount is
> negative after the operation, and thought that an underflow had happened.
>
> The fix is to ignore any value that may appear in the [0x8000 - 0xbfff]
> range and do not flag these as underflows (since they are most likely
> overflows -- overflows are already handled by making the Symbol permanent).

This seems fine to me. Essentially we hit overflow but it appeared as
underflow. So we extend the range of the signed value as-if it were
unsigned but only to a point. Once we hit that point, even if
overflowing, it will be flagged as an underflow.
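To make the range argument concrete, here is a small sketch of how a post-decrement value might be classified. It assumes a 16-bit signed refcount as described in the summary; `RefcountState` and `classify_after_decrement` are hypothetical names, and the cut-off at 0xc000 (-16384) is inferred from the [0x8000 - 0xbfff] range quoted above rather than copied from the webrev.

```cpp
#include <cassert>
#include <cstdint>

enum class RefcountState { Ok, LikelyOverflow, Underflow };

// Classifies the value a decrementing thread observes after its decrement.
// Assumption: bit patterns 0x8000..0xbfff (signed -32768..-16385) are treated
// as an overflow past 0x7fff; the remaining negative values are flagged as
// underflow.
RefcountState classify_after_decrement(int16_t refcount) {
  if (refcount >= 0) {
    return RefcountState::Ok;
  }
  uint16_t bits = static_cast<uint16_t>(refcount);
  if (bits >= 0x8000 && bits <= 0xbfff) {
    return RefcountState::LikelyOverflow;  // most likely ran past 0x7fff
  }
  return RefcountState::Underflow;         // 0xc000..0xffff, i.e. -16384..-1
}
```

Under these assumptions a count that races from 0x7fff to 0x8003 is tolerated, while a count of -1 is still reported.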
Another way to handle this would be to start from a negative initial value and allow the counter to cover the range from there, through zero up to the maximum positive value. But the end result is the same so I am fine with this. Thanks, David > Thanks > - Ioi > > From david.holmes at oracle.com Wed Aug 24 01:08:44 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 11:08:44 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BCF24B.3000407@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> Message-ID: On 24/08/2016 11:03 AM, Ioi Lam wrote: > Hi Coleen, thanks for suggestion the simplification: > > void Symbol::decrement_refcount() { > #ifdef ASSERT > if (_refcount == 0) { > print(); > assert(false, "reference count underflow for symbol"); > } > } > #endif > Atomic::dec(&_refcount); > } > > There's a race condition that won't detect the underflow. E.g., refcount > is 1. Two threads comes in and decrement at the same time. We will end > up with -1. So if we're going this path then you can get rid of the race by using Atomic::add(&_refcount, -1), which returns the updated value. If you get back -1 then assert. Cheers, David > However, it's not worse than before. The old version also has a race > condition: > > refcount is 0 > thread A decrements > thread B increments > thread A checks for underflow > > the decrementing thread will read _refcount==0 at the end so it won't > detect the (transient) underflow. > > I think the failure to detect underflow is fine, since this happens only > with concurrent access. The kinds of underflow that we are interested > usually can be caught in single-threaded situations. > > Thanks > - Ioi > > > > > On 8/23/16 4:24 PM, Coleen Phillimore wrote: >> >> This doesn't make sense for me and I have to go in gdb to print out >> what -16384 is. 
It appears that this is trying to detect that we >> went below zero from zero, which is an error, but this isn't clear at >> all. >> >> It seems that >> >> if (_refcount >= 0) { >> >> >> Should be > 0 and we should assert if this is ever zero instead, and >> allow anything negative to mean that this count has gone immortal. >> >> Kim thought it should use CAS rather than atomic increment and >> decrement, but maybe that isn't necessary, especially since there >> isn't a short version of cmpxchg. >> >> thanks, >> Coleen >> >> On 8/23/16 6:01 AM, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>> >>> >>> Summary: >>> >>> The test was loading a lot of JCK classes into the same VM. Many of >>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>> refcount (a signed short integer) of this Symbol would run up and >>> past 0x7fff. >>> >>> The assert was caused by a race condition: the refcount started with >>> a large (16-bit) positive value such as 0x7fff, one thread is >>> decrementing and several other threads are incrementing. The refcount >>> will end up being 0x8000 or slightly higher (limited to the number of >>> concurrent threads that are running within a small window of several >>> instructions in the decrementing thread, so most likely it will be >>> 0x800?). >>> >>> As a result, the decrementing thread found that the refecount is >>> negative after the operation, and thought that an underflow had >>> happened. >>> >>> The fix is to ignore any value that may appear in the [0x8000 - >>> 0xbfff] range and do not flag these as underflows (since they are >>> most likely overflows -- overflows are already handled by making the >>> Symbol permanent). 
>>> >>> Thanks >>> - Ioi >>> >>> >> > From ioi.lam at oracle.com Wed Aug 24 01:32:13 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 23 Aug 2016 18:32:13 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> Message-ID: <57BCF91D.8050900@oracle.com> On 8/23/16 6:08 PM, David Holmes wrote: > On 24/08/2016 11:03 AM, Ioi Lam wrote: >> Hi Coleen, thanks for suggestion the simplification: >> >> void Symbol::decrement_refcount() { >> #ifdef ASSERT >> if (_refcount == 0) { >> print(); >> assert(false, "reference count underflow for symbol"); >> } >> } >> #endif >> Atomic::dec(&_refcount); >> } >> >> There's a race condition that won't detect the underflow. E.g., refcount >> is 1. Two threads comes in and decrement at the same time. We will end >> up with -1. > > So if we're going this path then you can get rid of the race by using > Atomic::add(&_refcount, -1), which returns the updated value. If you > get back -1 then assert. > The problem is we allow -1 to mean "permanent". Symbols in the CDS archive are mapped read-only and have refcount==-1. If we decrement them we get a SEGV. Also, if we blindly decrement the refcount, we could end up rolling back all the way from 0xffff to 0x0, at which point we may free the Symbol which may still be in use. The fundamental problem is we need a two-step operation and we avoid proper synchronization (for performance/avoiding deadlock/etc). So we will always have a race condition somewhere, and we just need to allow only the benign ones. - Ioi > Cheers, > David > >> However, it's not worse than before. The old version also has a race >> condition: >> >> refcount is 0 >> thread A decrements >> thread B increments >> thread A checks for underflow >> >> the decrementing thread will read _refcount==0 at the end so it won't >> detect the (transient) underflow. 
>> >> I think the failure to detect underflow is fine, since this happens only >> with concurrent access. The kinds of underflow that we are interested >> usually can be caught in single-threaded situations. >> >> Thanks >> - Ioi >> >> >> >> >> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>> >>> This doesn't make sense for me and I have to go in gdb to print out >>> what -16384 is. It appears that this is trying to detect that we >>> went below zero from zero, which is an error, but this isn't clear at >>> all. >>> >>> It seems that >>> >>> if (_refcount >= 0) { >>> >>> >>> Should be > 0 and we should assert if this is ever zero instead, and >>> allow anything negative to mean that this count has gone immortal. >>> >>> Kim thought it should use CAS rather than atomic increment and >>> decrement, but maybe that isn't necessary, especially since there >>> isn't a short version of cmpxchg. >>> >>> thanks, >>> Coleen >>> >>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>> >>>> >>>> >>>> Summary: >>>> >>>> The test was loading a lot of JCK classes into the same VM. Many of >>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>> refcount (a signed short integer) of this Symbol would run up and >>>> past 0x7fff. >>>> >>>> The assert was caused by a race condition: the refcount started with >>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>> decrementing and several other threads are incrementing. The refcount >>>> will end up being 0x8000 or slightly higher (limited to the number of >>>> concurrent threads that are running within a small window of several >>>> instructions in the decrementing thread, so most likely it will be >>>> 0x800?). >>>> >>>> As a result, the decrementing thread found that the refecount is >>>> negative after the operation, and thought that an underflow had >>>> happened. 
>>>> >>>> The fix is to ignore any value that may appear in the [0x8000 - >>>> 0xbfff] range and do not flag these as underflows (since they are >>>> most likely overflows -- overflows are already handled by making the >>>> Symbol permanent). >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>> >> From david.holmes at oracle.com Wed Aug 24 01:51:14 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 11:51:14 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BCF91D.8050900@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> Message-ID: On 24/08/2016 11:32 AM, Ioi Lam wrote: > On 8/23/16 6:08 PM, David Holmes wrote: >> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>> Hi Coleen, thanks for suggestion the simplification: >>> >>> void Symbol::decrement_refcount() { >>> #ifdef ASSERT >>> if (_refcount == 0) { >>> print(); >>> assert(false, "reference count underflow for symbol"); >>> } >>> } >>> #endif >>> Atomic::dec(&_refcount); >>> } >>> >>> There's a race condition that won't detect the underflow. E.g., refcount >>> is 1. Two threads comes in and decrement at the same time. We will end >>> up with -1. >> >> So if we're going this path then you can get rid of the race by using >> Atomic::add(&_refcount, -1), which returns the updated value. If you >> get back -1 then assert. >> > > The problem is we allow -1 to mean "permanent". Symbols in the CDS > archive are mapped read-only and have refcount==-1. If we decrement them > we get a SEGV. Unless you set them using the decrement_refcount function I don't see how this is a problem. The assertion would only trigger if we decrement from zero to -1. > Also, if we blindly decrement the refcount, we could end up rolling back > all the way from 0xffff to 0x0, at which point we may free the Symbol > which may still be in use. ?? 
As soon as a bad decrement happens the assert is triggered and the
VM is dead.

David
-----

> The fundamental problem is we need a two-step operation and we avoid
> proper synchronization (for performance/avoiding deadlock/etc). So we
> will always have a race condition somewhere, and we just need to allow
> only the benign ones.
>
> - Ioi
>
>
>> Cheers,
>> David
>>
>>> However, it's not worse than before. The old version also has a race
>>> condition:
>>>
>>> refcount is 0
>>> thread A decrements
>>> thread B increments
>>> thread A checks for underflow
>>>
>>> the decrementing thread will read _refcount==0 at the end so it won't
>>> detect the (transient) underflow.
>>>
>>> I think the failure to detect underflow is fine, since this happens only
>>> with concurrent access. The kinds of underflow that we are interested
>>> usually can be caught in single-threaded situations.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>
>>>
>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote:
>>>>
>>>> This doesn't make sense for me and I have to go in gdb to print out
>>>> what -16384 is. It appears that this is trying to detect that we
>>>> went below zero from zero, which is an error, but this isn't clear at
>>>> all.
>>>>
>>>> It seems that
>>>>
>>>> if (_refcount >= 0) {
>>>>
>>>>
>>>> Should be > 0 and we should assert if this is ever zero instead, and
>>>> allow anything negative to mean that this count has gone immortal.
>>>>
>>>> Kim thought it should use CAS rather than atomic increment and
>>>> decrement, but maybe that isn't necessary, especially since there
>>>> isn't a short version of cmpxchg.
>>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>> On 8/23/16 6:01 AM, Ioi Lam wrote:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280
>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/
>>>>>
>>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>> The test was loading a lot of JCK classes into the same VM.
Many of >>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>> refcount (a signed short integer) of this Symbol would run up and >>>>> past 0x7fff. >>>>> >>>>> The assert was caused by a race condition: the refcount started with >>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>> decrementing and several other threads are incrementing. The refcount >>>>> will end up being 0x8000 or slightly higher (limited to the number of >>>>> concurrent threads that are running within a small window of several >>>>> instructions in the decrementing thread, so most likely it will be >>>>> 0x800?). >>>>> >>>>> As a result, the decrementing thread found that the refecount is >>>>> negative after the operation, and thought that an underflow had >>>>> happened. >>>>> >>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>> 0xbfff] range and do not flag these as underflows (since they are >>>>> most likely overflows -- overflows are already handled by making the >>>>> Symbol permanent). 
>>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> >>>> >>> > From ioi.lam at oracle.com Wed Aug 24 02:24:45 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 23 Aug 2016 19:24:45 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> Message-ID: <57BD056D.1020303@oracle.com> On 8/23/16 6:51 PM, David Holmes wrote: > On 24/08/2016 11:32 AM, Ioi Lam wrote: >> On 8/23/16 6:08 PM, David Holmes wrote: >>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>> Hi Coleen, thanks for suggestion the simplification: >>>> >>>> void Symbol::decrement_refcount() { >>>> #ifdef ASSERT >>>> if (_refcount == 0) { >>>> print(); >>>> assert(false, "reference count underflow for symbol"); >>>> } >>>> } >>>> #endif >>>> Atomic::dec(&_refcount); >>>> } >>>> >>>> There's a race condition that won't detect the underflow. E.g., >>>> refcount >>>> is 1. Two threads comes in and decrement at the same time. We will end >>>> up with -1. >>> >>> So if we're going this path then you can get rid of the race by using >>> Atomic::add(&_refcount, -1), which returns the updated value. If you >>> get back -1 then assert. >>> >> >> The problem is we allow -1 to mean "permanent". Symbols in the CDS >> archive are mapped read-only and have refcount==-1. If we decrement them >> we get a SEGV. > > Unless you set them using the decrement_refcount function I don't see > how this is a problem. The assertion would only trigger if we > decrement from zero to -1. > >> Also, if we blindly decrement the refcount, we could end up rolling back >> all the way from 0xffff to 0x0, at which point we may free the Symbol >> which may still be in use. > > ?? As soon as a bad decrement happens the assert is trigerred and the > VM is dead. The assert doesn't happen in product VM. 
Currently the product VM will continue to work, and all symbols with underflown/overflown refcounts will be considered as permanent. - Ioi > > David > ----- > >> The fundamental problem is we need a two-step operation and we avoid >> proper synchronization (for performance/avoiding deadlock/etc). So we >> will always have a race condition somewhere, and we just need to allow >> only the benign ones. >> >> - Ioi >> >> >>> Cheers, >>> David >>> >>>> However, it's not worse than before. The old version also has a race >>>> condition: >>>> >>>> refcount is 0 >>>> thread A decrements >>>> thread B increments >>>> thread A checks for underflow >>>> >>>> the decrementing thread will read _refcount==0 at the end so it won't >>>> detect the (transient) underflow. >>>> >>>> I think the failure to detect underflow is fine, since this happens >>>> only >>>> with concurrent access. The kinds of underflow that we are interested >>>> usually can be caught in single-threaded situations. >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>> >>>> >>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>> >>>>> This doesn't make sense for me and I have to go in gdb to print out >>>>> what -16384 is. It appears that this is trying to detect that we >>>>> went below zero from zero, which is an error, but this isn't clear at >>>>> all. >>>>> >>>>> It seems that >>>>> >>>>> if (_refcount >= 0) { >>>>> >>>>> >>>>> Should be > 0 and we should assert if this is ever zero instead, and >>>>> allow anything negative to mean that this count has gone immortal. >>>>> >>>>> Kim thought it should use CAS rather than atomic increment and >>>>> decrement, but maybe that isn't necessary, especially since there >>>>> isn't a short version of cmpxchg. 
>>>>> >>>>> thanks, >>>>> Coleen >>>>> >>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> The test was loading a lot of JCK classes into the same VM. Many of >>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>> refcount (a signed short integer) of this Symbol would run up and >>>>>> past 0x7fff. >>>>>> >>>>>> The assert was caused by a race condition: the refcount started with >>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>> decrementing and several other threads are incrementing. The >>>>>> refcount >>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>> number of >>>>>> concurrent threads that are running within a small window of several >>>>>> instructions in the decrementing thread, so most likely it will be >>>>>> 0x800?). >>>>>> >>>>>> As a result, the decrementing thread found that the refecount is >>>>>> negative after the operation, and thought that an underflow had >>>>>> happened. >>>>>> >>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>> 0xbfff] range and do not flag these as underflows (since they are >>>>>> most likely overflows -- overflows are already handled by making the >>>>>> Symbol permanent). 
>>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> >>>>> >>>> >> From david.holmes at oracle.com Wed Aug 24 02:54:01 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 12:54:01 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BD056D.1020303@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> Message-ID: <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> On 24/08/2016 12:24 PM, Ioi Lam wrote: > > > On 8/23/16 6:51 PM, David Holmes wrote: >> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>> On 8/23/16 6:08 PM, David Holmes wrote: >>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>> Hi Coleen, thanks for suggestion the simplification: >>>>> >>>>> void Symbol::decrement_refcount() { >>>>> #ifdef ASSERT >>>>> if (_refcount == 0) { >>>>> print(); >>>>> assert(false, "reference count underflow for symbol"); >>>>> } >>>>> } >>>>> #endif >>>>> Atomic::dec(&_refcount); >>>>> } >>>>> >>>>> There's a race condition that won't detect the underflow. E.g., >>>>> refcount >>>>> is 1. Two threads comes in and decrement at the same time. We will end >>>>> up with -1. >>>> >>>> So if we're going this path then you can get rid of the race by using >>>> Atomic::add(&_refcount, -1), which returns the updated value. If you >>>> get back -1 then assert. >>>> >>> >>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>> archive are mapped read-only and have refcount==-1. If we decrement them >>> we get a SEGV. >> >> Unless you set them using the decrement_refcount function I don't see >> how this is a problem. The assertion would only trigger if we >> decrement from zero to -1. >> >>> Also, if we blindly decrement the refcount, we could end up rolling back >>> all the way from 0xffff to 0x0, at which point we may free the Symbol >>> which may still be in use. >> >> ?? 
As soon as a bad decrement happens the assert is trigerred and the >> VM is dead. > > The assert doesn't happen in product VM. Currently the product VM will > continue to work, and all symbols with underflown/overflown refcounts > will be considered as permanent. Okay - note this bug is about an assertion failure :) So there are two problems: 1. There is a race related to the assert that the Atomic::add(-1) can fix. 2. There is the product-mode unconditional decrementing of the refcount which will mark everything as "permanent". I haven't see any suggestions to address that existing issue. One obvious suggestion is to use a different value than -1 so it doesn't easily get hit by unexpected underflow. David > - Ioi > > >> >> David >> ----- >> >>> The fundamental problem is we need a two-step operation and we avoid >>> proper synchronization (for performance/avoiding deadlock/etc). So we >>> will always have a race condition somewhere, and we just need to allow >>> only the benign ones. >>> >>> - Ioi >>> >>> >>>> Cheers, >>>> David >>>> >>>>> However, it's not worse than before. The old version also has a race >>>>> condition: >>>>> >>>>> refcount is 0 >>>>> thread A decrements >>>>> thread B increments >>>>> thread A checks for underflow >>>>> >>>>> the decrementing thread will read _refcount==0 at the end so it won't >>>>> detect the (transient) underflow. >>>>> >>>>> I think the failure to detect underflow is fine, since this happens >>>>> only >>>>> with concurrent access. The kinds of underflow that we are interested >>>>> usually can be caught in single-threaded situations. >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> >>>>> >>>>> >>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>> >>>>>> This doesn't make sense for me and I have to go in gdb to print out >>>>>> what -16384 is. It appears that this is trying to detect that we >>>>>> went below zero from zero, which is an error, but this isn't clear at >>>>>> all. 
>>>>>> >>>>>> It seems that >>>>>> >>>>>> if (_refcount >= 0) { >>>>>> >>>>>> >>>>>> Should be > 0 and we should assert if this is ever zero instead, and >>>>>> allow anything negative to mean that this count has gone immortal. >>>>>> >>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>> decrement, but maybe that isn't necessary, especially since there >>>>>> isn't a short version of cmpxchg. >>>>>> >>>>>> thanks, >>>>>> Coleen >>>>>> >>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> >>>>>>> The test was loading a lot of JCK classes into the same VM. Many of >>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>>> refcount (a signed short integer) of this Symbol would run up and >>>>>>> past 0x7fff. >>>>>>> >>>>>>> The assert was caused by a race condition: the refcount started with >>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>> decrementing and several other threads are incrementing. The >>>>>>> refcount >>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>> number of >>>>>>> concurrent threads that are running within a small window of several >>>>>>> instructions in the decrementing thread, so most likely it will be >>>>>>> 0x800?). >>>>>>> >>>>>>> As a result, the decrementing thread found that the refecount is >>>>>>> negative after the operation, and thought that an underflow had >>>>>>> happened. >>>>>>> >>>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>>> 0xbfff] range and do not flag these as underflows (since they are >>>>>>> most likely overflows -- overflows are already handled by making the >>>>>>> Symbol permanent). 
>>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>> >>>>> >>> > From ioi.lam at oracle.com Wed Aug 24 03:44:29 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 23 Aug 2016 20:44:29 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> Message-ID: <57BD181D.30108@oracle.com> On 8/23/16 7:54 PM, David Holmes wrote: > On 24/08/2016 12:24 PM, Ioi Lam wrote: >> >> >> On 8/23/16 6:51 PM, David Holmes wrote: >>> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>>> On 8/23/16 6:08 PM, David Holmes wrote: >>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>>> Hi Coleen, thanks for suggestion the simplification: >>>>>> >>>>>> void Symbol::decrement_refcount() { >>>>>> #ifdef ASSERT >>>>>> if (_refcount == 0) { >>>>>> print(); >>>>>> assert(false, "reference count underflow for symbol"); >>>>>> } >>>>>> } >>>>>> #endif >>>>>> Atomic::dec(&_refcount); >>>>>> } >>>>>> >>>>>> There's a race condition that won't detect the underflow. E.g., >>>>>> refcount >>>>>> is 1. Two threads comes in and decrement at the same time. We >>>>>> will end >>>>>> up with -1. >>>>> >>>>> So if we're going this path then you can get rid of the race by using >>>>> Atomic::add(&_refcount, -1), which returns the updated value. If you >>>>> get back -1 then assert. >>>>> >>>> >>>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>>> archive are mapped read-only and have refcount==-1. If we decrement >>>> them >>>> we get a SEGV. >>> >>> Unless you set them using the decrement_refcount function I don't see >>> how this is a problem. The assertion would only trigger if we >>> decrement from zero to -1. 
>>> >>>> Also, if we blindly decrement the refcount, we could end up rolling >>>> back >>>> all the way from 0xffff to 0x0, at which point we may free the Symbol >>>> which may still be in use. >>> >>> ?? As soon as a bad decrement happens the assert is trigerred and the >>> VM is dead. >> >> The assert doesn't happen in product VM. Currently the product VM will >> continue to work, and all symbols with underflown/overflown refcounts >> will be considered as permanent. > > Okay - note this bug is about an assertion failure :) > > So there are two problems: > > 1. There is a race related to the assert that the Atomic::add(-1) can > fix. > > 2. There is the product-mode unconditional decrementing of the > refcount which will mark everything as "permanent". I haven't see any > suggestions to address that existing issue. One obvious suggestion is > to use a different value than -1 so it doesn't easily get hit by > unexpected underflow. > Here's an updated version that's an improvement over my original patch: + A negative refcount means permanent symbols. + Only non-permanent symbols can be incremented/decremented + Underflow is detected by 0 -> -1 transition. Real underflows are always caught. + Theoretically there's still a race condition for false asserts: [1] thread A observes that refcount is 0x7fff, proceeds to decrement it [2] but in the mean time, around 32768 threads come in. They all observe that the refcount is non-negative, and then all proceed to increment the refcount to 0x0. [3] thread A decrements the refcount and observes that refcount is -1 afterwards but I think this can be safely ignored :-) http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/ I like this version as it doesn't change the logic for the product VM and only limits the possibility of false asserts. 
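[Editorial sketch, not part of the original message.] The refcount rules listed above (a negative count means permanent, only non-permanent symbols are incremented or decremented, underflow shows up as a 0 -> -1 transition) can be illustrated roughly as follows. This is a minimal sketch, not the webrev code: `RefCounted` and its methods are hypothetical names, and `std::atomic<int16_t>` stands in for HotSpot's own `Atomic` class and `jshort` field.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for Symbol; the names here are illustrative only.
class RefCounted {
  std::atomic<int16_t> _refcount;

 public:
  explicit RefCounted(int16_t initial) : _refcount(initial) {}

  int16_t refcount() const { return _refcount.load(); }

  // Any negative count is treated as "permanent".
  bool is_permanent() const { return refcount() < 0; }

  void increment_refcount() {
    // Step 1: leave permanent (negative) counts alone.
    if (_refcount.load() >= 0) {
      // Step 2: not atomic with step 1. Incrementing past 0x7fff wraps to a
      // negative value, so an overflowed count becomes permanent.
      _refcount.fetch_add(1);
    }
  }

  // Returns false when this call observed the 0 -> -1 transition (underflow).
  bool decrement_refcount() {
    if (_refcount.load() >= 0) {
      // fetch_sub returns the previous value; subtract 1 to get the new one.
      int16_t new_value = static_cast<int16_t>(_refcount.fetch_sub(1) - 1);
      return new_value != -1;
    }
    return true;  // permanent: nothing to decrement
  }
};
```

The check and the update remain two separate steps, so the sketch has the same benign race window discussed in the thread; it only shows where the 0 -> -1 test sits.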
Thanks - Ioi > David > >> - Ioi >> >> >>> >>> David >>> ----- >>> >>>> The fundamental problem is we need a two-step operation and we avoid >>>> proper synchronization (for performance/avoiding deadlock/etc). So we >>>> will always have a race condition somewhere, and we just need to allow >>>> only the benign ones. >>>> >>>> - Ioi >>>> >>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> However, it's not worse than before. The old version also has a race >>>>>> condition: >>>>>> >>>>>> refcount is 0 >>>>>> thread A decrements >>>>>> thread B increments >>>>>> thread A checks for underflow >>>>>> >>>>>> the decrementing thread will read _refcount==0 at the end so it >>>>>> won't >>>>>> detect the (transient) underflow. >>>>>> >>>>>> I think the failure to detect underflow is fine, since this happens >>>>>> only >>>>>> with concurrent access. The kinds of underflow that we are >>>>>> interested >>>>>> usually can be caught in single-threaded situations. >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>>> >>>>>>> This doesn't make sense for me and I have to go in gdb to print out >>>>>>> what -16384 is. It appears that this is trying to detect that we >>>>>>> went below zero from zero, which is an error, but this isn't >>>>>>> clear at >>>>>>> all. >>>>>>> >>>>>>> It seems that >>>>>>> >>>>>>> if (_refcount >= 0) { >>>>>>> >>>>>>> >>>>>>> Should be > 0 and we should assert if this is ever zero instead, >>>>>>> and >>>>>>> allow anything negative to mean that this count has gone immortal. >>>>>>> >>>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>>> decrement, but maybe that isn't necessary, especially since there >>>>>>> isn't a short version of cmpxchg. 
>>>>>>> >>>>>>> thanks, >>>>>>> Coleen >>>>>>> >>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> >>>>>>>> The test was loading a lot of JCK classes into the same VM. >>>>>>>> Many of >>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>>>> refcount (a signed short integer) of this Symbol would run up and >>>>>>>> past 0x7fff. >>>>>>>> >>>>>>>> The assert was caused by a race condition: the refcount started >>>>>>>> with >>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>>> decrementing and several other threads are incrementing. The >>>>>>>> refcount >>>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>>> number of >>>>>>>> concurrent threads that are running within a small window of >>>>>>>> several >>>>>>>> instructions in the decrementing thread, so most likely it will be >>>>>>>> 0x800?). >>>>>>>> >>>>>>>> As a result, the decrementing thread found that the refecount is >>>>>>>> negative after the operation, and thought that an underflow had >>>>>>>> happened. >>>>>>>> >>>>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>>>> 0xbfff] range and do not flag these as underflows (since they are >>>>>>>> most likely overflows -- overflows are already handled by >>>>>>>> making the >>>>>>>> Symbol permanent). 
>>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> From david.holmes at oracle.com Wed Aug 24 05:12:10 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 15:12:10 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BD181D.30108@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> Message-ID: <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> Hi Ioi, Sorry I have a problem here. I just realized there is no Atomic::add(jshort*, jshort). But I can't accept changing the Atomic::inc/dec API only for the jshort case (and why return jint ??). So either the whole Atomic inc/dec API needs updating across the board (not a small task) or else we need to introduce Atomic::add(jshort*, jshort). The latter seems a small task - the existing inc/dec implementations can call the new add version. Also in a product build might the "new_value" variable trigger an "unused" warning? 
If so an ugly option would be: DEBUG_ONLY(jshort new_value =) Atomic::dec(...); assert(new_value != -1, "..."); Thanks, David On 24/08/2016 1:44 PM, Ioi Lam wrote: > > > On 8/23/16 7:54 PM, David Holmes wrote: >> On 24/08/2016 12:24 PM, Ioi Lam wrote: >>> >>> >>> On 8/23/16 6:51 PM, David Holmes wrote: >>>> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>>>> On 8/23/16 6:08 PM, David Holmes wrote: >>>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>>>> Hi Coleen, thanks for suggestion the simplification: >>>>>>> >>>>>>> void Symbol::decrement_refcount() { >>>>>>> #ifdef ASSERT >>>>>>> if (_refcount == 0) { >>>>>>> print(); >>>>>>> assert(false, "reference count underflow for symbol"); >>>>>>> } >>>>>>> } >>>>>>> #endif >>>>>>> Atomic::dec(&_refcount); >>>>>>> } >>>>>>> >>>>>>> There's a race condition that won't detect the underflow. E.g., >>>>>>> refcount >>>>>>> is 1. Two threads comes in and decrement at the same time. We >>>>>>> will end >>>>>>> up with -1. >>>>>> >>>>>> So if we're going this path then you can get rid of the race by using >>>>>> Atomic::add(&_refcount, -1), which returns the updated value. If you >>>>>> get back -1 then assert. >>>>>> >>>>> >>>>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>>>> archive are mapped read-only and have refcount==-1. If we decrement >>>>> them >>>>> we get a SEGV. >>>> >>>> Unless you set them using the decrement_refcount function I don't see >>>> how this is a problem. The assertion would only trigger if we >>>> decrement from zero to -1. >>>> >>>>> Also, if we blindly decrement the refcount, we could end up rolling >>>>> back >>>>> all the way from 0xffff to 0x0, at which point we may free the Symbol >>>>> which may still be in use. >>>> >>>> ?? As soon as a bad decrement happens the assert is trigerred and the >>>> VM is dead. >>> >>> The assert doesn't happen in product VM. 
Currently the product VM will >>> continue to work, and all symbols with underflown/overflown refcounts >>> will be considered as permanent. >> >> Okay - note this bug is about an assertion failure :) >> >> So there are two problems: >> >> 1. There is a race related to the assert that the Atomic::add(-1) can >> fix. >> >> 2. There is the product-mode unconditional decrementing of the >> refcount which will mark everything as "permanent". I haven't see any >> suggestions to address that existing issue. One obvious suggestion is >> to use a different value than -1 so it doesn't easily get hit by >> unexpected underflow. >> > > Here's an updated version that's an improvement over my original patch: > > + A negative refcount means permanent symbols. > + Only non-permanent symbols can be incremented/decremented > + Underflow is detected by 0 -> -1 transition. Real underflows are > always caught. > + Theoretically there's still a race condition for false asserts: > > [1] thread A observes that refcount is 0x7fff, proceeds to decrement it > [2] but in the mean time, around 32768 threads come in. They all observe > that the refcount is non-negative, and then all proceed to increment the > refcount to 0x0. > [3] thread A decrements the refcount and observes that refcount is -1 > afterwards > > but I think this can be safely ignored :-) > > http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/ > > > I like this version as it doesn't change the logic for the product VM > and only limits the possibility of false asserts. > > Thanks > - Ioi > >> David >> >>> - Ioi >>> >>> >>>> >>>> David >>>> ----- >>>> >>>>> The fundamental problem is we need a two-step operation and we avoid >>>>> proper synchronization (for performance/avoiding deadlock/etc). So we >>>>> will always have a race condition somewhere, and we just need to allow >>>>> only the benign ones. 
>>>>> >>>>> - Ioi >>>>> >>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>>> However, it's not worse than before. The old version also has a race >>>>>>> condition: >>>>>>> >>>>>>> refcount is 0 >>>>>>> thread A decrements >>>>>>> thread B increments >>>>>>> thread A checks for underflow >>>>>>> >>>>>>> the decrementing thread will read _refcount==0 at the end so it >>>>>>> won't >>>>>>> detect the (transient) underflow. >>>>>>> >>>>>>> I think the failure to detect underflow is fine, since this happens >>>>>>> only >>>>>>> with concurrent access. The kinds of underflow that we are >>>>>>> interested >>>>>>> usually can be caught in single-threaded situations. >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>>>> >>>>>>>> This doesn't make sense for me and I have to go in gdb to print out >>>>>>>> what -16384 is. It appears that this is trying to detect that we >>>>>>>> went below zero from zero, which is an error, but this isn't >>>>>>>> clear at >>>>>>>> all. >>>>>>>> >>>>>>>> It seems that >>>>>>>> >>>>>>>> if (_refcount >= 0) { >>>>>>>> >>>>>>>> >>>>>>>> Should be > 0 and we should assert if this is ever zero instead, >>>>>>>> and >>>>>>>> allow anything negative to mean that this count has gone immortal. >>>>>>>> >>>>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>>>> decrement, but maybe that isn't necessary, especially since there >>>>>>>> isn't a short version of cmpxchg. >>>>>>>> >>>>>>>> thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> >>>>>>>>> The test was loading a lot of JCK classes into the same VM. 
>>>>>>>>> Many of >>>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>>>>> refcount (a signed short integer) of this Symbol would run up and >>>>>>>>> past 0x7fff. >>>>>>>>> >>>>>>>>> The assert was caused by a race condition: the refcount started >>>>>>>>> with >>>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>>>> decrementing and several other threads are incrementing. The >>>>>>>>> refcount >>>>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>>>> number of >>>>>>>>> concurrent threads that are running within a small window of >>>>>>>>> several >>>>>>>>> instructions in the decrementing thread, so most likely it will be >>>>>>>>> 0x800?). >>>>>>>>> >>>>>>>>> As a result, the decrementing thread found that the refecount is >>>>>>>>> negative after the operation, and thought that an underflow had >>>>>>>>> happened. >>>>>>>>> >>>>>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>>>>> 0xbfff] range and do not flag these as underflows (since they are >>>>>>>>> most likely overflows -- overflows are already handled by >>>>>>>>> making the >>>>>>>>> Symbol permanent). >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> > From david.holmes at oracle.com Wed Aug 24 05:21:56 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 15:21:56 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> Message-ID: <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> Hi Kim, Thanks for looking at this. Webrev updated in-place. Comments inline. 
On 24/08/2016 6:25 AM, Kim Barrett wrote:
>> On Aug 23, 2016, at 4:55 AM, David Holmes wrote:
>>
>> Hi Volker, Andrew,
>>
>> On 23/08/2016 12:27 AM, Volker Simonis wrote:
>>> Hi,
>>>
>>> I don't particularly like the const_casts as well.
>>
>> I would have thought this was exactly the kind of thing const_cast was good for - avoiding the need to define multiple overloads to deal with volatile, non-volatile, const etc.
>>
>>> Why not change pointer_delta to accept pointers to volatiles as well:
>>>
>>> pointer_delta(const volatile void* left, const volatile void* right,
>>
>> I can do that. I also have to make a similar change to align_ptr_down. Now should I also change align_ptr_up for consistency (though I note they are already inconsistent in that one takes void* and one takes const void*) ?
>>
>> Alternative webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/
>
> ------------------------------------------------------------------------------
> src/share/vm/runtime/atomic.hpp
> 155 assert(sizeof(jbyte) == 1, "assumption");
>
> STATIC_ASSERT would be better here.

Changed.

> ------------------------------------------------------------------------------
> src/share/vm/utilities/globalDefinitions.hpp
> 524 inline void* align_ptr_down(volatile void* ptr, size_t alignment) {
> 525   return (void*)align_size_down((intptr_t)ptr, (intptr_t)alignment);
> 526 }
>
> I think implicitly (to the caller of align_ptr_down) casting away
> volatile like this is a mistake. I disagree with the rationale for
> this change; stripping off volatile (or const) *should* be annoyingly
> in your face with a const_cast.

Yep my bad - volatile in, volatile out:

inline volatile void* align_ptr_down(volatile void* ptr, size_t alignment) {
  return (volatile void*)align_size_down((intptr_t)ptr, (intptr_t)alignment);
}

This also leads to a change to the static_cast to be "volatile jint*".

> The addition of volatile to pointer_delta is not the same sort of
> thing.
> I think that change is good, except I agree with Volker that
> only the one version is needed.

Fixed. I hadn't appreciated what Volker was saying about one version.

> ------------------------------------------------------------------------------
>
> Otherwise looks good to me.
>
> Regarding:
>
> Now should I also change align_ptr_up for consistency (though I note
> they are already inconsistent in that one takes void* and one takes
> const void*) ?
>
> I think there should be two overloads of each of these, one with const
> qualified argument and result, and one without const qualification for
> either. That way the result has the same const-ness as the argument.
> We could double the number of overloads by similarly dealing with
> volatile, but I doubt there are enough relevant callers for that to be
> worthwhile; just use const_cast to deal with volatile at the call
> sites. But this is all a different issue...

Agreed - separate issue if/when it becomes an issue.

Thanks,
David

> Another option would be to make the argument and result
> const-qualified, and make callers deal with the result, but there are
> probably enough call sites to make the second overload worthwhile.
>
>

From robbin.ehn at oracle.com Wed Aug 24 06:28:12 2016
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 24 Aug 2016 08:28:12 +0200
Subject: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools
In-Reply-To:
References:
Message-ID:

Hi Dmitry,

Looks good, thanks for fixing!
/Robbin

On 08/17/2016 09:51 AM, Dmitry Samersoff wrote:
> Everybody,
>
> Please review the changes:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/
>
> -Dmitry
>

From robbin.ehn at oracle.com Wed Aug 24 06:28:47 2016
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 24 Aug 2016 08:28:47 +0200
Subject: RFR: 8158628: test/java/lang/instrument/NativeMethodPrefixAgent.java: Error occurred during initialization of VM: Failed to start tracing backend.
In-Reply-To:
References: <2955a5da-204b-2321-e825-2656ba788037@oracle.com>
Message-ID: <6e81a010-08ab-8fa5-f2fe-18a25dd34f19@oracle.com>

Thanks Staffan!

/Robbin

On 08/23/2016 07:48 PM, Staffan Larsen wrote:
> Looks good!
>
> Thanks,
> /Staffan
>
>> On 23 aug. 2016, at 18:29, Robbin Ehn wrote:
>>
>> Hi all,
>>
>> This test should not run with jfr.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~rehn/8158628/webrev/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8158628
>>
>> /Robbin
>

From robbin.ehn at oracle.com Wed Aug 24 06:30:50 2016
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 24 Aug 2016 08:30:50 +0200
Subject: RFR: 8158628: test/java/lang/instrument/NativeMethodPrefixAgent.java: Error occurred during initialization of VM: Failed to start tracing backend.
In-Reply-To: <20864a87-7dd4-e00d-4d15-fa4d99b01e0f@oracle.com>
References: <2955a5da-204b-2321-e825-2656ba788037@oracle.com> <20864a87-7dd4-e00d-4d15-fa4d99b01e0f@oracle.com>
Message-ID:

Thanks George!

/Robbin

On 08/23/2016 08:11 PM, George Triantafillou wrote:
> Hi Robbin,
>
> Looks good, (r)eviewed.
>
> -George
>
> On 8/23/2016 12:29 PM, Robbin Ehn wrote:
>> Hi all,
>>
>> This test should not run with jfr.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~rehn/8158628/webrev/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8158628
>>
>> /Robbin
>

From thomas.stuefe at gmail.com Wed Aug 24 06:32:52 2016
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 24 Aug 2016 08:32:52 +0200
Subject: RFR(S): 8163146 - Remove os::check_heap on Windows
In-Reply-To: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com>
References: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com>
Message-ID:

Hi Christian,

this looks fine. Out of curiosity, what were the issues? Did you get false positives or did HeapValidate not find anything?

Kind Regards, Thomas

On Tue, Aug 23, 2016 at 11:54 PM, Christian Tornqvist <christian.tornqvist at oracle.com> wrote:
> Hi everyone,
>
> Please review this small change that removes os::check_heap, it's proven to
> be unreliable. The same functionality and more can be enabled in Windows
> using gflags (pageheap) or by turning on heap debugging in Windbg using the
> !heap extension.
>
> Tested by running hotspot_fast_runtime test group on all platforms.
>
> Webrev:
> http://cr.openjdk.java.net/~ctornqvi/webrev/8163146/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8163146
>
> Thanks,
> Christian
>
>

From ioi.lam at oracle.com Wed Aug 24 07:01:29 2016
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 24 Aug 2016 00:01:29 -0700
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com>
References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com>
Message-ID: <57BD4649.6000804@oracle.com>

Hi David,

Here's an updated version that added Atomic::add(jshort*, jshort) as you suggested.
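[Editorial sketch, not part of the original message.] The shape of a 16-bit add that returns the updated value, together with one way to avoid the "unused variable" warning raised earlier in the thread, might look like the following. It is only an assumption-laden illustration: `add_and_fetch` and `decrement` are hypothetical helpers built on `std::atomic<int16_t>`, and the standard `assert` macro (disabled by `NDEBUG`) stands in for HotSpot's debug-only assert. The actual `Atomic::add(jshort*, jshort)` in the webrev is not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: add to a 16-bit counter and return the UPDATED value.
// std::atomic::fetch_add returns the previous value, so the delta is re-applied.
inline int16_t add_and_fetch(std::atomic<int16_t>& dest, int16_t add_value) {
  int16_t old_value = dest.fetch_add(add_value);
  return static_cast<int16_t>(old_value + add_value);
}

// Hypothetical decrement that flags the 0 -> -1 transition in debug builds.
inline void decrement(std::atomic<int16_t>& refcount) {
  int16_t new_value = add_and_fetch(refcount, -1);
  assert(new_value != -1 && "reference count underflow");
  // When assert() is compiled out, new_value would otherwise be unused;
  // the cast to void marks it as intentionally ignored.
  (void)new_value;
}
```

Because the updated value comes from the atomic operation itself, two threads decrementing from 1 at the same time cannot both miss the underflow: exactly one of them sees -1.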
To appease the "unused" warnings, I just added (void)new_value. http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ I am running RBT with "--test hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" to make sure everything works. Thanks - Ioi On 8/23/16 10:12 PM, David Holmes wrote: > Hi Ioi, > > Sorry I have a problem here. I just realized there is no > Atomic::add(jshort*, jshort). But I can't accept changing the > Atomic::inc/dec API only for the jshort case (and why return jint ??). > So either the whole Atomic inc/dec API needs updating across the board > (not a small task) or else we need to introduce Atomic::add(jshort*, > jshort). The latter seems a small task - the existing inc/dec > implementations can call the new add version. > > Also in a product build might the "new_value" variable trigger an > "unused" warning? If so an ugly option would be: > > DEBUG_ONLY(jshort new_value =) Atomic::dec(...); > assert(new_value != -1, "..."); > > Thanks, > David > > On 24/08/2016 1:44 PM, Ioi Lam wrote: >> >> >> On 8/23/16 7:54 PM, David Holmes wrote: >>> On 24/08/2016 12:24 PM, Ioi Lam wrote: >>>> >>>> >>>> On 8/23/16 6:51 PM, David Holmes wrote: >>>>> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>>>>> On 8/23/16 6:08 PM, David Holmes wrote: >>>>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>>>>> Hi Coleen, thanks for suggestion the simplification: >>>>>>>> >>>>>>>> void Symbol::decrement_refcount() { >>>>>>>> #ifdef ASSERT >>>>>>>> if (_refcount == 0) { >>>>>>>> print(); >>>>>>>> assert(false, "reference count underflow for symbol"); >>>>>>>> } >>>>>>>> } >>>>>>>> #endif >>>>>>>> Atomic::dec(&_refcount); >>>>>>>> } >>>>>>>> >>>>>>>> There's a race condition that won't detect the underflow. E.g., >>>>>>>> refcount >>>>>>>> is 1. Two threads comes in and decrement at the same time. We >>>>>>>> will end >>>>>>>> up with -1. 
>>>>>>> >>>>>>> So if we're going this path then you can get rid of the race by >>>>>>> using >>>>>>> Atomic::add(&_refcount, -1), which returns the updated value. If >>>>>>> you >>>>>>> get back -1 then assert. >>>>>>> >>>>>> >>>>>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>>>>> archive are mapped read-only and have refcount==-1. If we decrement >>>>>> them >>>>>> we get a SEGV. >>>>> >>>>> Unless you set them using the decrement_refcount function I don't see >>>>> how this is a problem. The assertion would only trigger if we >>>>> decrement from zero to -1. >>>>> >>>>>> Also, if we blindly decrement the refcount, we could end up rolling >>>>>> back >>>>>> all the way from 0xffff to 0x0, at which point we may free the >>>>>> Symbol >>>>>> which may still be in use. >>>>> >>>>> ?? As soon as a bad decrement happens the assert is trigerred and the >>>>> VM is dead. >>>> >>>> The assert doesn't happen in product VM. Currently the product VM will >>>> continue to work, and all symbols with underflown/overflown refcounts >>>> will be considered as permanent. >>> >>> Okay - note this bug is about an assertion failure :) >>> >>> So there are two problems: >>> >>> 1. There is a race related to the assert that the Atomic::add(-1) can >>> fix. >>> >>> 2. There is the product-mode unconditional decrementing of the >>> refcount which will mark everything as "permanent". I haven't see any >>> suggestions to address that existing issue. One obvious suggestion is >>> to use a different value than -1 so it doesn't easily get hit by >>> unexpected underflow. >>> >> >> Here's an updated version that's an improvement over my original patch: >> >> + A negative refcount means permanent symbols. >> + Only non-permanent symbols can be incremented/decremented >> + Underflow is detected by 0 -> -1 transition. Real underflows are >> always caught. 
>> + Theoretically there's still a race condition for false asserts: >> >> [1] thread A observes that refcount is 0x7fff, proceeds to decrement it >> [2] but in the mean time, around 32768 threads come in. They all observe >> that the refcount is non-negative, and then all proceed to increment the >> refcount to 0x0. >> [3] thread A decrements the refcount and observes that refcount is -1 >> afterwards >> >> but I think this can be safely ignored :-) >> >> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/ >> >> >> >> I like this version as it doesn't change the logic for the product VM >> and only limits the possibility of false asserts. >> >> Thanks >> - Ioi >> >>> David >>> >>>> - Ioi >>>> >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> The fundamental problem is we need a two-step operation and we avoid >>>>>> proper synchronization (for performance/avoiding deadlock/etc). >>>>>> So we >>>>>> will always have a race condition somewhere, and we just need to >>>>>> allow >>>>>> only the benign ones. >>>>>> >>>>>> - Ioi >>>>>> >>>>>> >>>>>>> Cheers, >>>>>>> David >>>>>>> >>>>>>>> However, it's not worse than before. The old version also has a >>>>>>>> race >>>>>>>> condition: >>>>>>>> >>>>>>>> refcount is 0 >>>>>>>> thread A decrements >>>>>>>> thread B increments >>>>>>>> thread A checks for underflow >>>>>>>> >>>>>>>> the decrementing thread will read _refcount==0 at the end so it >>>>>>>> won't >>>>>>>> detect the (transient) underflow. >>>>>>>> >>>>>>>> I think the failure to detect underflow is fine, since this >>>>>>>> happens >>>>>>>> only >>>>>>>> with concurrent access. The kinds of underflow that we are >>>>>>>> interested >>>>>>>> usually can be caught in single-threaded situations. 
>>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>>>>> >>>>>>>>> This doesn't make sense for me and I have to go in gdb to >>>>>>>>> print out >>>>>>>>> what -16384 is. It appears that this is trying to detect >>>>>>>>> that we >>>>>>>>> went below zero from zero, which is an error, but this isn't >>>>>>>>> clear at >>>>>>>>> all. >>>>>>>>> >>>>>>>>> It seems that >>>>>>>>> >>>>>>>>> if (_refcount >= 0) { >>>>>>>>> >>>>>>>>> >>>>>>>>> Should be > 0 and we should assert if this is ever zero instead, >>>>>>>>> and >>>>>>>>> allow anything negative to mean that this count has gone >>>>>>>>> immortal. >>>>>>>>> >>>>>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>>>>> decrement, but maybe that isn't necessary, especially since there >>>>>>>>> isn't a short version of cmpxchg. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> The test was loading a lot of JCK classes into the same VM. >>>>>>>>>> Many of >>>>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>>>>>> refcount (a signed short integer) of this Symbol would run up >>>>>>>>>> and >>>>>>>>>> past 0x7fff. >>>>>>>>>> >>>>>>>>>> The assert was caused by a race condition: the refcount started >>>>>>>>>> with >>>>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>>>>> decrementing and several other threads are incrementing. 
The >>>>>>>>>> refcount >>>>>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>>>>> number of >>>>>>>>>> concurrent threads that are running within a small window of >>>>>>>>>> several >>>>>>>>>> instructions in the decrementing thread, so most likely it >>>>>>>>>> will be >>>>>>>>>> 0x800?). >>>>>>>>>> >>>>>>>>>> As a result, the decrementing thread found that the refecount is >>>>>>>>>> negative after the operation, and thought that an underflow had >>>>>>>>>> happened. >>>>>>>>>> >>>>>>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>>>>>> 0xbfff] range and do not flag these as underflows (since they >>>>>>>>>> are >>>>>>>>>> most likely overflows -- overflows are already handled by >>>>>>>>>> making the >>>>>>>>>> Symbol permanent). >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> - Ioi >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >> From robbin.ehn at oracle.com Wed Aug 24 07:05:23 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 24 Aug 2016 09:05:23 +0200 Subject: RFR: 8164208: Update tests with redefine classes UL options and tags In-Reply-To: References: Message-ID: <7ba8b64d-c31d-f509-759c-65ffa1bb16e9@oracle.com> Thanks George! /Robbin On 08/23/2016 08:10 PM, George Triantafillou wrote: > Hi Robbin, > > Looks good, (r)eviewed. > > -George > > On 8/23/2016 12:12 PM, Robbin Ehn wrote: >> Hi all, >> >> This converts TraceRedefineClasses to UL in our tests. >> >> Webrev: >> http://cr.openjdk.java.net/~rehn/8164208/ >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8164208 >> >> Thanks! >> >> /Robbin > From robbin.ehn at oracle.com Wed Aug 24 07:12:09 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 24 Aug 2016 09:12:09 +0200 Subject: RFR: 8164208: Update tests with redefine classes UL options and tags In-Reply-To: References: Message-ID: Thanks Coleen! On 08/23/2016 08:42 PM, Coleen Phillimore wrote: > > Robbin, > > I think this looks good. Was the output not excessive? 
No, 1300 lines in the jtr file for the most verbose test. /Robbin > thanks, > Coleen > > On 8/23/16 12:12 PM, Robbin Ehn wrote: >> Hi all, >> >> This converts TraceRedefineClasses to UL in our tests. >> >> Webrev: >> http://cr.openjdk.java.net/~rehn/8164208/ >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8164208 >> >> Thanks! >> >> /Robbin > From david.holmes at oracle.com Wed Aug 24 07:14:29 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 17:14:29 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BD4649.6000804@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> Message-ID: <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> Hi Ioi, On 24/08/2016 5:01 PM, Ioi Lam wrote: > Hi David, > > Here's an updated version that added Atomic::add(jshort*, jshort) as you > suggested. Thanks. Looks good. > To appease the "unused" warnings, I just added (void)new_value. I think I prefer the DEBUG_ONLY version. :) Cheers, David > http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ > > > I am running RBT with "--test > hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" > to make sure everything works. > > Thanks > - Ioi > > On 8/23/16 10:12 PM, David Holmes wrote: >> Hi Ioi, >> >> Sorry I have a problem here. I just realized there is no >> Atomic::add(jshort*, jshort). But I can't accept changing the >> Atomic::inc/dec API only for the jshort case (and why return jint ??). >> So either the whole Atomic inc/dec API needs updating across the board >> (not a small task) or else we need to introduce Atomic::add(jshort*, >> jshort). 
The latter seems a small task - the existing inc/dec >> implementations can call the new add version. >> >> Also in a product build might the "new_value" variable trigger an >> "unused" warning? If so an ugly option would be: >> >> DEBUG_ONLY(jshort new_value =) Atomic::dec(...); >> assert(new_value != -1, "..."); >> >> Thanks, >> David >> >> On 24/08/2016 1:44 PM, Ioi Lam wrote: >>> >>> >>> On 8/23/16 7:54 PM, David Holmes wrote: >>>> On 24/08/2016 12:24 PM, Ioi Lam wrote: >>>>> >>>>> >>>>> On 8/23/16 6:51 PM, David Holmes wrote: >>>>>> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>>>>>> On 8/23/16 6:08 PM, David Holmes wrote: >>>>>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>>>>>> Hi Coleen, thanks for suggestion the simplification: >>>>>>>>> >>>>>>>>> void Symbol::decrement_refcount() { >>>>>>>>> #ifdef ASSERT >>>>>>>>> if (_refcount == 0) { >>>>>>>>> print(); >>>>>>>>> assert(false, "reference count underflow for symbol"); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> #endif >>>>>>>>> Atomic::dec(&_refcount); >>>>>>>>> } >>>>>>>>> >>>>>>>>> There's a race condition that won't detect the underflow. E.g., >>>>>>>>> refcount >>>>>>>>> is 1. Two threads comes in and decrement at the same time. We >>>>>>>>> will end >>>>>>>>> up with -1. >>>>>>>> >>>>>>>> So if we're going this path then you can get rid of the race by >>>>>>>> using >>>>>>>> Atomic::add(&_refcount, -1), which returns the updated value. If >>>>>>>> you >>>>>>>> get back -1 then assert. >>>>>>>> >>>>>>> >>>>>>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>>>>>> archive are mapped read-only and have refcount==-1. If we decrement >>>>>>> them >>>>>>> we get a SEGV. >>>>>> >>>>>> Unless you set them using the decrement_refcount function I don't see >>>>>> how this is a problem. The assertion would only trigger if we >>>>>> decrement from zero to -1. 
>>>>>> >>>>>>> Also, if we blindly decrement the refcount, we could end up rolling >>>>>>> back >>>>>>> all the way from 0xffff to 0x0, at which point we may free the >>>>>>> Symbol >>>>>>> which may still be in use. >>>>>> >>>>>> ?? As soon as a bad decrement happens the assert is trigerred and the >>>>>> VM is dead. >>>>> >>>>> The assert doesn't happen in product VM. Currently the product VM will >>>>> continue to work, and all symbols with underflown/overflown refcounts >>>>> will be considered as permanent. >>>> >>>> Okay - note this bug is about an assertion failure :) >>>> >>>> So there are two problems: >>>> >>>> 1. There is a race related to the assert that the Atomic::add(-1) can >>>> fix. >>>> >>>> 2. There is the product-mode unconditional decrementing of the >>>> refcount which will mark everything as "permanent". I haven't see any >>>> suggestions to address that existing issue. One obvious suggestion is >>>> to use a different value than -1 so it doesn't easily get hit by >>>> unexpected underflow. >>>> >>> >>> Here's an updated version that's an improvement over my original patch: >>> >>> + A negative refcount means permanent symbols. >>> + Only non-permanent symbols can be incremented/decremented >>> + Underflow is detected by 0 -> -1 transition. Real underflows are >>> always caught. >>> + Theoretically there's still a race condition for false asserts: >>> >>> [1] thread A observes that refcount is 0x7fff, proceeds to decrement it >>> [2] but in the mean time, around 32768 threads come in. They all observe >>> that the refcount is non-negative, and then all proceed to increment the >>> refcount to 0x0. 
>>> [3] thread A decrements the refcount and observes that refcount is -1 >>> afterwards >>> >>> but I think this can be safely ignored :-) >>> >>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/ >>> >>> >>> >>> I like this version as it doesn't change the logic for the product VM >>> and only limits the possibility of false asserts. >>> >>> Thanks >>> - Ioi >>> >>>> David >>>> >>>>> - Ioi >>>>> >>>>> >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> The fundamental problem is we need a two-step operation and we avoid >>>>>>> proper synchronization (for performance/avoiding deadlock/etc). >>>>>>> So we >>>>>>> will always have a race condition somewhere, and we just need to >>>>>>> allow >>>>>>> only the benign ones. >>>>>>> >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>>> Cheers, >>>>>>>> David >>>>>>>> >>>>>>>>> However, it's not worse than before. The old version also has a >>>>>>>>> race >>>>>>>>> condition: >>>>>>>>> >>>>>>>>> refcount is 0 >>>>>>>>> thread A decrements >>>>>>>>> thread B increments >>>>>>>>> thread A checks for underflow >>>>>>>>> >>>>>>>>> the decrementing thread will read _refcount==0 at the end so it >>>>>>>>> won't >>>>>>>>> detect the (transient) underflow. >>>>>>>>> >>>>>>>>> I think the failure to detect underflow is fine, since this >>>>>>>>> happens >>>>>>>>> only >>>>>>>>> with concurrent access. The kinds of underflow that we are >>>>>>>>> interested >>>>>>>>> usually can be caught in single-threaded situations. >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>>>>>> >>>>>>>>>> This doesn't make sense for me and I have to go in gdb to >>>>>>>>>> print out >>>>>>>>>> what -16384 is. It appears that this is trying to detect >>>>>>>>>> that we >>>>>>>>>> went below zero from zero, which is an error, but this isn't >>>>>>>>>> clear at >>>>>>>>>> all. 
>>>>>>>>>> >>>>>>>>>> It seems that >>>>>>>>>> >>>>>>>>>> if (_refcount >= 0) { >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Should be > 0 and we should assert if this is ever zero instead, >>>>>>>>>> and >>>>>>>>>> allow anything negative to mean that this count has gone >>>>>>>>>> immortal. >>>>>>>>>> >>>>>>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>>>>>> decrement, but maybe that isn't necessary, especially since there >>>>>>>>>> isn't a short version of cmpxchg. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Summary: >>>>>>>>>>> >>>>>>>>>>> The test was loading a lot of JCK classes into the same VM. >>>>>>>>>>> Many of >>>>>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the >>>>>>>>>>> refcount (a signed short integer) of this Symbol would run up >>>>>>>>>>> and >>>>>>>>>>> past 0x7fff. >>>>>>>>>>> >>>>>>>>>>> The assert was caused by a race condition: the refcount started >>>>>>>>>>> with >>>>>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>>>>>> decrementing and several other threads are incrementing. The >>>>>>>>>>> refcount >>>>>>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>>>>>> number of >>>>>>>>>>> concurrent threads that are running within a small window of >>>>>>>>>>> several >>>>>>>>>>> instructions in the decrementing thread, so most likely it >>>>>>>>>>> will be >>>>>>>>>>> 0x800?). >>>>>>>>>>> >>>>>>>>>>> As a result, the decrementing thread found that the refecount is >>>>>>>>>>> negative after the operation, and thought that an underflow had >>>>>>>>>>> happened. 
>>>>>>>>>>> >>>>>>>>>>> The fix is to ignore any value that may appear in the [0x8000 - >>>>>>>>>>> 0xbfff] range and do not flag these as underflows (since they >>>>>>>>>>> are >>>>>>>>>>> most likely overflows -- overflows are already handled by >>>>>>>>>>> making the >>>>>>>>>>> Symbol permanent). >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> - Ioi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> > From aph at redhat.com Wed Aug 24 07:36:29 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Aug 2016 08:36:29 +0100 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: Message-ID: On 22/08/16 08:54, David Holmes wrote: > An earlier code review noticed that the default shared implementation of > Atomic::cmpxchg(jbyte*) was missing the required post-memory-barrier in > case of an initial failure: Just to satisfy my curiosity, why is the post-memory-barrier in case of an initial failure required? Is there some specification somewhere that I can refer to? Thanks, Andrew. From david.holmes at oracle.com Wed Aug 24 08:01:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Aug 2016 18:01:24 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: Message-ID: <34dbe14b-38d3-6755-aa78-9a48371be1da@oracle.com> On 24/08/2016 5:36 PM, Andrew Haley wrote: > On 22/08/16 08:54, David Holmes wrote: >> An earlier code review noticed that the default shared implementation of >> Atomic::cmpxchg(jbyte*) was missing the required post-memory-barrier in >> case of an initial failure: > > Just to satisfy my curiosity, why is the post-memory-barrier in > case of an initial failure required? Is there some specification > somewhere that I can refer to? In atomic.hpp: // All of the atomic operations that imply a read-modify-write action // guarantee a two-way memory barrier across that operation. 
and then each op has additional descriptions e.g.: // Performs atomic compare of *dest and compare_value, and exchanges // *dest with exchange_value if the comparison succeeded. Returns prior // value of *dest. cmpxchg*() provide: // compare-and-exchange David > Thanks, > > Andrew. > From ioi.lam at oracle.com Wed Aug 24 08:28:17 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 24 Aug 2016 01:28:17 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> Message-ID: <57BD5AA1.9090204@oracle.com> On 8/24/16 12:14 AM, David Holmes wrote: > Hi Ioi, > > On 24/08/2016 5:01 PM, Ioi Lam wrote: >> Hi David, >> >> Here's an updated version that added Atomic::add(jshort*, jshort) as you >> suggested. > > Thanks. Looks good. > >> To appease the "unused" warnings, I just added (void)new_value. > > I think I prefer the DEBUG_ONLY version. :) > I am not a big fan of ending DEBUG_ONLY with a "=", so I'll keep the code as is :-) - Ioi > Cheers, > David > >> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ >> >> >> >> I am running RBT with "--test >> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" >> to make sure everything works. >> >> Thanks >> - Ioi >> >> On 8/23/16 10:12 PM, David Holmes wrote: >>> Hi Ioi, >>> >>> Sorry I have a problem here. I just realized there is no >>> Atomic::add(jshort*, jshort). But I can't accept changing the >>> Atomic::inc/dec API only for the jshort case (and why return jint ??). 
>>> So either the whole Atomic inc/dec API needs updating across the board >>> (not a small task) or else we need to introduce Atomic::add(jshort*, >>> jshort). The latter seems a small task - the existing inc/dec >>> implementations can call the new add version. >>> >>> Also in a product build might the "new_value" variable trigger an >>> "unused" warning? If so an ugly option would be: >>> >>> DEBUG_ONLY(jshort new_value =) Atomic::dec(...); >>> assert(new_value != -1, "..."); >>> >>> Thanks, >>> David >>> >>> On 24/08/2016 1:44 PM, Ioi Lam wrote: >>>> >>>> >>>> On 8/23/16 7:54 PM, David Holmes wrote: >>>>> On 24/08/2016 12:24 PM, Ioi Lam wrote: >>>>>> >>>>>> >>>>>> On 8/23/16 6:51 PM, David Holmes wrote: >>>>>>> On 24/08/2016 11:32 AM, Ioi Lam wrote: >>>>>>>> On 8/23/16 6:08 PM, David Holmes wrote: >>>>>>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote: >>>>>>>>>> Hi Coleen, thanks for suggestion the simplification: >>>>>>>>>> >>>>>>>>>> void Symbol::decrement_refcount() { >>>>>>>>>> #ifdef ASSERT >>>>>>>>>> if (_refcount == 0) { >>>>>>>>>> print(); >>>>>>>>>> assert(false, "reference count underflow for symbol"); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> #endif >>>>>>>>>> Atomic::dec(&_refcount); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> There's a race condition that won't detect the underflow. E.g., >>>>>>>>>> refcount >>>>>>>>>> is 1. Two threads comes in and decrement at the same time. We >>>>>>>>>> will end >>>>>>>>>> up with -1. >>>>>>>>> >>>>>>>>> So if we're going this path then you can get rid of the race by >>>>>>>>> using >>>>>>>>> Atomic::add(&_refcount, -1), which returns the updated value. If >>>>>>>>> you >>>>>>>>> get back -1 then assert. >>>>>>>>> >>>>>>>> >>>>>>>> The problem is we allow -1 to mean "permanent". Symbols in the CDS >>>>>>>> archive are mapped read-only and have refcount==-1. If we >>>>>>>> decrement >>>>>>>> them >>>>>>>> we get a SEGV. 
>>>>>>> >>>>>>> Unless you set them using the decrement_refcount function I >>>>>>> don't see >>>>>>> how this is a problem. The assertion would only trigger if we >>>>>>> decrement from zero to -1. >>>>>>> >>>>>>>> Also, if we blindly decrement the refcount, we could end up >>>>>>>> rolling >>>>>>>> back >>>>>>>> all the way from 0xffff to 0x0, at which point we may free the >>>>>>>> Symbol >>>>>>>> which may still be in use. >>>>>>> >>>>>>> ?? As soon as a bad decrement happens the assert is trigerred >>>>>>> and the >>>>>>> VM is dead. >>>>>> >>>>>> The assert doesn't happen in product VM. Currently the product VM >>>>>> will >>>>>> continue to work, and all symbols with underflown/overflown >>>>>> refcounts >>>>>> will be considered as permanent. >>>>> >>>>> Okay - note this bug is about an assertion failure :) >>>>> >>>>> So there are two problems: >>>>> >>>>> 1. There is a race related to the assert that the Atomic::add(-1) can >>>>> fix. >>>>> >>>>> 2. There is the product-mode unconditional decrementing of the >>>>> refcount which will mark everything as "permanent". I haven't see any >>>>> suggestions to address that existing issue. One obvious suggestion is >>>>> to use a different value than -1 so it doesn't easily get hit by >>>>> unexpected underflow. >>>>> >>>> >>>> Here's an updated version that's an improvement over my original >>>> patch: >>>> >>>> + A negative refcount means permanent symbols. >>>> + Only non-permanent symbols can be incremented/decremented >>>> + Underflow is detected by 0 -> -1 transition. Real underflows are >>>> always caught. >>>> + Theoretically there's still a race condition for false asserts: >>>> >>>> [1] thread A observes that refcount is 0x7fff, proceeds to >>>> decrement it >>>> [2] but in the mean time, around 32768 threads come in. They all >>>> observe >>>> that the refcount is non-negative, and then all proceed to >>>> increment the >>>> refcount to 0x0. 
>>>> [3] thread A decrements the refcount and observes that refcount is -1 >>>> afterwards >>>> >>>> but I think this can be safely ignored :-) >>>> >>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/ >>>> >>>> >>>> >>>> >>>> I like this version as it doesn't change the logic for the product VM >>>> and only limits the possibility of false asserts. >>>> >>>> Thanks >>>> - Ioi >>>> >>>>> David >>>>> >>>>>> - Ioi >>>>>> >>>>>> >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> The fundamental problem is we need a two-step operation and we >>>>>>>> avoid >>>>>>>> proper synchronization (for performance/avoiding deadlock/etc). >>>>>>>> So we >>>>>>>> will always have a race condition somewhere, and we just need to >>>>>>>> allow >>>>>>>> only the benign ones. >>>>>>>> >>>>>>>> - Ioi >>>>>>>> >>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> However, it's not worse than before. The old version also has a >>>>>>>>>> race >>>>>>>>>> condition: >>>>>>>>>> >>>>>>>>>> refcount is 0 >>>>>>>>>> thread A decrements >>>>>>>>>> thread B increments >>>>>>>>>> thread A checks for underflow >>>>>>>>>> >>>>>>>>>> the decrementing thread will read _refcount==0 at the end so it >>>>>>>>>> won't >>>>>>>>>> detect the (transient) underflow. >>>>>>>>>> >>>>>>>>>> I think the failure to detect underflow is fine, since this >>>>>>>>>> happens >>>>>>>>>> only >>>>>>>>>> with concurrent access. The kinds of underflow that we are >>>>>>>>>> interested >>>>>>>>>> usually can be caught in single-threaded situations. >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> - Ioi >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote: >>>>>>>>>>> >>>>>>>>>>> This doesn't make sense for me and I have to go in gdb to >>>>>>>>>>> print out >>>>>>>>>>> what -16384 is. 
It appears that this is trying to detect >>>>>>>>>>> that we >>>>>>>>>>> went below zero from zero, which is an error, but this isn't >>>>>>>>>>> clear at >>>>>>>>>>> all. >>>>>>>>>>> >>>>>>>>>>> It seems that >>>>>>>>>>> >>>>>>>>>>> if (_refcount >= 0) { >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Should be > 0 and we should assert if this is ever zero >>>>>>>>>>> instead, >>>>>>>>>>> and >>>>>>>>>>> allow anything negative to mean that this count has gone >>>>>>>>>>> immortal. >>>>>>>>>>> >>>>>>>>>>> Kim thought it should use CAS rather than atomic increment and >>>>>>>>>>> decrement, but maybe that isn't necessary, especially since >>>>>>>>>>> there >>>>>>>>>>> isn't a short version of cmpxchg. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280 >>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Summary: >>>>>>>>>>>> >>>>>>>>>>>> The test was loading a lot of JCK classes into the same VM. >>>>>>>>>>>> Many of >>>>>>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status", so >>>>>>>>>>>> the >>>>>>>>>>>> refcount (a signed short integer) of this Symbol would run up >>>>>>>>>>>> and >>>>>>>>>>>> past 0x7fff. >>>>>>>>>>>> >>>>>>>>>>>> The assert was caused by a race condition: the refcount >>>>>>>>>>>> started >>>>>>>>>>>> with >>>>>>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is >>>>>>>>>>>> decrementing and several other threads are incrementing. The >>>>>>>>>>>> refcount >>>>>>>>>>>> will end up being 0x8000 or slightly higher (limited to the >>>>>>>>>>>> number of >>>>>>>>>>>> concurrent threads that are running within a small window of >>>>>>>>>>>> several >>>>>>>>>>>> instructions in the decrementing thread, so most likely it >>>>>>>>>>>> will be >>>>>>>>>>>> 0x800?). 
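[Editorial illustration, not part of the original thread.] The hex values in the summary and the -16384 mentioned in the review are easier to relate once the 16-bit patterns are written out as signed numbers. The sketch below is plain two's-complement arithmetic; as_signed16 is a hypothetical helper, and the link between -16384 and the upper bound of the ignored range is an inference from the quoted text, not something the thread states outright.

```cpp
#include <cstdint>

// Interpret the low 16 bits of `bits` as a signed two's-complement value.
// Done arithmetically so it does not depend on narrowing-conversion rules.
constexpr int as_signed16(unsigned bits) {
  return (bits & 0x8000u) ? static_cast<int>(bits & 0xFFFFu) - 0x10000
                          : static_cast<int>(bits & 0xFFFFu);
}

static_assert(as_signed16(0x7fff) == 32767,  "largest positive 16-bit refcount");
static_assert(as_signed16(0x8000) == -32768, "0x7fff + 1 wraps to the most negative value");
static_assert(as_signed16(0xbfff) == -16385, "upper end of the [0x8000 - 0xbfff] range");
static_assert(as_signed16(0xc000) == -16384, "first value above that range");
static_assert(as_signed16(0xffff) == -1,     "the value used for permanent symbols");
```

So a refcount that overflows from 0x7fff lands near -32768, far from the -1 that a genuine 0 -> -1 underflow produces, which is presumably why the v01 patch treated everything below -16384 as an overflow rather than an underflow.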
>>>>>>>>>>>> >>>>>>>>>>>> As a result, the decrementing thread found that the >>>>>>>>>>>> refecount is >>>>>>>>>>>> negative after the operation, and thought that an underflow >>>>>>>>>>>> had >>>>>>>>>>>> happened. >>>>>>>>>>>> >>>>>>>>>>>> The fix is to ignore any value that may appear in the >>>>>>>>>>>> [0x8000 - >>>>>>>>>>>> 0xbfff] range and do not flag these as underflows (since they >>>>>>>>>>>> are >>>>>>>>>>>> most likely overflows -- overflows are already handled by >>>>>>>>>>>> making the >>>>>>>>>>>> Symbol permanent). >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> - Ioi >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>> >> From christian.tornqvist at oracle.com Wed Aug 24 10:30:36 2016 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Wed, 24 Aug 2016 06:30:36 -0400 Subject: RFR(S): 8163146 - Remove os::check_heap on Windows In-Reply-To: References: <20ce01d1fd88$eb75cac0$c2616040$@oracle.com> Message-ID: <226901d1fdf2$8e04fea0$aa0efbe0$@oracle.com> Hi Thomas, Yes, we?ve seen a lot of false positives. Thanks, Christian From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] Sent: Wednesday, August 24, 2016 2:33 AM To: Christian Tornqvist Cc: hotspot-runtime-dev Subject: Re: RFR(S): 8163146 - Remove os::check_heap on Windows Hi Christian, this looks fine. Out of curiosity, what were the issues? Did you get false positives or did HeapValidate not find anything? Kind Regards, Thomas On Tue, Aug 23, 2016 at 11:54 PM, Christian Tornqvist > wrote: Hi everyone, Please review this small change that removes os::check_heap, it's proven to be unreliable. The same functionality and more can be enabled in Windows using gflags (pageheap) or by turning on heap debugging in Windbg using the !heap extension. Tested by running hotspot_fast_runtime test group on all platforms. 
Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8163146/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8163146 Thanks, Christian From dmitry.samersoff at oracle.com Wed Aug 24 11:42:46 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 24 Aug 2016 14:42:46 +0300 Subject: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools In-Reply-To: <1f8101d1fd71$feb399d0$fc1acd70$@oracle.com> References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> <57BC4B41.60305@oracle.com> <938232c7-c206-f0db-7446-78960537ad2b@oracle.com> <1f8101d1fd71$feb399d0$fc1acd70$@oracle.com> Message-ID: <9fc8324f-4f49-bb09-4f97-ba478d5368be@oracle.com> Christian, Thank you for the review. Please see the updated webrev: http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.03/ I still have no idea why this @build construction works with @run driver but doesn't work with @run main/othervm. Is there a chance to have all such knowledge documented? > You don't need to explicitly build JpsHelper, I would prefer to leave it as is - it's harmless but highlights the TestJpsJar dependency. > would it make sense to change this to use the /test/lib ones and I tried it [1] and it doesn't work. jtreg claims that package jdk.test.lib doesn't exist [2]. 1. http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/ 2. http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/TestJpsClass.jtr -Dmitry On 2016-08-23 22:10, Christian Tornqvist wrote: > Hi Dmitry, > > You don't need to explicitly build JpsHelper, > I also noticed that > you're using ProcessTools and OutputAnalyzer from /lib/testlibrary , > would it make sense to change this to use the /test/lib ones and > simply have: > > @library /test/lib > > ? 
> > Thanks, Christian -----Original Message----- From: > hotspot-runtime-dev > [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of > Dmitry Samersoff Sent: Tuesday, August 23, 2016 3:02 PM To: Ioi Lam > ; serviceability-dev at openjdk.java.net; > hotspot-runtime-dev Subject: > Re: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java > fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools > > Ioi, > > Thank you for review. > > Hmm. It looks like changes below solves the problem. > > - * @build jdk.testlibrary.* JpsHelper JpsBase + * @build JpsHelper > JpsBase > > I'm running rbt job to verify it. > > -Dmitry > > On 2016-08-23 16:10, Ioi Lam wrote: >> Hi Dmitry, >> >> Why are you adding /test/lib: >> >> - * @library /lib/testlibrary + * @library /lib/testlibrary >> /test/lib >> >> The only class used by jdk/test/sun/tools/jps/*.java in /test/lib >> is here: >> >> TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp; >> >> But TestJpsSanity.java is not use by this test -- I ran the test >> with your patch in a clean jtreg directory and the test passed, but >> I don't see TestJpsSanity.class, or any jdk.test.lib.* class. >> >> So I don't think you need to add /test/lib. >> >> - Ioi >> >> On 8/23/16 5:34 AM, Dmitry Samersoff wrote: >>> On 2016-08-17 10:51, Dmitry Samersoff wrote: >>>> Everybody, >>>> >>>> Please review the changes: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ >>>> >>>> -Dmitry >>>> >>> >> > > > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, > Russia * I would love to change the world, but they won't give me the > sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From coleen.phillimore at oracle.com Wed Aug 24 12:14:46 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 24 Aug 2016 08:14:46 -0400
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <57BD5AA1.9090204@oracle.com>
References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> <57BD5AA1.9090204@oracle.com>
Message-ID:

I still don't want us to observe a _refcount == 0 - this is always an
error.

+ if (_refcount >= 0) { // not a permanent symbol

Can this be changed to something like:

+ volatile int ref = _refcount;
+ assert(ref != 0, "underflow");
+ if (ref > 0) { // not a permanent symbol

I really like the atomic::add checking return value.

I'm working on a change for later that will eagerly delete symbols
whose refcounts go to zero.

Thanks,
Coleen

On 8/24/16 4:28 AM, Ioi Lam wrote:
>
>
> On 8/24/16 12:14 AM, David Holmes wrote:
>> Hi Ioi,
>>
>> On 24/08/2016 5:01 PM, Ioi Lam wrote:
>>> Hi David,
>>>
>>> Here's an updated version that added Atomic::add(jshort*, jshort) as
>>> you
>>> suggested.
>>
>> Thanks. Looks good.
>>
>>> To appease the "unused" warnings, I just added (void)new_value.
>>
>> I think I prefer the DEBUG_ONLY version. :)
>>
>
> I am not a big fan of ending DEBUG_ONLY with a "=", so I'll keep the
> code as is :-)
>
> - Ioi
>
>
>> Cheers,
>> David
>>
>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/
>>>
>>>
>>>
>>> I am running RBT with "--test
>>> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist"
>>>
>>> to make sure everything works.
>>>
>>> Thanks
>>> - Ioi
>>>
>>> On 8/23/16 10:12 PM, David Holmes wrote:
>>>> Hi Ioi,
>>>>
>>>> Sorry I have a problem here. I just realized there is no
>>>> Atomic::add(jshort*, jshort). But I can't accept changing the
>>>> Atomic::inc/dec API only for the jshort case (and why return jint ??).
>>>> So either the whole Atomic inc/dec API needs updating across the board
>>>> (not a small task) or else we need to introduce Atomic::add(jshort*,
>>>> jshort). The latter seems a small task - the existing inc/dec
>>>> implementations can call the new add version.
>>>>
>>>> Also in a product build might the "new_value" variable trigger an
>>>> "unused" warning? If so an ugly option would be:
>>>>
>>>> DEBUG_ONLY(jshort new_value =) Atomic::dec(...);
>>>> assert(new_value != -1, "...");
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 24/08/2016 1:44 PM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 8/23/16 7:54 PM, David Holmes wrote:
>>>>>> On 24/08/2016 12:24 PM, Ioi Lam wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 8/23/16 6:51 PM, David Holmes wrote:
>>>>>>>> On 24/08/2016 11:32 AM, Ioi Lam wrote:
>>>>>>>>> On 8/23/16 6:08 PM, David Holmes wrote:
>>>>>>>>>> On 24/08/2016 11:03 AM, Ioi Lam wrote:
>>>>>>>>>>> Hi Coleen, thanks for suggestion the simplification:
>>>>>>>>>>>
>>>>>>>>>>> void Symbol::decrement_refcount() {
>>>>>>>>>>> #ifdef ASSERT
>>>>>>>>>>> if (_refcount == 0) {
>>>>>>>>>>> print();
>>>>>>>>>>> assert(false, "reference count underflow for symbol");
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>> #endif
>>>>>>>>>>> Atomic::dec(&_refcount);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> There's a race condition that won't detect the underflow. E.g.,
>>>>>>>>>>> refcount
>>>>>>>>>>> is 1. Two threads comes in and decrement at the same time. We
>>>>>>>>>>> will end
>>>>>>>>>>> up with -1.
>>>>>>>>>>
>>>>>>>>>> So if we're going this path then you can get rid of the race by
>>>>>>>>>> using
>>>>>>>>>> Atomic::add(&_refcount, -1), which returns the updated value. If
>>>>>>>>>> you
>>>>>>>>>> get back -1 then assert.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The problem is we allow -1 to mean "permanent". Symbols in the
>>>>>>>>> CDS
>>>>>>>>> archive are mapped read-only and have refcount==-1. If we
>>>>>>>>> decrement
>>>>>>>>> them
>>>>>>>>> we get a SEGV.
>>>>>>>>
>>>>>>>> Unless you set them using the decrement_refcount function I
>>>>>>>> don't see
>>>>>>>> how this is a problem. The assertion would only trigger if we
>>>>>>>> decrement from zero to -1.
>>>>>>>>
>>>>>>>>> Also, if we blindly decrement the refcount, we could end up
>>>>>>>>> rolling
>>>>>>>>> back
>>>>>>>>> all the way from 0xffff to 0x0, at which point we may free the
>>>>>>>>> Symbol
>>>>>>>>> which may still be in use.
>>>>>>>>
>>>>>>>> ?? As soon as a bad decrement happens the assert is trigerred
>>>>>>>> and the
>>>>>>>> VM is dead.
>>>>>>>
>>>>>>> The assert doesn't happen in product VM. Currently the product
>>>>>>> VM will
>>>>>>> continue to work, and all symbols with underflown/overflown
>>>>>>> refcounts
>>>>>>> will be considered as permanent.
>>>>>>
>>>>>> Okay - note this bug is about an assertion failure :)
>>>>>>
>>>>>> So there are two problems:
>>>>>>
>>>>>> 1. There is a race related to the assert that the Atomic::add(-1)
>>>>>> can
>>>>>> fix.
>>>>>>
>>>>>> 2. There is the product-mode unconditional decrementing of the
>>>>>> refcount which will mark everything as "permanent". I haven't see
>>>>>> any
>>>>>> suggestions to address that existing issue. One obvious
>>>>>> suggestion is
>>>>>> to use a different value than -1 so it doesn't easily get hit by
>>>>>> unexpected underflow.
>>>>>>
>>>>>
>>>>> Here's an updated version that's an improvement over my original
>>>>> patch:
>>>>>
>>>>> + A negative refcount means permanent symbols.
>>>>> + Only non-permanent symbols can be incremented/decremented
>>>>> + Underflow is detected by 0 -> -1 transition. Real underflows are
>>>>> always caught.
>>>>> + Theoretically there's still a race condition for false asserts:
>>>>>
>>>>> [1] thread A observes that refcount is 0x7fff, proceeds to
>>>>> decrement it
>>>>> [2] but in the mean time, around 32768 threads come in. They all
>>>>> observe
>>>>> that the refcount is non-negative, and then all proceed to
>>>>> increment the
>>>>> refcount to 0x0.
>>>>> [3] thread A decrements the refcount and observes that refcount is -1
>>>>> afterwards
>>>>>
>>>>> but I think this can be safely ignored :-)
>>>>>
>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v02/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I like this version as it doesn't change the logic for the product VM
>>>>> and only limits the possibility of false asserts.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>> David
>>>>>>
>>>>>>> - Ioi
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>> The fundamental problem is we need a two-step operation and we
>>>>>>>>> avoid
>>>>>>>>> proper synchronization (for performance/avoiding deadlock/etc).
>>>>>>>>> So we
>>>>>>>>> will always have a race condition somewhere, and we just need to
>>>>>>>>> allow
>>>>>>>>> only the benign ones.
>>>>>>>>>
>>>>>>>>> - Ioi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> However, it's not worse than before. The old version also has a
>>>>>>>>>>> race
>>>>>>>>>>> condition:
>>>>>>>>>>>
>>>>>>>>>>> refcount is 0
>>>>>>>>>>> thread A decrements
>>>>>>>>>>> thread B increments
>>>>>>>>>>> thread A checks for underflow
>>>>>>>>>>>
>>>>>>>>>>> the decrementing thread will read _refcount==0 at the end so it
>>>>>>>>>>> won't
>>>>>>>>>>> detect the (transient) underflow.
>>>>>>>>>>>
>>>>>>>>>>> I think the failure to detect underflow is fine, since this
>>>>>>>>>>> happens
>>>>>>>>>>> only
>>>>>>>>>>> with concurrent access. The kinds of underflow that we are
>>>>>>>>>>> interested
>>>>>>>>>>> usually can be caught in single-threaded situations.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> - Ioi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 8/23/16 4:24 PM, Coleen Phillimore wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> This doesn't make sense for me and I have to go in gdb to
>>>>>>>>>>>> print out
>>>>>>>>>>>> what -16384 is. It appears that this is trying to detect
>>>>>>>>>>>> that we
>>>>>>>>>>>> went below zero from zero, which is an error, but this isn't
>>>>>>>>>>>> clear at
>>>>>>>>>>>> all.
>>>>>>>>>>>>
>>>>>>>>>>>> It seems that
>>>>>>>>>>>>
>>>>>>>>>>>> if (_refcount >= 0) {
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Should be > 0 and we should assert if this is ever zero
>>>>>>>>>>>> instead,
>>>>>>>>>>>> and
>>>>>>>>>>>> allow anything negative to mean that this count has gone
>>>>>>>>>>>> immortal.
>>>>>>>>>>>>
>>>>>>>>>>>> Kim thought it should use CAS rather than atomic increment and
>>>>>>>>>>>> decrement, but maybe that isn't necessary, especially since
>>>>>>>>>>>> there
>>>>>>>>>>>> isn't a short version of cmpxchg.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>> Coleen
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/23/16 6:01 AM, Ioi Lam wrote:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161280
>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Summary:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The test was loading a lot of JCK classes into the same VM.
>>>>>>>>>>>>> Many of
>>>>>>>>>>>>> the JCK classes refer to "javasoft/sqe/javatest/Status",
>>>>>>>>>>>>> so the
>>>>>>>>>>>>> refcount (a signed short integer) of this Symbol would run up
>>>>>>>>>>>>> and
>>>>>>>>>>>>> past 0x7fff.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The assert was caused by a race condition: the refcount
>>>>>>>>>>>>> started
>>>>>>>>>>>>> with
>>>>>>>>>>>>> a large (16-bit) positive value such as 0x7fff, one thread is
>>>>>>>>>>>>> decrementing and several other threads are incrementing. The
>>>>>>>>>>>>> refcount
>>>>>>>>>>>>> will end up being 0x8000 or slightly higher (limited to the
>>>>>>>>>>>>> number of
>>>>>>>>>>>>> concurrent threads that are running within a small window of
>>>>>>>>>>>>> several
>>>>>>>>>>>>> instructions in the decrementing thread, so most likely it
>>>>>>>>>>>>> will be
>>>>>>>>>>>>> 0x800?).
>>>>>>>>>>>>>
>>>>>>>>>>>>> As a result, the decrementing thread found that the
>>>>>>>>>>>>> refecount is
>>>>>>>>>>>>> negative after the operation, and thought that an
>>>>>>>>>>>>> underflow had
>>>>>>>>>>>>> happened.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The fix is to ignore any value that may appear in the
>>>>>>>>>>>>> [0x8000 -
>>>>>>>>>>>>> 0xbfff] range and do not flag these as underflows (since they
>>>>>>>>>>>>> are
>>>>>>>>>>>>> most likely overflows -- overflows are already handled by
>>>>>>>>>>>>> making the
>>>>>>>>>>>>> Symbol permanent).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>

From ioi.lam at oracle.com Wed Aug 24 12:37:10 2016
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 24 Aug 2016 05:37:10 -0700
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To:
References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> <57BD5AA1.9090204@oracle.com>
Message-ID: <57BD94F6.9060302@oracle.com>

On 8/24/16 5:14 AM, Coleen Phillimore wrote:
>
> I still don't want us to observe a _refcount == 0 - this is always an
> error.
>
> + if (_refcount >= 0) { // not a permanent symbol
>
>
> Can this be changed to something like:
>
> + volatile int ref = _refcount;
> + assert(ref != 0, "underflow");
>
> + if (ref > 0) { // not a permanent symbol
>
>
This will probably mess up product VM, as refcount will stay being zero.
I'd rather have it fall negative and become a permanent symbol.

The current code will make the product VM more resilient even if
underflow happens.

Thanks
- Ioi

> I really like the atomic::add checking return value.
>
> I'm working on a change for later that will eagerly delete symbols
> whose refcounts go to zero.
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Wed Aug 24 12:42:32 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Wed, 24 Aug 2016 08:42:32 -0400
Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol
In-Reply-To: <57BD94F6.9060302@oracle.com>
References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <71152d0d-c794-de2e-ec2b-31fa7f994b66@oracle.com> <57BD5AA1.9090204@oracle.com> <57BD94F6.9060302@oracle.com>
Message-ID:

On 8/24/16 8:37 AM, Ioi Lam wrote:
>
>
> On 8/24/16 5:14 AM, Coleen Phillimore wrote:
>>
>> I still don't want us to observe a _refcount == 0 - this is always an
>> error.
>>
>> + if (_refcount >= 0) { // not a permanent symbol
>>
>>
>> Can this be changed to something like:
>>
>> + volatile int ref = _refcount;
>> + assert(ref != 0, "underflow");
>>
>> + if (ref > 0) { // not a permanent symbol
>>
>>
> This will probably mess up product VM, as refcount will stay being
> zero. I'd rather have it fall negative and become a permanent symbol.
>
> The current code will make the product VM more resilient even if
> underflow happens.

Okay, that's fine.

Thanks,
Coleen

>
> Thanks
> - Ioi
>
>> I really like the atomic::add checking return value.
>>
>> I'm working on a change for later that will eagerly delete symbols
>> whose refcounts go to zero.
>>
>> Thanks,
>> Coleen

From volker.simonis at gmail.com Wed Aug 24 12:56:39 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 24 Aug 2016 14:56:39 +0200
Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure
In-Reply-To: <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com>
References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com>
Message-ID:

Looks good now.

Thanks, Volker On Wed, Aug 24, 2016 at 7:21 AM, David Holmes wrote: > Hi Kim, > > Thanks for looking at this. > > Webrev updated in-place. Comments inline. > > On 24/08/2016 6:25 AM, Kim Barrett wrote: >>> >>> On Aug 23, 2016, at 4:55 AM, David Holmes >>> wrote: >>> >>> Hi Volker, Andrew, >>> >>> On 23/08/2016 12:27 AM, Volker Simonis wrote: >>>> >>>> Hi, >>>> >>>> I don't particularly like the const_casts as well. >>> >>> >>> I would have thought this was exactly the kind of thing const_cast was >>> good for - avoiding the need to define multiple overloads to deal with >>> volatile, non-volatile, const etc. >>> >>>> Why not change pointer_delta to accept pointers to volatiles as well: >>>> >>>> pointer_delta(const volatile void* left, const volatile void* right, >>> >>> >>> I can do that. I also have to make a similar change to align_ptr_down. >>> Now should I also change align_ptr_up for consistency (though I note they >>> are already inconsistent in that one takes void* and one takes const void*) >>> ? >>> >>> Alternative webrev at: >>> >>> http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ >> >> >> >> ------------------------------------------------------------------------------ >> src/share/vm/runtime/atomic.hpp >> 155 assert(sizeof(jbyte) == 1, "assumption"); >> >> STATIC_ASSERT would be better here. > > > Changed. > >> >> ------------------------------------------------------------------------------ >> src/share/vm/utilities/globalDefinitions.hpp >> 524 inline void* align_ptr_down(volatile void* ptr, size_t alignment) { >> 525 return (void*)align_size_down((intptr_t)ptr, (intptr_t)alignment); >> 526 } >> >> I think implicitly (to the caller of align_ptr_down) casting away >> volatile like this is a mistake. I disagree with the rationale for >> this change; stripping off volatile (or const) *should* be annoyingly >> in your face with a const_cast. 
> > > Yep my bad - volatile in, volatile out: > > inline volatile void* align_ptr_down(volatile void* ptr, size_t alignment) { > return (volatile void*)align_size_down((intptr_t)ptr, > (intptr_t)alignment); > } > > This also leads to a change to the static_cast to be "volatile jint*". > >> The addition of volatile to pointer_delta is not the same sort of >> thing. I think that change is good, except I agree with Volker that >> only the one version is needed. > > > Fixed. I hadn't appreciated what Volker was saying about one version. > >> >> ------------------------------------------------------------------------------ >> >> Otherwise looks good to me. >> >> Regarding: >> >> Now should I also change align_ptr_up for consistency (though I note >> they are already inconsistent in that one takes void* and one takes >> const void*) ? >> >> I think there should be two overloads of each of these, one with const >> qualified argument and result, and one without const qualification for >> either. That way the result has the same const-ness as the argument. >> We could double the number of overloads by similarly dealing with >> volatile, but I doubt there are enough relevant callers for that to be >> worthwhile; just use const_cast to deal with volatile at the call >> sites. But this is all a different issue... > > > Agreed - separate issue if when it becomes an issue. > > Thanks, > David > > >> Another option would be to make the argument and result >> const-qualified, and make callers deal with the result, but there are >> probably enough call sites to make the second overload worthwhile. >> >> > From kirill.zhaldybin at oracle.com Wed Aug 24 15:47:57 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Wed, 24 Aug 2016 18:47:57 +0300 Subject: RFR(S): 8164738: Convert AltHashing_test to GTest Message-ID: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> Dear all, Could you please review this fix for 8164738? 
To convert the test I added a new friend class to the AltHashing class so we could access the private member function static juint murmur3_32(const int* data, int len). There are also a few formatting fixes. WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164738/webrev.00/ CR: https://bugs.openjdk.java.net/browse/JDK-8164738 Regards, Kirill From kirill.zhaldybin at oracle.com Wed Aug 24 16:45:32 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Wed, 24 Aug 2016 19:45:32 +0300 Subject: RFR(XS): 8164743: Convert TestAsUtf8 to GTest Message-ID: Dear all, Could you please review this fix for 8164743? The test was converted to GTest. WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164743/webrev.00/ CR: https://bugs.openjdk.java.net/browse/JDK-8164743 Thank you. Regards, Kirill From aph at redhat.com Wed Aug 24 17:05:13 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 24 Aug 2016 18:05:13 +0100 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <34dbe14b-38d3-6755-aa78-9a48371be1da@oracle.com> References: <34dbe14b-38d3-6755-aa78-9a48371be1da@oracle.com> Message-ID: On 24/08/16 09:01, David Holmes wrote: > On 24/08/2016 5:36 PM, Andrew Haley wrote: >> On 22/08/16 08:54, David Holmes wrote: >>> An earlier code review noticed that the default shared implementation of >>> Atomic::cmpxchg(jbyte*) was missing the required post-memory-barrier in >>> case of an initial failure: >> >> Just to satisfy my curiosity, why is the post-memory-barrier in >> case of an initial failure required? Is there some specification >> somewhere that I can refer to? > > In atomic.hpp: > > // All of the atomic operations that imply a read-modify-write action > // guarantee a two-way memory barrier across that operation. > > and then each op has additional descriptions e.g.: > > // Performs atomic compare of *dest and compare_value, and exchanges > // *dest with exchange_value if the comparison succeeded.
Returns prior > // value of *dest. cmpxchg*() provide: > // compare-and-exchange OK; I guess it doesn't much matter from a performance point of view. It's stronger than anything we get from Java and or C++11 intrinsics, which struck me as odd. Andrew. From kim.barrett at oracle.com Wed Aug 24 21:46:32 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 24 Aug 2016 17:46:32 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BD4649.6000804@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> Message-ID: <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> > On Aug 24, 2016, at 3:01 AM, Ioi Lam wrote: > > Hi David, > > Here's an updated version that added Atomic::add(jshort*, jshort) as you suggested. > > To appease the "unused" warnings, I just added (void)new_value. > > http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ > > I am running RBT with "--test hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" to make sure everything works. ------------------------------------------------------------------------------ src/share/vm/runtime/atomic.hpp 211 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); 214 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest)); Left-shift of a signed negative value is undefined behavior. ------------------------------------------------------------------------------ src/share/vm/runtime/atomic.hpp 216 return (jshort)(new_value >> 16); // preserves sign Right-shift of a signed negative value is implementation-defined. It may or may not sign-extend. 
(gcc defines it as sign-extending; I have no idea about other compilers.) ------------------------------------------------------------------------------ src/share/vm/runtime/atomic.hpp 220 (void)add(1, dest); 224 (void)add(-1, dest); I don't think the casts are needed here. ------------------------------------------------------------------------------ From david.holmes at oracle.com Wed Aug 24 21:50:31 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Aug 2016 07:50:31 +1000 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> Message-ID: Hi Kim, On 25/08/2016 7:46 AM, Kim Barrett wrote: >> On Aug 24, 2016, at 3:01 AM, Ioi Lam wrote: >> >> Hi David, >> >> Here's an updated version that added Atomic::add(jshort*, jshort) as you suggested. >> >> To appease the "unused" warnings, I just added (void)new_value. >> >> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ >> >> I am running RBT with "--test hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" to make sure everything works. > > ------------------------------------------------------------------------------ > src/share/vm/runtime/atomic.hpp > 211 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); > 214 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest)); > > Left-shift of a signed negative value is undefined behavior. Okay so how do we fix that? It seems pretty obvious/simple what we want to do. 
Do we just cast to unsigned, shift and cast back? > ------------------------------------------------------------------------------ > src/share/vm/runtime/atomic.hpp > 216 return (jshort)(new_value >> 16); // preserves sign > > Right-shift of a signed negative value is implementation-defined. It > may or may not sign-extend. (gcc defines it as sign-extending; I have > no idea about other compilers.) Ditto. Thanks, David > ------------------------------------------------------------------------------ > src/share/vm/runtime/atomic.hpp > 220 (void)add(1, dest); > 224 (void)add(-1, dest); > > I don't think the casts are needed here. > > ------------------------------------------------------------------------------ > From david.holmes at oracle.com Wed Aug 24 21:51:33 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Aug 2016 07:51:33 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> Message-ID: Thanks Volker! Just waiting for Kim to give the all clear. David On 24/08/2016 10:56 PM, Volker Simonis wrote: > Looks good now. > > Thanks, > Volker > > > On Wed, Aug 24, 2016 at 7:21 AM, David Holmes wrote: >> Hi Kim, >> >> Thanks for looking at this. >> >> Webrev updated in-place. Comments inline. >> >> On 24/08/2016 6:25 AM, Kim Barrett wrote: >>>> >>>> On Aug 23, 2016, at 4:55 AM, David Holmes >>>> wrote: >>>> >>>> Hi Volker, Andrew, >>>> >>>> On 23/08/2016 12:27 AM, Volker Simonis wrote: >>>>> >>>>> Hi, >>>>> >>>>> I don't particularly like the const_casts as well. >>>> >>>> >>>> I would have thought this was exactly the kind of thing const_cast was >>>> good for - avoiding the need to define multiple overloads to deal with >>>> volatile, non-volatile, const etc. 
>>>> >>>>> Why not change pointer_delta to accept pointers to volatiles as well: >>>>> >>>>> pointer_delta(const volatile void* left, const volatile void* right, >>>> >>>> >>>> I can do that. I also have to make a similar change to align_ptr_down. >>>> Now should I also change align_ptr_up for consistency (though I note they >>>> are already inconsistent in that one takes void* and one takes const void*) >>>> ? >>>> >>>> Alternative webrev at: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> src/share/vm/runtime/atomic.hpp >>> 155 assert(sizeof(jbyte) == 1, "assumption"); >>> >>> STATIC_ASSERT would be better here. >> >> >> Changed. >> >>> >>> ------------------------------------------------------------------------------ >>> src/share/vm/utilities/globalDefinitions.hpp >>> 524 inline void* align_ptr_down(volatile void* ptr, size_t alignment) { >>> 525 return (void*)align_size_down((intptr_t)ptr, (intptr_t)alignment); >>> 526 } >>> >>> I think implicitly (to the caller of align_ptr_down) casting away >>> volatile like this is a mistake. I disagree with the rationale for >>> this change; stripping off volatile (or const) *should* be annoyingly >>> in your face with a const_cast. >> >> >> Yep my bad - volatile in, volatile out: >> >> inline volatile void* align_ptr_down(volatile void* ptr, size_t alignment) { >> return (volatile void*)align_size_down((intptr_t)ptr, >> (intptr_t)alignment); >> } >> >> This also leads to a change to the static_cast to be "volatile jint*". >> >>> The addition of volatile to pointer_delta is not the same sort of >>> thing. I think that change is good, except I agree with Volker that >>> only the one version is needed. >> >> >> Fixed. I hadn't appreciated what Volker was saying about one version. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> Otherwise looks good to me. 
>>> >>> Regarding: >>> >>> Now should I also change align_ptr_up for consistency (though I note >>> they are already inconsistent in that one takes void* and one takes >>> const void*) ? >>> >>> I think there should be two overloads of each of these, one with const >>> qualified argument and result, and one without const qualification for >>> either. That way the result has the same const-ness as the argument. >>> We could double the number of overloads by similarly dealing with >>> volatile, but I doubt there are enough relevant callers for that to be >>> worthwhile; just use const_cast to deal with volatile at the call >>> sites. But this is all a different issue... >> >> >> Agreed - separate issue if when it becomes an issue. >> >> Thanks, >> David >> >> >>> Another option would be to make the argument and result >>> const-qualified, and make callers deal with the result, but there are >>> probably enough call sites to make the second overload worthwhile. >>> >>> >> From kim.barrett at oracle.com Wed Aug 24 22:07:36 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 24 Aug 2016 18:07:36 -0400 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> Message-ID: <3D95D28C-F9BF-4C1E-8EA6-F7ED4A1A1185@oracle.com> > On Aug 24, 2016, at 5:51 PM, David Holmes wrote: > > Thanks Volker! > > Just waiting for Kim to give the all clear. Sorry, I missed that you'd done an in-place update of the webrev. I would have left align_ptr_down alone and cast away the volatile in cmpxchg, but have no objection to the approach you've taken. Looks good.
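A minimal sketch of the two options weighed in this exchange: a helper that keeps the volatile qualifier on both argument and result, versus a plain helper with an explicit const_cast at the call site. The names align_down_bits, align_ptr_down_sketch, align_ptr_down_plain, byte_offset_in_word and aligned_word_plain are hypothetical stand-ins and are not taken from the webrev or from globalDefinitions.hpp; the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an integer align-down helper.
// Assumption: alignment is a power of two.
inline std::uintptr_t align_down_bits(std::uintptr_t value, std::size_t alignment) {
  return value & ~(static_cast<std::uintptr_t>(alignment) - 1);
}

// Option 1: volatile in, volatile out. The qualifier is never dropped
// silently, so the caller still holds a pointer to volatile.
inline volatile void* align_ptr_down_sketch(volatile void* ptr, std::size_t alignment) {
  return reinterpret_cast<volatile void*>(
      align_down_bits(reinterpret_cast<std::uintptr_t>(ptr), alignment));
}

// Option 2: the helper only knows about plain pointers.
inline void* align_ptr_down_plain(void* ptr, std::size_t alignment) {
  return reinterpret_cast<void*>(
      align_down_bits(reinterpret_cast<std::uintptr_t>(ptr), alignment));
}

// Uses option 1: offset of a byte inside its enclosing 32-bit word.
inline std::size_t byte_offset_in_word(volatile signed char* dest) {
  volatile void* base = align_ptr_down_sketch(dest, sizeof(std::int32_t));
  return reinterpret_cast<std::uintptr_t>(dest) -
         reinterpret_cast<std::uintptr_t>(base);
}

// Uses option 2: stripping volatile is visible at the call site.
inline void* aligned_word_plain(volatile signed char* dest) {
  return align_ptr_down_plain(const_cast<signed char*>(dest), sizeof(std::int32_t));
}
```

With option 1 the qualifier travels through the helper; with option 2 the const_cast makes the loss of volatile explicit where it happens, which is the trade-off the reviewers disagree about.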
From david.holmes at oracle.com Wed Aug 24 22:19:08 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Aug 2016 08:19:08 +1000 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <3D95D28C-F9BF-4C1E-8EA6-F7ED4A1A1185@oracle.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> <3D95D28C-F9BF-4C1E-8EA6-F7ED4A1A1185@oracle.com> Message-ID: <9e5f616b-819d-fc4f-5652-1f066de72131@oracle.com> On 25/08/2016 8:07 AM, Kim Barrett wrote: >> On Aug 24, 2016, at 5:51 PM, David Holmes wrote: >> >> Thanks Volker! >> >> Just waiting for Kim to give the all clear. > > Sorry, I missed that you'd done an in-place update of the webrev. > > I would have left align_ptr_down alone and cast away the volatile in cmpxchg, But that would have restored a cast that Volker and Andrew objected to. > but have no objection to the approach you've taken. Great! Thanks for the assistance with this. David > Looks good. > From kim.barrett at oracle.com Wed Aug 24 23:17:18 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 24 Aug 2016 19:17:18 -0400 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <9e5f616b-819d-fc4f-5652-1f066de72131@oracle.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> <3D95D28C-F9BF-4C1E-8EA6-F7ED4A1A1185@oracle.com> <9e5f616b-819d-fc4f-5652-1f066de72131@oracle.com> Message-ID: <3BC4857E-A889-4F83-8021-48D7587E30BF@oracle.com> > On Aug 24, 2016, at 6:19 PM, David Holmes wrote: > > On 25/08/2016 8:07 AM, Kim Barrett wrote: >>> On Aug 24, 2016, at 5:51 PM, David Holmes wrote: >>> >>> Thanks Volker!
>>> >>> Just waiting for Kim to give the all clear. >> >> Sorry, I missed that you'd done an in-place update of the webrev. >> >> I would have left align_ptr_down alone and cast away the volatile in cmpxchg, > > But that would have restored a cast that Volker and Andrew objected to. It would. I disagree with the objection. I think that's an appropriate place for a const_cast. >> but have no objection to the approach you've taken. > > Great! Thanks for the assistance with this. > > David > >> Looks good. From kim.barrett at oracle.com Thu Aug 25 00:06:44 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 24 Aug 2016 20:06:44 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> Message-ID: <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> > On Aug 24, 2016, at 5:50 PM, David Holmes wrote: > > Hi Kim, > > On 25/08/2016 7:46 AM, Kim Barrett wrote: >>> On Aug 24, 2016, at 3:01 AM, Ioi Lam wrote: >>> >>> Hi David, >>> >>> Here's an updated version that added Atomic::add(jshort*, jshort) as you suggested. >>> >>> To appease the "unused" warnings, I just added (void)new_value. >>> >>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ >>> >>> I am running RBT with "--test hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" to make sure everything works.
>> >> ------------------------------------------------------------------------------ >> src/share/vm/runtime/atomic.hpp >> 211 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); >> 214 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest)); >> >> Left-shift of a signed negative value is undefined behavior. > > Okay so how do we fix that? It seems pretty obvious/simple what we want to do. Do we just cast to unsigned, shift and cast back? > >> ------------------------------------------------------------------------------ >> src/share/vm/runtime/atomic.hpp >> 216 return (jshort)(new_value >> 16); // preserves sign >> >> Right-shift of a signed negative value is implementation-defined. It >> may or may not sign-extend. (gcc defines it as sign-extending; I have >> no idea about other compilers.) > > Ditto. Unfortunately, I don't have a particularly good answer. This is a rather ugly corner of C/C++. One option might be to change the increment and decrement operations to use unsigned arithmetic with unsigned range checks that correspond to the ranges of interest. This would need Atomic::add for jushort instead. This should dodge all the signed arithmetic issues. To help with this we might extract into a helper the "safe" unsigned to signed conversion support from JAVA_INTEGER_OP. This is clumsy and somewhat obfuscated. Alternatively, we can try other mechanisms for working around the signed arithmetic issues. For left-shift, the simplest solution might be to add JAVA_INTEGER_OP for << to the block of such near the end of globalDefinitions.hpp, e.g. JAVA_INTEGER_OP(<<, java_shift_left, jint, juint) JAVA_INTEGER_OP(<<, java_shift_left, jlong, julong) and update the comments describing these operations, since these don't wrap, they just silently discard overflow. And that isn't a constant expression (until we can use C++11 constexpr (might require C++14 constexpr)), which limits where we can use it. 
That's not entirely nice, since it expands the usage of these operations from "emulate Java operations" to more generally working around the specification of C/C++ arithmetic. Right-shift doesn't have that option. I don't know of a portable and reliably efficient way to do a sign-extending right shift. "Hacker's Delight" provides several formulas for it, but they all take 5-6 instructions. I wasn't able to provoke gcc into recognizing any of them and generating the desired single instruction. Other compilers might do better. A pragmatic answer might be to just assume all the platforms we care about are sign-extending (which is likely what we've been doing all along), and add a little startup test to verify that assumption. If the test trips, then figure out what to do. From ioi.lam at oracle.com Thu Aug 25 00:46:43 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 24 Aug 2016 17:46:43 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> Message-ID: <57BE3FF3.9000408@oracle.com> Hi Kim, Thanks for pointing out the problems with the shift operators. I never knew that! Since I am shifting only by 16, can change the expressions to these? jint(add_value) * 0x10000 jshort(new_value / 0x10000) Does C/C++ preserve signs when multiplying/dividing with a positive constant? 
Thanks - Ioi On 8/24/16 5:06 PM, Kim Barrett wrote: >> On Aug 24, 2016, at 5:50 PM, David Holmes wrote: >> >> Hi Kim, >> >> On 25/08/2016 7:46 AM, Kim Barrett wrote: >>>> On Aug 24, 2016, at 3:01 AM, Ioi Lam wrote: >>>> >>>> Hi David, >>>> >>>> Here's an updated version that added Atomic::add(jshort*, jshort) as you suggested. >>>> >>>> To appease the "unused" warnings, I just added (void)new_value. >>>> >>>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v03/ >>>> >>>> I am running RBT with "--test hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist" to make sure everything works. >>> ------------------------------------------------------------------------------ >>> src/share/vm/runtime/atomic.hpp >>> 211 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); >>> 214 jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest)); >>> >>> Left-shift of a signed negative value is undefined behavior. >> Okay so how do we fix that? It seems pretty obvious/simple what we want to do. Do we just cast to unsigned, shift and cast back? >> >>> ------------------------------------------------------------------------------ >>> src/share/vm/runtime/atomic.hpp >>> 216 return (jshort)(new_value >> 16); // preserves sign >>> >>> Right-shift of a signed negative value is implementation-defined. It >>> may or may not sign-extend. (gcc defines it as sign-extending; I have >>> no idea about other compilers.) >> Ditto. > Unfortunately, I don't have a particularly good answer. This is a > rather ugly corner of C/C++. > > One option might be to change the increment and decrement operations > to use unsigned arithmetic with unsigned range checks that correspond > to the ranges of interest. This would need Atomic::add for jushort > instead. This should dodge all the signed arithmetic issues. To help > with this we might extract into a helper the "safe" unsigned to signed > conversion support from JAVA_INTEGER_OP. 
This is clumsy and somewhat > obfuscated. > > Alternatively, we can try other mechanisms for working around the > signed arithmetic issues. For left-shift, the simplest solution might > be to add JAVA_INTEGER_OP for << to the block of such near the end of > globalDefinitions.hpp, e.g. > > JAVA_INTEGER_OP(<<, java_shift_left, jint, juint) > JAVA_INTEGER_OP(<<, java_shift_left, jlong, julong) > > and update the comments describing these operations, since these don't > wrap, they just silently discard overflow. And that isn't a constant > expression (until we can use C++11 constexpr (might require C++14 > constexpr)), which limits where we can use it. > > That's not entirely nice, since it expands the usage of these > operations from "emulate Java operations" to more generally working > around the specification of C/C++ arithmetic. > > Right-shift doesn't have that option. I don't know of a portable and > reliably efficient way to do a sign-extending right shift. "Hacker's > Delight" provides several formulas for it, but they all take 5-6 > instructions. I wasn't able to provoke gcc into recognizing any of > them and generating the desired single instruction. Other compilers > might do better. > > A pragmatic answer might be to just assume all the platforms we care > about are sign-extending (which is likely what we've been doing all > along), and add a little startup test to verify that assumption. If > the test trips, then figure out what to do. 
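The "little startup test" mentioned in the quoted message could look roughly like the sketch below. It is an illustration only, not code from the webrev; the names right_shift_sign_extends, high_half_by_shift and verify_shift_assumption are hypothetical, and the probe values are chosen here for the example.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when >> on a negative 32-bit value sign-extends,
// i.e. behaves as an arithmetic shift on this compiler and platform.
inline bool right_shift_sign_extends() {
  // volatile keeps the probes from being folded away at compile time,
  // so the check reflects the generated code rather than the constant folder.
  volatile std::int32_t upper_ones = -0x10000;  // bit pattern 0xFFFF0000
  volatile std::int32_t all_ones = -1;          // bit pattern 0xFFFFFFFF
  return (upper_ones >> 16) == -1 && (all_ones >> 16) == -1;
}

// Wrapper that documents the assumption at the point of use:
// extracts the upper 16 bits of a 32-bit word as a signed value.
inline std::int16_t high_half_by_shift(std::int32_t word) {
  return static_cast<std::int16_t>(word >> 16);
}

// Intended to be called once during startup (hypothetical hook).
inline void verify_shift_assumption() {
  assert(right_shift_sign_extends() &&
         "signed right shift is expected to sign-extend on this platform");
}
```

Routing every such shift through one wrapper keeps the assumption in a single place, so a platform where the check trips would need only that wrapper changed.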
> From kim.barrett at oracle.com Thu Aug 25 06:52:43 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 Aug 2016 02:52:43 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BE3FF3.9000408@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> Message-ID: <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> > On Aug 24, 2016, at 8:46 PM, Ioi Lam wrote: > > Hi Kim, > > Thanks for pointing out the problems with the shift operators. I never knew that! > > Since I am shifting only by 16, can change the expressions to these? > > jint(add_value) * 0x10000 Yes. I remembered that technique shortly after hitting send. Multiplying a signed value doesn't work (is UB) in the general case because of overflow, but we know the value ranges here are safe from that. > jshort(new_value / 0x10000) No, because under 2s complement arithmetic, arithmetic right shift of a negative number is not necessarily equivalent to division by the corresponding power of 2. [See, for example, "Arithmetic shifting considered harmful", Guy Steele, ACM SIGPLAN Notices, 11/1977.] Consider the 32bit value with all 1s in the upper 16 bits, and a non-zero value in the lower 16 bits. If division is truncate, which it is defined to be for C99/C++11 (*), that value / 0x10000 == 0, rather than the desired -1. Clear the low 16bits first and then divide, and I think it works for the case at hand, though I haven't proved it. 
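The arithmetic above can be sketched as follows. This is a hedged illustration under the assumption that int is at least 32 bits wide; the function names are hypothetical, and the "clear the low 16 bits" step is expressed here with the remainder operator rather than a bit mask, which is a choice made for this example and not something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Moves a 16-bit addend into the upper half of a 32-bit word by
// multiplication instead of left-shifting a possibly negative value.
// The product always fits: [-32768, 32767] * 65536 stays within int32 range.
inline std::int32_t to_high_half(std::int16_t add_value) {
  return static_cast<std::int32_t>(add_value) * 0x10000;
}

// Plain division truncates toward zero (C++11), so this is NOT the same
// as an arithmetic right shift when the word is negative and the low
// 16 bits are non-zero.
inline std::int32_t naive_high_half(std::int32_t word) {
  return word / 0x10000;
}

// Clears the low 16 bits first (arithmetically), then divides.
inline std::int16_t high_half_by_division(std::int32_t word) {
  std::int32_t low = word % 0x10000;   // sign follows the dividend
  if (low < 0) {
    low += 0x10000;                    // normalise to [0, 65535]
  }
  std::int32_t cleared = word - low;   // multiple of 65536, never overflows
  return static_cast<std::int16_t>(cleared / 0x10000);
}
```

For the value with all ones in the upper half and 1 in the lower half (-65535), naive_high_half yields 0 while high_half_by_division yields -1, which matches the example in the text.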
But pragmatically we're probably better off assuming right shift works as expected, though perhaps in a wrapper to help indicate we've actually thought about the issue. (*) For C89/C++98 the rounding of division involving negative operands is implementation defined, perhaps in part to allow the "optimization" of division by a power of 2 to arithmetic right shift. > Does C/C++ preserve signs when multiplying/dividing with a positive constant? From marcus.larsson at oracle.com Thu Aug 25 07:51:48 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 09:51:48 +0200 Subject: [PING] Re: RFR: 8150894: Unused -Xlog tag sequences are silently ignored. In-Reply-To: <307928f7-55b3-999d-b3e8-ffd5966457a1@oracle.com> References: <5704B0B0.4010404@oracle.com> <570D02CF.7070708@oracle.com> <307928f7-55b3-999d-b3e8-ffd5966457a1@oracle.com> Message-ID: <85b62937-09e4-f5e5-86c4-c02ee487c63c@oracle.com> On 08/23/2016 01:17 PM, Marcus Larsson wrote: > Hi, > > Still looking for a Reviewer for this. (Rebased webrev in-place.) > > Thanks, > Marcus > > > On 04/12/2016 04:14 PM, Marcus Larsson wrote: >> Ping! >> >> On 04/06/2016 08:46 AM, Marcus Larsson wrote: >>> Hi, >>> >>> Please review the following patch to add a warning for when tag >>> selections in -Xlog or VM.log don't match any tag sets used in the VM. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mlarsson/8150894/webrev.00/ >>> >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8150894 >>> >>> Testing: >>> Internal VM tests with RBT >>> >>> Thanks, >>> Marcus >> > From staffan.larsen at oracle.com Thu Aug 25 08:15:38 2016 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 25 Aug 2016 10:15:38 +0200 Subject: [PING] RFR: 8150894: Unused -Xlog tag sequences are silently ignored.
In-Reply-To: <85b62937-09e4-f5e5-86c4-c02ee487c63c@oracle.com> References: <5704B0B0.4010404@oracle.com> <570D02CF.7070708@oracle.com> <307928f7-55b3-999d-b3e8-ffd5966457a1@oracle.com> <85b62937-09e4-f5e5-86c4-c02ee487c63c@oracle.com> Message-ID: <4467DDF3-0FE6-4891-AF50-087944881D98@oracle.com> Looks ok to me. /Staffan > On 25 aug. 2016, at 09:51, Marcus Larsson wrote: > > > On 08/23/2016 01:17 PM, Marcus Larsson wrote: >> Hi, >> >> Still looking for a Reviewer for this. (Rebased webrev in-place.) >> >> Thanks, >> Marcus >> >> >> On 04/12/2016 04:14 PM, Marcus Larsson wrote: >>> Ping! >>> >>> On 04/06/2016 08:46 AM, Marcus Larsson wrote: >>>> Hi, >>>> >>>> Please review the following patch to add a warning for when tag selections in -Xlog or VM.log don't match any tag sets used in the VM. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mlarsson/8150894/webrev.00/ >>>> >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8150894 >>>> >>>> Testing: >>>> Internal VM tests with RBT >>>> >>>> Thanks, >>>> Marcus >>> >> > From marcus.larsson at oracle.com Thu Aug 25 08:18:52 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 10:18:52 +0200 Subject: [PING] RFR: 8150894: Unused -Xlog tag sequences are silently ignored. In-Reply-To: <4467DDF3-0FE6-4891-AF50-087944881D98@oracle.com> References: <5704B0B0.4010404@oracle.com> <570D02CF.7070708@oracle.com> <307928f7-55b3-999d-b3e8-ffd5966457a1@oracle.com> <85b62937-09e4-f5e5-86c4-c02ee487c63c@oracle.com> <4467DDF3-0FE6-4891-AF50-087944881D98@oracle.com> Message-ID: Thanks Staffan! Marcus On 08/25/2016 10:15 AM, Staffan Larsen wrote: > Looks ok to me. > > /Staffan > >> On 25 aug. 2016, at 09:51, Marcus Larsson wrote: >> >> >> On 08/23/2016 01:17 PM, Marcus Larsson wrote: >>> Hi, >>> >>> Still looking for a Reviewer for this. (Rebased webrev in-place.) >>> >>> Thanks, >>> Marcus >>> >>> >>> On 04/12/2016 04:14 PM, Marcus Larsson wrote: >>>> Ping! 
>>>> >>>> On 04/06/2016 08:46 AM, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> Please review the following patch to add a warning for when tag selections in -Xlog or VM.log don't match any tag sets used in the VM. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mlarsson/8150894/webrev.00/ >>>>> >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8150894 >>>>> >>>>> Testing: >>>>> Internal VM tests with RBT >>>>> >>>>> Thanks, >>>>> Marcus From ioi.lam at oracle.com Thu Aug 25 08:26:19 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 25 Aug 2016 01:26:19 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> Message-ID: <57BEABAB.1030503@oracle.com> On 8/24/16 11:52 PM, Kim Barrett wrote: >> On Aug 24, 2016, at 8:46 PM, Ioi Lam wrote: >> >> Hi Kim, >> >> Thanks for pointing out the problems with the shift operators. I never knew that! >> >> Since I am shifting only by 16, can change the expressions to these? >> >> jint(add_value) * 0x10000 > Yes. I remembered that technique shortly after hitting send. > Multiplying a signed value doesn't work (is UB) in the general case > because of overflow, but we know the value ranges here are safe from > that. > >> jshort(new_value / 0x10000) > No, because under 2s complement arithmetic, arithmetic right shift of > a negative number is not necessarily equivalent to division by the > corresponding power of 2. 
[See, for example, "Arithmetic shifting > considered harmful", Guy Steele, ACM SIGPLAN Notices, 11/1977.] > > Consider the 32bit value with all 1s in the upper 16 bits, and a > non-zero value in the lower 16 bits. If division is truncate, which > it is defined to be for C99/C++11 (*), that value / 0x10000 == 0, > rather than the desired -1. Clear the low 16bits first and then > divide, and I think it works for the case at hand, though I haven't > proved it. But pragmatically we're probably better off assuming > right shift works as expected, though perhaps in a wrapper to > help indicate we've actually thought about the issue. > > (*) For C89/C++98 the rounding of division involving negative operands > is implementation defined, perhaps in part to allow the "optimization" > of division by a power of 2 to arithmetic right shift. > >> Does C/C++ preserve signs when multiplying/dividing with a positive constant? Hi Kim, I looked for the use of >> in our source code: globalDefinitions.hpp: inline jint high(jlong value) { return jint(value >> 32); } sharedRuntimeTrans.cpp: static double __ieee754_log(double x) { double hfsq,f,s,z,R,w,t1,t2,dk; int k,hx,i,j; ... k += (hx>>20)-1023; So maybe we already assume that >> "does the right thing" for us? ------------------------------- Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. 
globalDefinitions.hpp has these: inline int extract_low_short_from_int(jint x) { return x & 0xffff; } inline int extract_high_short_from_int(jint x) { return (x >> 16) & 0xffff; } inline int build_int_from_shorts( jushort low, jushort high ) { return ((int)((unsigned int)high << 16) | (unsigned int)low); } I am thinking of adding: inline int extract_signed_high_short_from_int(jint x) { if (x >= 0) { return (x >> 16) & 0xffff; } else { return int(((unsigned int)x >> 16) | 0xffff0000); } } inline int build_int_from_shorts(jshort low, jshort high) { return build_int_from_shorts(jushort(low) & 0xffff, jushort(high) & 0xffff); } What do you think? - Ioi From schulzs at Mathematik.Uni-Marburg.de Thu Aug 25 08:36:09 2016 From: schulzs at Mathematik.Uni-Marburg.de (Stefan Schulz) Date: Thu, 25 Aug 2016 10:36:09 +0200 Subject: Store and provide information about boolean expressions Message-ID: <84A5C32D-9FEB-416E-8F49-F5B9643AB691@informatik.uni-marburg.de> Hello, currently I'm working on my Master's Thesis and I'm kind of lost. I'm new to OpenJDK development so please bear with me. The idea is to store detailed information about evaluated boolean expressions and provide it through JDI. Consider this example of a typical newcomer error: String a1 = "a"; String a2 = "a"; if (a1 == a2) { //Do something } I want to store the following information about the expression: - Value the expression evaluated to (false) - Kind of evaluation used (by object reference) - Compared values (object references) This is my setup: - Ubuntu 16.04.1 - x86_64 architecture - OpenJDK 9 with Jigsaw bootstrapped by OpenJDK 8 from the Ubuntu software repos - Netbeans 8.1 - JVM is running in interpreter mode using the -Xint flag for simplicity's sake My first approach was to look for occurrences of the bytecodes if_acmpne and if_acmpeq in the code, and just print the results when the instruction is resolved. I'm having a hard time finding the exact point where they are interpreted. 
Since the example application initializes the String values right before they are compared, I know that the bytecode invokespecial (where the String constructor is called) needs to be run twice beforehand. I've been able to track those invocations down to calls of hotspot/src/share/vm/runtime/javaCalls::call_special, but I don't understand how the comparison is executed afterwards. I've attached the example code I'm analyzing. Could you point me in a direction or explain to me how this is implemented? Best regards, Stefan From marcus.larsson at oracle.com Thu Aug 25 08:50:19 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 10:50:19 +0200 Subject: RFR(S): 8150823: UL disables log outputs incorrectly Message-ID: Hi, Please review the following patch to fix a bug in UL where -Xlog:disable would not disable all logging if there are multiple LogFileOutputs configured. The problem is that disable_logging() iterates the list of log outputs from start to end, removing outputs as it goes and thus modifying the list it iterates over. The fix is to let it pop outputs off the end of the list instead. Webrev: http://cr.openjdk.java.net/~mlarsson/8150823/webrev.00/ Issue: https://bugs.openjdk.java.net/browse/JDK-8150823 Testing: Modified test through JPRT Thanks, Marcus From robbin.ehn at oracle.com Thu Aug 25 09:12:28 2016 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Aug 2016 11:12:28 +0200 Subject: RFR(S): 8150823: UL disables log outputs incorrectly In-Reply-To: References: Message-ID: Looks good, thanks! /Robbin On 08/25/2016 10:50 AM, Marcus Larsson wrote: > Hi, > > Please review the following patch to fix a bug in UL where -Xlog:disable > would not disable all logging if there are multiple LogFileOutputs > configured. The problem is that disable_logging() iterates the list of > log outputs from start to end, removing outputs as it goes and thus > modifying the list it iterates over. 
The fix is to let it pop outputs > off the end of the list instead. > > Webrev: > http://cr.openjdk.java.net/~mlarsson/8150823/webrev.00/ > > Issue: > https://bugs.openjdk.java.net/browse/JDK-8150823 > > Testing: > Modified test through JPRT > > Thanks, > Marcus From marcus.larsson at oracle.com Thu Aug 25 09:14:47 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 11:14:47 +0200 Subject: RFR(S): 8150823: UL disables log outputs incorrectly In-Reply-To: References: Message-ID: <7f9b4162-00a0-0b02-0fa6-ac3a76f5240a@oracle.com> Thanks Robbin! Marcus On 08/25/2016 11:12 AM, Robbin Ehn wrote: > Looks good, thanks! > > /Robbin > > On 08/25/2016 10:50 AM, Marcus Larsson wrote: >> Hi, >> >> Please review the following patch to fix a bug in UL where -Xlog:disable >> would not disable all logging if there are multiple LogFileOutputs >> configured. The problem is that disable_logging() iterates the list of >> log outputs from start to end, removing outputs as it goes and thus >> modifying the list it iterates over. The fix is to let it pop outputs >> off the end of the list instead. >> >> Webrev: >> http://cr.openjdk.java.net/~mlarsson/8150823/webrev.00/ >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8150823 >> >> Testing: >> Modified test through JPRT >> >> Thanks, >> Marcus From staffan.larsen at oracle.com Thu Aug 25 09:17:55 2016 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 25 Aug 2016 11:17:55 +0200 Subject: RFR(S): 8150823: UL disables log outputs incorrectly In-Reply-To: References: Message-ID: Looks good! Thanks, /Staffan > On 25 aug. 2016, at 10:50, Marcus Larsson wrote: > > Hi, > > Please review the following patch to fix a bug in UL where -Xlog:disable would not disable all logging if there are multiple LogFileOutputs configured. The problem is that disable_logging() iterates the list of log outputs from start to end, removing outputs as it goes and thus modifying the list it iterates over. 
The fix is to let it pop outputs off the end of the list instead. > > Webrev: > http://cr.openjdk.java.net/~mlarsson/8150823/webrev.00/ > > Issue: > https://bugs.openjdk.java.net/browse/JDK-8150823 > > Testing: > Modified test through JPRT > > Thanks, > Marcus From marcus.larsson at oracle.com Thu Aug 25 09:20:08 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 11:20:08 +0200 Subject: RFR(S): 8150823: UL disables log outputs incorrectly In-Reply-To: References: Message-ID: <468d16a0-9012-9b2a-6526-583271ac8844@oracle.com> Thanks Staffan! Marcus On 08/25/2016 11:17 AM, Staffan Larsen wrote: > Looks good! > > Thanks, > /Staffan > >> On 25 aug. 2016, at 10:50, Marcus Larsson wrote: >> >> Hi, >> >> Please review the following patch to fix a bug in UL where -Xlog:disable would not disable all logging if there are multiple LogFileOutputs configured. The problem is that disable_logging() iterates the list of log outputs from start to end, removing outputs as it goes and thus modifying the list it iterates over. The fix is to let it pop outputs off the end of the list instead. >> >> Webrev: >> http://cr.openjdk.java.net/~mlarsson/8150823/webrev.00/ >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8150823 >> >> Testing: >> Modified test through JPRT >> >> Thanks, >> Marcus From marcus.larsson at oracle.com Thu Aug 25 09:31:30 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 25 Aug 2016 11:31:30 +0200 Subject: RFR: 8157948: UL allows same log file with multiple file= Message-ID: Hi, Please review the following patch to fix the issue where you could have the same file added twice as different log outputs in UL if it had the "file=" prefix or if it was quoted. Log output names are now normalized during log argument parsing to ensure they are always normalized when finding existing or adding new outputs. 
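A minimal sketch of the kind of name normalization described above, written as standalone C++ rather than HotSpot code. The function name `normalize_output_name`, the choice of `file=<name>` as the canonical spelling, and the special-casing of `stdout`/`stderr` are assumptions for illustration only; the actual change is the one in the webrev.

```cpp
#include <string>

// Hypothetical helper (not the HotSpot implementation): map the different
// spellings of one log file onto a single canonical key, so that lookups
// for an existing output and registrations of a new one agree.
inline std::string normalize_output_name(const std::string& raw) {
  static const std::string prefix = "file=";
  std::string name = raw;

  // Assumption: the standard streams are not file outputs; leave them alone.
  if (name == "stdout" || name == "stderr") {
    return name;
  }
  // Drop an explicit "file=" prefix if one was given.
  if (name.compare(0, prefix.size(), prefix) == 0) {
    name.erase(0, prefix.size());
  }
  // Drop one pair of surrounding double quotes, if present.
  if (name.size() >= 2 && name.front() == '"' && name.back() == '"') {
    name = name.substr(1, name.size() - 2);
  }
  // Assumed canonical form: always carry the prefix, never the quotes.
  return prefix + name;
}
```

With a helper like this, `hs.log`, `"hs.log"` and `file=hs.log` all produce the same key, so a search for an existing output finds it instead of a second output being added for the same file.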
Webrev: http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ Issue: https://bugs.openjdk.java.net/browse/JDK-8157948 Testing: New unit test through JPRT Thanks, Marcus From david.simms at oracle.com Thu Aug 25 10:05:22 2016 From: david.simms at oracle.com (David Simms) Date: Thu, 25 Aug 2016 12:05:22 +0200 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: References: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> Message-ID: <7b2fe986-fca9-dbd1-d20e-17934ac1bc08@oracle.com> Updated the webrev here: http://cr.openjdk.java.net/~dsimms/8164086/webrev1/ core-libs & Kumar: java launcher: are you okay with the CHECK_EXCEPTION_PRINT macro, or would you rather it was silent (i.e. CHECK_EXCEPTION_RETURN) ? In-line... On 23/08/16 14:16, David Holmes wrote: > Hi David > > On 23/08/2016 8:24 PM, David Simms wrote: >> >> Reply in-line... >> >> On 19/08/16 14:29, David Holmes wrote: >>> Hi David, >>> >>> >>> The changes in the native wrapper seem okay though I'm not an expert >>> on the machine specific encodings. >>> >>> I'm a little surprised there are not more things that need changing >>> though. Does the JIT use those wrappers too? >> >> Yeah they do, I double checked Nils from compiler group. I also tested >> with -Xcomp, test failed without sharedRuntime fix. The test execution >> time was over 10 seconds, so I removed it from the jtreg test itself >> (hard-coded ProcessTools.executeTestJVM()) since it is part of >> "hotspot_fast_runtime". >> >>> Can we transition from Java to VM to native and then back - and if so >>> might we need to clear the pending exception check? (I'm not sure if >>> from in the VM a native call could actually be a JNI call, or will >>> only be a direct native call?). >> >> At first I thought JavaCallWrapper needs it, following all the places we >> manipulate the thread's active handle block (besides manual push/pop). 
>> But then call helper just ends up calling the native wrapper, which >> takes care of it. Not a direct native call. So I left it, as-is. > > That's not the case I was thinking of. We have ThreadToNativeFromVM > and then we do native stuff - if any of that were JNI-based (perhaps > it is not) then we would enable the check but not disable it again > when returning from VM to Java. > Got you now: Java->VM->Native i.e. VM code using JNI may miss an exception check. So I check the call hierarchy from "ThreadToNativeFromVM" and found whitebox.cpp had a few spots where checks were missing, added them in now. There's an extra comment stating ThreadToNativeFromVM is expected to be "well behaved" (i.e. check for exceptions), which it is with the whitebox.cpp fixes, so we don't require any extra code or overhead in VM->Java transitions. As far as maintaining "well behaved" JNI code, we do static code checking with "Parfait" as part of testing, and there are a few other related bugs that already exist to address these issues. >>> >>> Did you intend to leave in the changes to >>> jdk/src/java.base/share/native/libjli/java.c? It looks like debug/test >>> code to me. >> >> The launcher produces warnings (Java method invokes) that break the >> jtreg test, so yeah, thought it was best to check and print them. Some >> of the existing code checks and silently returns, I followed the same >> pattern where that pattern was in place. > > This needs to be looked at closer then and reviewed by the launcher > folk (ie Kumar). CC:ed core-libs & Kumar. Thanks for pointing that out. 
> >>> >>> The test I'm finding a bit hard to follow but don't you need to check >>> for pending exceptions here: >>> >>> 29 static jmethodID get_method_id(JNIEnv *env, jclass clz, jstring >>> jname, jstring jsig) { >>> 30 jmethodID mid; >>> 31 const char *name, *sig; >>> 32 name = (*env)->GetStringUTFChars(env, jname, NULL); >>> 33 sig = (*env)->GetStringUTFChars(env, jsig, NULL); >>> 34 mid = (*env)->GetMethodID(env, clz, name, sig); >>> >>> to avoid triggering the warning? >>> >> Those methods don't require an explicit check since there return values >> denote an error condition. >> >> Whilst Java invoke return values are user defined, so they do need >> it >> https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/design.html#asynchronous_exceptions). >> >> Technically array stores need to check for AIOOBE, but given most >> code handles index/bounds checks, it seemed way too pedantic >> (commented in jniCheck.cpp:176). > > Not following. GetStringUTFChars can post OOME so we would enable the > check-flag if that happens on the first call above, then the second > call would be made with the exception pending and trigger the warning. So as we mentioned off-list, yes this test code should also follow spec, updated. Thanks for looking at this, David /David Simms From ivankrylov.java at gmail.com Thu Aug 25 10:56:21 2016 From: ivankrylov.java at gmail.com (Ivan Krylov) Date: Thu, 25 Aug 2016 13:56:21 +0300 Subject: Store and provide information about boolean expressions In-Reply-To: <84A5C32D-9FEB-416E-8F49-F5B9643AB691@informatik.uni-marburg.de> References: <84A5C32D-9FEB-416E-8F49-F5B9643AB691@informatik.uni-marburg.de> Message-ID: Hi Stefan, First, regarding the example: what do you want to compare, strings or references? Here it is sort of special, since this string literal is in a constant pool and both references are pointing to it. I guess you might want to start with comparing either integers or references. 
String comparison in Java 9 got much more complicated with compact strings, etc. So take one step at a time: if you want to store the result of a boolean evaluation, start with primitives or references. Second, the template interpreter is not an easy thing to trace. Each bytecode like if_acmpeq is a portion of a large template code blob. If you are comfortable with reading x86 assembly, use -XX:+PrintInterpreter to learn how the interpreter does the comparisons. https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly is a good reference. Hacking on the template interpreter is even more... fun because it mixes the java and the native stacks in a frame. But the next thing for you would be to find the hotspot code that contains templates and uses that code to generate the actual interpreter. You can look up some videos that talk about the Runtime mechanics, including the template interpreter, and how to monitor/debug hotspot. Look for videos from Volker Simonis or Chris Newland. If by chance you are at JavaZone in two weeks - I will have a couple of slides on the template interpreter in my talk. hth, Ivan On 25/08/16 11:36, Stefan Schulz wrote: > Hello, > > currently I'm working on my Master's Thesis and I'm kind of lost. > I'm new to OpenJDK development so please bear with me. > > The idea is to store detailed information about evaluated boolean expressions and provide it through JDI. 
> Consider this example of a typical newcomer error: > > String a1 = "a"; > String a2 = "a"; > > if (a1 == a2) { > //Do something > } > > I want to store the following information about the expression: > - Value the expression evaluated to (false) > - Kind of evaluation used (by object reference) > - Compared values (object references) > > This is my setup: > - Ubuntu 16.04.1 > - x86_64 architecture > - OpenJDK 9 with Jigsaw bootstrapped by OpenJDK 8 from the Ubuntu software repos > - Netbeans 8.1 > - JVM is running in interpreter mode using the -Xint flag for simplicity's sake > > My first approach was to look for occurrences of the bytecodes if_acmpne and if_acmpeq in the code, > and just print the results when the instruction is resolved. I'm having a hard time finding the exact point > where they are interpreted. Since the example application initializes the String values right before they are compared, > I know that the bytecode invokespecial (where the String constructor is called) needs to be run twice beforehand. > > I've been able to track those invocations down to calls of hotspot/src/share/vm/runtime/javaCalls::call_special, > but I don't understand how the comparison is executed afterwards. I've attached the example code I'm analyzing. > > Could you point me in a direction or explain to me how this is implemented? > > Best regards, > Stefan > > > From lois.foltan at oracle.com Thu Aug 25 12:42:52 2016 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 25 Aug 2016 08:42:52 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> Message-ID: <57BEE7CC.1070801@oracle.com> Hi Rachel, Looks good. 
Only a stylistic comment: - src/share/vm/classfile/classFileParser.cpp Consider changing the new relax_format_check_for() method to only take one parameter, "ClassLoaderData loader_data" and change the setting of the local variable "trusted" to: bool trusted = (loader_data->is_the_null_class_loader_data() || SystemDictionary::is_platform_class_loader(loader_data->class_loader())); Thanks, Lois On 8/16/2016 4:21 PM, Rachel Protacio wrote: > Hi, > > Bug summary: fuzzing a class file so that the class name "SomeClass" > is instead "LSomeClass;" passed unnoticed through the VM because it > was not format checked by default and the L; were stripped off before > lookup. > > This fix makes sure that all class names loaded by the app class > loader are format checked by default. The Verifier::relax_verify_for() > function that was previously used for both format checking (setting > _relax_verify) and reflection (as an access check) has been renamed to > relax_access_for() specifically for its use in reflection.cpp. A > relax_format_check_for() function has been added to > classFileParser.cpp to address the format checking, only "relaxing" > the check if loaded by the boot loader or platform class loader. > > This fix adds a jtreg test, and the change passes JCK vm tests and WLS > tests, in addition to JPRT and RBT hotspot_all and non-colo tests. A > compatibility request has been approved for this change. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 > Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ > > Thanks! 
> Rachel From rachel.protacio at oracle.com Thu Aug 25 13:29:17 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Thu, 25 Aug 2016 09:29:17 -0400 Subject: RFR: 8148854: Class names "SomeClass" and "LSomeClass;" treated by JVM as an equivalent In-Reply-To: <57BEE7CC.1070801@oracle.com> References: <76d9e012-7464-69e1-34a9-e54a3acecf77@oracle.com> <57BEE7CC.1070801@oracle.com> Message-ID: <969482c6-7f27-644d-2037-45d288449421@oracle.com> Thanks for the review, Lois! I'll make that alteration and check it in. Rachel On 8/25/2016 8:42 AM, Lois Foltan wrote: > Hi Rachel, > > Looks good. Only a stylistic comment: > > - src/share/vm/classfile/classFileParser.cpp > Consider changing the new relax_format_check_for() method to only take > one parameter, "ClassLoaderData loader_data" and change the setting of > the local variable "trusted" to: > > bool trusted = (loader_data->is_the_null_class_loader_data() || > SystemDictionary::is_platform_class_loader(loader_data->class_loader())); > > Thanks, > Lois > > On 8/16/2016 4:21 PM, Rachel Protacio wrote: >> Hi, >> >> Bug summary: fuzzing a class file so that the class name "SomeClass" >> is instead "LSomeClass;" passed unnoticed through the VM because it >> was not format checked by default and the L; were stripped off before >> lookup. >> >> This fix makes sure that all class names loaded by the app class >> loader are format checked by default. The >> Verifier::relax_verify_for() function that was previously used for >> both format checking (setting _relax_verify) and reflection (as an >> access check) has been renamed to relax_access_for() specifically for >> its use in reflection.cpp. A relax_format_check_for() function has >> been added to classFileParser.cpp to address the format checking, >> only "relaxing" the check if loaded by the boot loader or platform >> class loader. 
>> >> This fix adds a jtreg test, and the change passes JCK vm tests and >> WLS tests, in addition to JPRT and RBT hotspot_all and non-colo >> tests. A compatibility request has been approved for this change. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8148854 >> Open webrev: http://cr.openjdk.java.net/~rprotacio/8148854.00/ >> >> Thanks! >> Rachel > From kim.barrett at oracle.com Thu Aug 25 17:37:06 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 Aug 2016 13:37:06 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57BEABAB.1030503@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> Message-ID: > On Aug 25, 2016, at 4:26 AM, Ioi Lam wrote: > > On 8/24/16 11:52 PM, Kim Barrett wrote: >>> On Aug 24, 2016, at 8:46 PM, Ioi Lam wrote: >>> >>> Hi Kim, >>> >>> Thanks for pointing out the problems with the shift operators. I never knew that! >>> >>> Since I am shifting only by 16, can change the expressions to these? >>> >>> jint(add_value) * 0x10000 >> Yes. I remembered that technique shortly after hitting send. >> Multiplying a signed value doesn't work (is UB) in the general case >> because of overflow, but we know the value ranges here are safe from >> that. >> >>> jshort(new_value / 0x10000) >> No, because under 2s complement arithmetic, arithmetic right shift of >> a negative number is not necessarily equivalent to division by the >> corresponding power of 2. 
[See, for example, "Arithmetic shifting >> considered harmful", Guy Steele, ACM SIGPLAN Notices, 11/1977.] >> >> Consider the 32bit value with all 1s in the upper 16 bits, and a >> non-zero value in the lower 16 bits. If division is truncate, which >> it is defined to be for C99/C++11 (*), that value / 0x10000 == 0, >> rather than the desired -1. Clear the low 16bits first and then >> divide, and I think it works for the case at hand, though I haven't >> proved it. But pragmatically we're probably better off assuming >> right shift works as expected, though perhaps in a wrapper to >> help indicate we've actually thought about the issue. >> >> (*) For C89/C++98 the rounding of division involving negative operands >> is implementation defined, perhaps in part to allow the "optimization" >> of division by a power of 2 to arithmetic right shift. >> >>> Does C/C++ preserve signs when multiplying/dividing with a positive constant? > > Hi Kim, > > I looked for the use of >> in our source code: > > globalDefinitions.hpp: > inline jint high(jlong value) { return jint(value >> 32); } > > sharedRuntimeTrans.cpp: > static double __ieee754_log(double x) { > double hfsq,f,s,z,R,w,t1,t2,dk; > int k,hx,i,j; > ... > k += (hx>>20)-1023; > > So maybe we already assume that >> "does the right thing" for us? Yes, that's what I expected to see. > ------------------------------- > > Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. globalDefinitions.hpp has these: The suggestion of using a wrapper was a "perhaps", and not really intended for you to deal with while addressing the problem at hand. Sorry I was confusing about that. If we do something along that line (as a separate project), I suggest we keep with the names we've already got for similar operations, e.g. use java_shift_{left,right} to be consistent with java_add and friends. 
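The difference between truncating division and arithmetic right shift discussed in this thread can be sketched in standalone C++. This is only an illustration: the helper names below are hypothetical and are not part of globalDefinitions.hpp, and `int32_t`/`int16_t` stand in for `jint`/`jshort`.

```cpp
#include <cstdint>

// Truncating division (C99/C++11) rounds toward zero, so a value with all
// ones in the upper 16 bits and a non-zero lower half yields 0, not -1.
inline int32_t high_by_division(int32_t x) {
  return x / 0x10000;
}

// Clearing the low 16 bits first makes the division exact, which gives the
// same result an arithmetic right shift by 16 would.
inline int32_t high_by_masked_division(int32_t x) {
  // The conversion to unsigned is well defined (modulo 2^32).
  int32_t low = static_cast<int32_t>(static_cast<uint32_t>(x) & 0xffffu);
  return (x - low) / 0x10000;  // x - low is a multiple of 0x10000
}

// Hypothetical helper: extract the upper 16 bits as a signed value without
// ever right-shifting a negative number.
inline int16_t signed_high_short(int32_t x) {
  // Shift in the unsigned domain, where >> is fully defined.
  uint32_t high = static_cast<uint32_t>(x) >> 16;
  int32_t value = static_cast<int32_t>(high);  // now in 0 .. 65535
  // Reinterpret the 16-bit pattern as two's complement by hand.
  if (value >= 0x8000) {
    value -= 0x10000;
  }
  return static_cast<int16_t>(value);
}
```

For the bit pattern 0xFFFF0001 (the value -65535), plain division gives 0, while the masked division and the unsigned-shift helper both give -1.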
> > inline int extract_low_short_from_int(jint x) { > return x & 0xffff; > } > inline int extract_high_short_from_int(jint x) { > return (x >> 16) & 0xffff; > } > inline int build_int_from_shorts( jushort low, jushort high ) { > return ((int)((unsigned int)high << 16) | (unsigned int)low); > } > > I am thinking of adding: > > inline int extract_signed_high_short_from_int(jint x) { > if (x >= 0) { > return (x >> 16) & 0xffff; > } else { > return int(((unsigned int)x >> 16) | 0xffff0000); > } > } > > inline int build_int_from_shorts(jshort low, jshort high) { > return build_int_from_shorts(jushort(low) & 0xffff, jushort(high) & 0xffff); > } > > What do you think? > - Ioi From cnewland at chrisnewland.com Thu Aug 25 20:29:03 2016 From: cnewland at chrisnewland.com (Chris Newland) Date: Thu, 25 Aug 2016 21:29:03 +0100 Subject: Store and provide information about boolean expressions In-Reply-To: <84A5C32D-9FEB-416E-8F49-F5B9643AB691@informatik.uni-marburg.de> References: <84A5C32D-9FEB-416E-8F49-F5B9643AB691@informatik.uni-marburg.de> Message-ID: <9fdfe92895e8ec5d0a101ea90d322a2c.squirrel@excalibur.xssl.net> Hi Stefan, Sounds like an interesting project! Is there a requirement to track this using HotSpot modifications? You might be able to get what you need with bytecode instrumentation or by modifying the javac compiler. fyi: You might find the branch taken information output by the HotSpot C2 compiler useful while you're getting started. Use the -XX:+UnlockDiagnosticVMOptions and -XX:+LogCompilation switches and load the hotspot log file into JITWatch (https://github.com/AdoptOpenJDK/jitwatch) to get a visualisation like this: https://www.chrisnewland.com/images/jitwatch/branchtaken.png Cheers, Chris On Thu, August 25, 2016 09:36, Stefan Schulz wrote: > Hello, > > > currently I'm working on my Master's Thesis and I'm kind of lost. I'm new > to OpenJDK development so please bear with me. 
> > The idea is to store detailed information about evaluated boolean > expressions and provide it through JDI. Consider this example of a typical > newcomer error: > > String a1 = "a"; > String a2 = "a"; > > > if (a1 == a2) { //Do something > } > > > I want to store the following information about the expression: > - Value the expression evaluated to (false) > - Kind of evaluation used (by object reference) > - Compared values (object references) > > > This is my setup: > - Ubuntu 16.04.1 > - x86_64 architecture > - OpenJDK 9 with Jigsaw bootstrapped by OpenJDK 8 from the Ubuntu software > repos - Netbeans 8.1 > - JVM is running in interpreter mode using the -Xint flag for simplicity's > sake > > My first approach was to look for occurrences of the bytecodes if_acmpne > and if_acmpeq in the code, and just print the results when the instruction > is resolved. I'm having a hard time finding the exact point where they are > interpreted. Since the example application initializes the String values > right before they are compared, I know that the bytecode invokespecial > (where the String constructor is called) needs to be run twice > beforehand. > > I've been able to track those invocations down to calls of > hotspot/src/share/vm/runtime/javaCalls::call_special, > but I don't understand how the comparison is executed afterwards. I've > attached the example code I'm analyzing. > > Could you point me in a direction or explain to me how this is > implemented? 
> > Best regards, > Stefan > > > > > From david.holmes at oracle.com Fri Aug 26 00:27:31 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Aug 2016 10:27:31 +1000 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: <7b2fe986-fca9-dbd1-d20e-17934ac1bc08@oracle.com> References: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> <7b2fe986-fca9-dbd1-d20e-17934ac1bc08@oracle.com> Message-ID: <58f4a6da-b3d0-a1c7-9086-a34883e6b590@oracle.com> Hi David, I'm missing some pieces of this puzzle I'm afraid. On 25/08/2016 8:05 PM, David Simms wrote: > > Updated the webrev here: > http://cr.openjdk.java.net/~dsimms/8164086/webrev1/ hotspot/src/share/vm/prims/whitebox.cpp First I'm not sure that Whitebox isn't a special case here that could be handled in the WB_START/END macros - see below. More generally you state below that the transition from native back to the VM doesn't have to do anything with the pending_exception_check flag because well behaved native code in that context will explicitly check for exceptions, and so the pending-exception-check will already be disabled before returning to Java. First, if that is the case then we should assert that it is so in the native->VM return transition. Second though, it doesn't seem to be the case in Whitebox because the CHECK_JNI_EXCEPTION_ macro simply calls HAS_PENDING_EXCEPTION and so won't touch the pending-exception-check flag. ?? It was a good pick up that some whitebox code was using values that might be NULL because an exception had occurred. There are a couple of changes that are unnecessary though: 1235 result = env->NewObjectArray(5, clazz, NULL); 1236 CHECK_JNI_EXCEPTION_(env, NULL); 1237 if (result == NULL) { 1238 return result; 1239 } (and similarly at 1322) result will be NULL iff there is a pending exception; and vice-versa. So the existing check for NULL suffices for correctness. 
If you want to check exceptions for the side-effect of clearing the pending-exception-check flag then lines 1237-1239 can be deleted. However I would suggest that if you truly do want to clear the pending-exception-check flag then the place to do it is the WB_END macro. That allows allows exception checks at the end of methods, eg: 1261 env->SetObjectArrayElement(result, 4, entry_point); 1262 CHECK_JNI_EXCEPTION_(env, NULL); 1263 1264 return result; to be elided. --- hotspot/src/share/vm/runtime/thread.hpp ! // which function name. Returning to a Java frame should implicitly clear the ! // need for, this is done for Native->Java transitions. Seems to be some text missing after "need for". --- For the tests we no longer use bug numbers as part of the test names. Looks like some recent tests slipped by unfortunately. :( You should be able to get rid of the: * @modules java.base/jdk.internal.misc with Christian's just pushed changes to ProcessTools to isolate the Unsafe dependency. > core-libs & Kumar: java launcher: are you okay with the > CHECK_EXCEPTION_PRINT macro, or would you rather it was silent (i.e. > CHECK_EXCEPTION_RETURN) ? I'm not seeing the point of this logic. Any exceptions that remain pending when the main thread detaches from the VM will be reported by the uncaught-exception handling logic. The checks you put in are in most cases immediately before a return so there is no need to check for a pending exception and do an earlier return. And in one case you would bypass tracing logic by doing an early return. I had assumed this was just some debugging code you had left in by mistake. Thanks, David H. ------- > In-line... > > > On 23/08/16 14:16, David Holmes wrote: >> Hi David >> >> On 23/08/2016 8:24 PM, David Simms wrote: >>> >>> Reply in-line... >>> >>> On 19/08/16 14:29, David Holmes wrote: >>>> Hi David, >>>> >>>> >>>> The changes in the native wrapper seem okay though I'm not an expert >>>> on the machine specific encodings. 
>>>> >>>> I'm a little surprised there are not more things that need changing >>>> though. Does the JIT use those wrappers too? >>> >>> Yeah they do, I double checked Nils from compiler group. I also tested >>> with -Xcomp, test failed without sharedRuntime fix. The test execution >>> time was over 10 seconds, so I removed it from the jtreg test itself >>> (hard-coded ProcessTools.executeTestJVM()) since it is part of >>> "hotspot_fast_runtime". >>> >>>> Can we transition from Java to VM to native and then back - and if so >>>> might we need to clear the pending exception check? (I'm not sure if >>>> from in the VM a native call could actually be a JNI call, or will >>>> only be a direct native call?). >>> >>> At first I thought JavaCallWrapper needs it, following all the places we >>> manipulate the thread's active handle block (besides manual push/pop). >>> But then call helper just ends up calling the native wrapper, which >>> takes care of it. Not a direct native call. So I left it, as-is. >> >> That's not the case I was thinking of. We have ThreadToNativeFromVM >> and then we do native stuff - if any of that were JNI-based (perhaps >> it is not) then we would enable the check but not disable it again >> when returning from VM to Java. >> > > > Got you now: Java->VM->Native i.e. VM code using JNI may miss an > exception check. So I check the call hierarchy from > "ThreadToNativeFromVM" and found whitebox.cpp had a few spots where > checks were missing, added them in now. > > There's an extra comment stating ThreadToNativeFromVM is expected to be > "well behaved" (i.e. check for exceptions), which it is with the > whitebox.cpp fixes, so we don't require any extra code or overhead in > VM->Java transitions. As far as maintaining "well behaved" JNI code, we > do static code checking with "Parfait" as part of testing, and there are > a few other related bugs that already exist to address these issues. 
> >>>> >>>> Did you intend to leave in the changes to >>>> jdk/src/java.base/share/native/libjli/java.c? It looks like debug/test >>>> code to me. >>> >>> The launcher produces warnings (Java method invokes) that break the >>> jtreg test, so yeah, thought it was best to check and print them. Some >>> of the existing code checks and silently returns, I followed the same >>> pattern where that pattern was in place. >> >> This needs to be looked at closer then and reviewed by the launcher >> folk (ie Kumar). > > CC:ed core-libs & Kumar. Thanks for pointing that out. > >> >>>> >>>> The test I'm finding a bit hard to follow but don't you need to check >>>> for pending exceptions here: >>>> >>>> 29 static jmethodID get_method_id(JNIEnv *env, jclass clz, jstring >>>> jname, jstring jsig) { >>>> 30 jmethodID mid; >>>> 31 const char *name, *sig; >>>> 32 name = (*env)->GetStringUTFChars(env, jname, NULL); >>>> 33 sig = (*env)->GetStringUTFChars(env, jsig, NULL); >>>> 34 mid = (*env)->GetMethodID(env, clz, name, sig); >>>> >>>> to avoid triggering the warning? >>>> >>> Those methods don't require an explicit check since there return values >>> denote an error condition. >>> >>> Whilst Java invoke return values are user defined, so they do need >>> it >>> https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/design.html#asynchronous_exceptions). >>> >>> >>> Technically array stores need to check for AIOOBE, but given most >>> code handles index/bounds checks, it seemed way too pedantic >>> (commented in jniCheck.cpp:176). >> >> Not following. GetStringUTFChars can post OOME so we would enable the >> check-flag if that happens on the first call above, then the second >> call would be made with the exception pending and trigger the warning. > > So as we mentioned off-list, yes this test code should also follow spec, > updated. 
> > Thanks for looking at this, David > /David Simms > From david.holmes at oracle.com Fri Aug 26 01:44:07 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Aug 2016 11:44:07 +1000 Subject: RFR: 8157948: UL allows same log file with multiple file= In-Reply-To: References: Message-ID: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> Hi Marcus, We really need a better way to specify and verify these mini-grammars for command-line options. :( On 25/08/2016 7:31 PM, Marcus Larsson wrote: > Hi, > > Please review the following patch to fix the issue where you could have > the same file added twice as different log outputs in UL if it had the > "file=" prefix or if it was quoted. Log output names are now normalized > during log argument parsing to ensure they are always normalized when > finding existing or adding new outputs. So does this mean that whereas today -Xlog:gc=debug:foo assumes foo is the log file, with this fix you will get an error? > Webrev: > http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ src/share/vm/logging/logFileOutput.cpp Suggestion: const char* prefix = "file="; assert(strstr(name, prefix) == name, "invalid output name '%s': missing prefix: %s", name, prefix); _file_name = make_file_name(name + strlen(prefix), _pid_str, _vm_start_time_str); --- src/share/vm/logging/logConfiguration.cpp Suggestion: static const char* prefix = "file="; In normalize_output_name it is hard for me to work out what the possible "grammar" is, or how different cases will be handled. Currently -Xlog:gc=debug:"file"=foo is treated as -Xlog:gc=debug:file=foo. But with your changes I think the quoting will be handled differently. 
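The normalization idea described in the RFR text (treating foo, file=foo and a quoted name as the same output) can be pictured with a small standalone sketch. The function name `normalize_file_output_name` and its use of `std::string` are assumptions made for illustration; this is not the `normalize_output_name` from the webrev, and it deliberately ignores other output types such as stdout and stderr as well as the question of how a quoted output type should be treated.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the HotSpot implementation): map the different
// spellings of a file output to one canonical "file=<name>" form.
inline std::string normalize_file_output_name(const std::string& raw) {
  static const std::string prefix = "file=";
  std::string name = raw;

  // Drop an explicit "file=" prefix if present; it is re-added below.
  if (name.compare(0, prefix.size(), prefix) == 0) {
    name.erase(0, prefix.size());
  }

  // Drop one pair of surrounding double quotes around the name.
  if (name.size() >= 2 && name.front() == '"' && name.back() == '"') {
    name = name.substr(1, name.size() - 2);
  }

  // Every file output ends up carrying the prefix in its name.
  return prefix + name;
}
```

With a canonical name like this, looking up an existing output and adding a new one can compare plain strings, which is the property the patch description is after.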
Thanks, David > Issue: > https://bugs.openjdk.java.net/browse/JDK-8157948 > > Testing: > New unit test through JPRT > > Thanks, > Marcus From david.holmes at oracle.com Fri Aug 26 03:38:00 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Aug 2016 13:38:00 +1000 Subject: RFR(S): 8164738: Convert AltHashing_test to GTest In-Reply-To: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> References: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> Message-ID: On 25/08/2016 1:47 AM, Kirill Zhaldybin wrote: > Dear all, > > Could you please review this fix for 8164738? Seems okay. > To convert the test I added new friend class to AltHashing class so we > could access private member function static juint murmur3_32(const int* > data, int len). There are also few formating fixes. Any reason all the murmur functions shouldn't be public? I'm not a fan of friends. No big deal either way. Thanks, David > > WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164738/webrev.00/ > CR: https://bugs.openjdk.java.net/browse/JDK-8164738 > > Regards, Kirill From david.holmes at oracle.com Fri Aug 26 03:39:44 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Aug 2016 13:39:44 +1000 Subject: RFR(XS): 8164743: Convert TestAsUtf8 to GTest In-Reply-To: References: Message-ID: <85c5d756-e502-72b3-b247-9dd9a507e136@oracle.com> Looks fine. Thanks, David ----- On 25/08/2016 2:45 AM, Kirill Zhaldybin wrote: > Dear all, > > Could you please review this fix for 8164743? > The test was converted to GTest. > > WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164743/webrev.00/ > CR: https://bugs.openjdk.java.net/browse/JDK-8164743 > > Thank you. > > Regards, Kirill From dmitry.samersoff at oracle.com Fri Aug 26 08:35:30 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 26 Aug 2016 11:35:30 +0300 Subject: PING! 
Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools In-Reply-To: <9fc8324f-4f49-bb09-4f97-ba478d5368be@oracle.com> References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> <57BC4B41.60305@oracle.com> <938232c7-c206-f0db-7446-78960537ad2b@oracle.com> <1f8101d1fd71$feb399d0$fc1acd70$@oracle.com> <9fc8324f-4f49-bb09-4f97-ba478d5368be@oracle.com> Message-ID: <1056deb7-ab07-78c6-c6c6-f8c7c0a93213@oracle.com> Christian, Ioi, Are you OK with this changes? http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.03/ -Dmitry On 2016-08-24 14:42, Dmitry Samersoff wrote: > Christian, > > Thank you for the review. > > Please see updated webrev: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.03/ > > I still have no ideas why this @build construction works with > @run driver but doesn't work with @run main/othervm. > > Is there a chance to have all such knowledge documented? > >> You don't need to explicitly build JpsHelper, > > I would prefer to leave it as is - it's harmless but highlights > TestJpsJar dependency. > >> would it make sense to change this to use the /test/lib ones and > > I'd tried it[1] and it doesn't work. jtreg claims that package > jdk.test.lib doesn't exist.[2] > > > 1. > http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/ > > 2. > http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/TestJpsClass.jtr > > -Dmitry > > On 2016-08-23 22:10, Christian Tornqvist wrote: >> Hi Dmitry, >> >> You don't need to explicitly build JpsHelper, >> I also noticed that >> you're using ProcessTools and OutputAnalyzer from /lib/testlibrary , >> would it make sense to change this to use the /test/lib ones and >> simply have: >> >> @library /test/lib >> >> ? 
>> >> Thanks, Christian -----Original Message----- From: >> hotspot-runtime-dev >> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of >> Dmitry Samersoff Sent: Tuesday, August 23, 2016 3:02 PM To: Ioi Lam >> ; serviceability-dev at openjdk.java.net; >> hotspot-runtime-dev Subject: >> Re: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java >> fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools >> >> Ioi, >> >> Thank you for review. >> >> Hmm. It looks like changes below solves the problem. >> >> - * @build jdk.testlibrary.* JpsHelper JpsBase + * @build JpsHelper >> JpsBase >> >> I'm running rbt job to verify it. >> >> -Dmitry >> >> On 2016-08-23 16:10, Ioi Lam wrote: >>> Hi Dmitry, >>> >>> Why are you adding /test/lib: >>> >>> - * @library /lib/testlibrary + * @library /lib/testlibrary >>> /test/lib >>> >>> The only class used by jdk/test/sun/tools/jps/*.java in /test/lib >>> is here: >>> >>> TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp; >>> >>> But TestJpsSanity.java is not use by this test -- I ran the test >>> with your patch in a clean jtreg directory and the test passed, but >>> I don't see TestJpsSanity.class, or any jdk.test.lib.* class. >>> >>> So I don't think you need to add /test/lib. >>> >>> - Ioi >>> >>> On 8/23/16 5:34 AM, Dmitry Samersoff wrote: >>>> On 2016-08-17 10:51, Dmitry Samersoff wrote: >>>>> Everybody, >>>>> >>>>> Please review the changes: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ >>>>> >>>>> -Dmitry >>>>> >>>> >>> >> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >> Russia * I would love to change the world, but they won't give me the >> sources. >> > > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From dmitry.samersoff at oracle.com Fri Aug 26 11:00:15 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Fri, 26 Aug 2016 14:00:15 +0300 Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate Message-ID: Everybody, Please review the fix. http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/ *Problem* Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is received, jvmtiAllocate could be called after call to cbEarlyVMDeath. cbEarlyVMDeath set gdata->jvmti to NULL, so jvmtiAllocate crashes. The problem appears only once in nightly testing and I was not able to reproduce it locally. *Solution* Guard added to jvmtiAllocate to get meaningful error message instead of crash. These fix doesn't fix root cause - JVMTI_ERROR_WRONG_PHASE problem is going to be addressed under JDK-8134103. -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.simms at oracle.com Fri Aug 26 11:55:48 2016 From: david.simms at oracle.com (David Simms) Date: Fri, 26 Aug 2016 13:55:48 +0200 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: <58f4a6da-b3d0-a1c7-9086-a34883e6b590@oracle.com> References: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> <7b2fe986-fca9-dbd1-d20e-17934ac1bc08@oracle.com> <58f4a6da-b3d0-a1c7-9086-a34883e6b590@oracle.com> Message-ID: <70aacfa1-ffa8-a4a9-8092-4cc12949d027@oracle.com> Hi David, Updated webrev: http://cr.openjdk.java.net/~dsimms/8164086/webrev2/ On 26/08/16 02:27, David Holmes wrote: > Hi David, > > I'm missing some pieces of this puzzle I'm afraid. 
> > On 25/08/2016 8:05 PM, David Simms wrote: >> >> Updated the webrev here: >> http://cr.openjdk.java.net/~dsimms/8164086/webrev1/ > > hotspot/src/share/vm/prims/whitebox.cpp > > First I'm not sure that Whitebox isn't a special case here that could > be handled in the WB_START/END macros - see below. > > More generally you state below that the transition from native back to > the VM doesn't have to do anything with the pending_exception_check > flag because well behaved native code in that context will explicitly > check for exceptions, and so the pending-exception-check will already > be disabled before returning to Java. First, if that is the case then > we should assert that it is so in the native->VM return transition. Agreed, inserted assert. > > Second though, it doesn't seem to be the case in Whitebox because the > CHECK_JNI_EXCEPTION_ macro simply calls HAS_PENDING_EXCEPTION and so > won't touch the pending-exception-check flag. ?? Doh, you are correct...I mistook this for the CHECK_JNI_EXCEPTION macro in "java.c" which does perform check... > > It was a good pick up that some whitebox code was using values that > might be NULL because an exception had occurred. There are a couple of > changes that are unnecessary though: > > 1235 result = env->NewObjectArray(5, clazz, NULL); > 1236 CHECK_JNI_EXCEPTION_(env, NULL); > 1237 if (result == NULL) { > 1238 return result; > 1239 } > > (and similarly at 1322) > > result will be NULL iff there is a pending exception; and vice-versa. > So the existing check for NULL suffices for correctness. If you want > to check exceptions for the side-effect of clearing the > pending-exception-check flag then lines 1237-1239 can be deleted. > However I would suggest that if you truly do want to clear the > pending-exception-check flag then the place to do it is the WB_END > macro. 
That allows allows exception checks at the end of methods, eg: > > 1261 env->SetObjectArrayElement(result, 4, entry_point); > 1262 CHECK_JNI_EXCEPTION_(env, NULL); > 1263 > 1264 return result; > > to be elided. > Agreed, introduce StackObj with appropriate destructor, removed the checks above. > --- > > hotspot/src/share/vm/runtime/thread.hpp > > ! // which function name. Returning to a Java frame should > implicitly clear the > ! // need for, this is done for Native->Java transitions. > > Seems to be some text missing after "need for". Thanks for seeing that, fixed. > > --- > > For the tests we no longer use bug numbers as part of the test names. > Looks like some recent tests slipped by unfortunately. :( > Moved to "test/runtime/jni/checked" > You should be able to get rid of the: > > * @modules java.base/jdk.internal.misc > > with Christian's just pushed changes to ProcessTools to isolate the > Unsafe dependency. > Done >> core-libs & Kumar: java launcher: are you okay with the >> CHECK_EXCEPTION_PRINT macro, or would you rather it was silent (i.e. >> CHECK_EXCEPTION_RETURN) ? > > I'm not seeing the point of this logic. Any exceptions that remain > pending when the main thread detaches from the VM will be reported by > the uncaught-exception handling logic. The checks you put in are in > most cases immediately before a return so there is no need to check > for a pending exception and do an earlier return. And in one case you > would bypass tracing logic by doing an early return. Removed all the extra checks, add JNI exception check to within the existing CHECK_NULL0 macro (make more sense there). > > I had assumed this was just some debugging code you had left in by > mistake. The method invocations needed to find main class needs to check for the test to pass. 
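The "StackObj with appropriate destructor" approach mentioned above can be pictured with a standalone sketch. `FakeThread`, its `pending_exception_check` field and `ClearPendingCheckOnExit` are stand-ins invented for illustration; they are not the HotSpot types or the code in the webrev. The only point is that a scope guard created at the top of a whitebox-style function clears the flag on every return path, so separate checks before each `return` are no longer needed.

```cpp
#include <cassert>

// Stand-in for per-thread state (hypothetical, not JavaThread): records the
// name of the JNI function whose result still needs an exception check.
struct FakeThread {
  const char* pending_exception_check = nullptr;
};

// Scope guard: clears the pending check when the enclosing scope exits,
// regardless of which return statement was taken.
class ClearPendingCheckOnExit {
 public:
  explicit ClearPendingCheckOnExit(FakeThread* t) : _thread(t) {}
  ~ClearPendingCheckOnExit() { _thread->pending_exception_check = nullptr; }
  ClearPendingCheckOnExit(const ClearPendingCheckOnExit&) = delete;
  ClearPendingCheckOnExit& operator=(const ClearPendingCheckOnExit&) = delete;

 private:
  FakeThread* _thread;
};

// Simulated function with an early and a normal return path. The string
// assignments mimic what a checked-JNI call would record.
inline int simulated_wb_function(FakeThread* t, bool fail_early) {
  ClearPendingCheckOnExit guard(t);
  t->pending_exception_check = "NewObjectArray";
  if (fail_early) {
    return -1;  // guard still clears the flag here
  }
  t->pending_exception_check = "SetObjectArrayElement";
  return 0;     // and here
}
```

Putting the cleanup in a destructor keeps it in one place (the equivalent of the WB_END suggestion) rather than repeating it at the end of each method.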
Cheers /David From marcus.larsson at oracle.com Fri Aug 26 12:11:36 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Fri, 26 Aug 2016 14:11:36 +0200 Subject: RFR: 8157948: UL allows same log file with multiple file= In-Reply-To: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> References: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> Message-ID: Hi David, Thanks for looking at this! New webrev: http://cr.openjdk.java.net/~mlarsson/8157948/webrev.01/ Incremental: http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00-01/ See replies below. On 08/26/2016 03:44 AM, David Holmes wrote: > Hi Marcus, > > We really need a better way to specify and verify these mini-grammars > for command-line options. :( Yeah, I'm all for something like that. > > On 25/08/2016 7:31 PM, Marcus Larsson wrote: >> Hi, >> >> Please review the following patch to fix the issue where you could have >> the same file added twice as different log outputs in UL if it had the >> "file=" prefix or if it was quoted. Log output names are now normalized >> during log argument parsing to ensure they are always normalized when >> finding existing or adding new outputs. > > So does this mean that whereas today > > -Xlog:gc=debug:foo > > assumes foo is the log file, with this fix you will get an error? No, the file= prefix will be assumed just like before. The parse step will now explicitly add it in the case that it wasn't specified. So every LogFileOutput instance created will have the prefix in its name. > >> Webrev: >> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ > > src/share/vm/logging/logFileOutput.cpp > > Suggestion: > > const char* prefix = "file="; > assert(strstr(name, prefix) == name, "invalid output name '%s': > missing prefix: %s", name, prefix); > _file_name = make_file_name(name + strlen(prefix), _pid_str, > _vm_start_time_str); Fixed, see below. 
> > --- > > src/share/vm/logging/logConfiguration.cpp > > Suggestion: > > static const char* prefix = "file="; I've refactored all "file=" literals into constants, but I made the constant a field of LogFileOutput. I think it fits better there, let me know if you think otherwise. > > In normalize_output_name it is hard for me to work out what the > possible "grammar" is, or how different cases will be handled. > Currently -Xlog:gc=debug:"file"=foo is treated as > -Xlog:gc=debug:file=foo. But with your changes I think the quoting > will be handled differently. Actually -Xlog:gc=debug:"file"=foo should give an error, since quoting the output types isn't supported (only the name can be quoted). This should just be a refactoring to make sure we're always managing the output names in a uniform manner (so that file="foo" and file=foo isn't treated as two different log outputs). BTW, take care if you're testing this on the command line, as the shell might be stripping away quotes in the arguments for you. Thanks, Marcus > > Thanks, > David > >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8157948 >> >> Testing: >> New unit test through JPRT >> >> Thanks, >> Marcus From ioi.lam at oracle.com Fri Aug 26 13:16:56 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 26 Aug 2016 06:16:56 -0700 Subject: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools In-Reply-To: <1056deb7-ab07-78c6-c6c6-f8c7c0a93213@oracle.com> References: <12dc677b-e33c-0345-4680-e97cc1604cbe@oracle.com> <57BC4B41.60305@oracle.com> <938232c7-c206-f0db-7446-78960537ad2b@oracle.com> <1f8101d1fd71$feb399d0$fc1acd70$@oracle.com> <9fc8324f-4f49-bb09-4f97-ba478d5368be@oracle.com> <1056deb7-ab07-78c6-c6c6-f8c7c0a93213@oracle.com> Message-ID: <57C04148.5080502@oracle.com> Looks good. Thanks Dmitry! - Ioi On 8/26/16 1:35 AM, Dmitry Samersoff wrote: > Christian, Ioi, > > Are you OK with this changes? 
> > http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.03/ > > -Dmitry > > > On 2016-08-24 14:42, Dmitry Samersoff wrote: >> Christian, >> >> Thank you for the review. >> >> Please see updated webrev: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.03/ >> >> I still have no ideas why this @build construction works with >> @run driver but doesn't work with @run main/othervm. >> >> Is there a chance to have all such knowledge documented? >> >>> You don't need to explicitly build JpsHelper, >> I would prefer to leave it as is - it's harmless but highlights >> TestJpsJar dependency. >> >>> would it make sense to change this to use the /test/lib ones and >> I'd tried it[1] and it doesn't work. jtreg claims that package >> jdk.test.lib doesn't exist.[2] >> >> >> 1. >> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/ >> >> 2. >> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.02.bad/TestJpsClass.jtr >> >> -Dmitry >> >> On 2016-08-23 22:10, Christian Tornqvist wrote: >>> Hi Dmitry, >>> >>> You don't need to explicitly build JpsHelper, >>> I also noticed that >>> you're using ProcessTools and OutputAnalyzer from /lib/testlibrary , >>> would it make sense to change this to use the /test/lib ones and >>> simply have: >>> >>> @library /test/lib >>> >>> ? >>> >>> Thanks, Christian -----Original Message----- From: >>> hotspot-runtime-dev >>> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of >>> Dmitry Samersoff Sent: Tuesday, August 23, 2016 3:02 PM To: Ioi Lam >>> ; serviceability-dev at openjdk.java.net; >>> hotspot-runtime-dev Subject: >>> Re: PING! Re: RFR(XS): JDK-8160923: sun/tools/jps/TestJpsJar.java >>> fails due to ClassNotFoundException: jdk.testlibrary.ProcessTools >>> >>> Ioi, >>> >>> Thank you for review. >>> >>> Hmm. It looks like changes below solves the problem. >>> >>> - * @build jdk.testlibrary.* JpsHelper JpsBase + * @build JpsHelper >>> JpsBase >>> >>> I'm running rbt job to verify it. 
>>> >>> -Dmitry >>> >>> On 2016-08-23 16:10, Ioi Lam wrote: >>>> Hi Dmitry, >>>> >>>> Why are you adding /test/lib: >>>> >>>> - * @library /lib/testlibrary + * @library /lib/testlibrary >>>> /test/lib >>>> >>>> The only class used by jdk/test/sun/tools/jps/*.java in /test/lib >>>> is here: >>>> >>>> TestJpsSanity.java:import jdk.test.lib.apps.LingeredApp; >>>> >>>> But TestJpsSanity.java is not use by this test -- I ran the test >>>> with your patch in a clean jtreg directory and the test passed, but >>>> I don't see TestJpsSanity.class, or any jdk.test.lib.* class. >>>> >>>> So I don't think you need to add /test/lib. >>>> >>>> - Ioi >>>> >>>> On 8/23/16 5:34 AM, Dmitry Samersoff wrote: >>>>> On 2016-08-17 10:51, Dmitry Samersoff wrote: >>>>>> Everybody, >>>>>> >>>>>> Please review the changes: >>>>>> >>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8160923/webrev.01/ >>>>>> >>>>>> -Dmitry >>>>>> >>> >>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >>> Russia * I would love to change the world, but they won't give me the >>> sources. 
>>> >> > From ioi.lam at oracle.com Fri Aug 26 13:34:01 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 26 Aug 2016 06:34:01 -0700 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> Message-ID: <57C04549.4030401@oracle.com> On 8/25/16 10:37 AM, Kim Barrett wrote: >> On Aug 25, 2016, at 4:26 AM, Ioi Lam wrote: >> >> On 8/24/16 11:52 PM, Kim Barrett wrote: >>>> On Aug 24, 2016, at 8:46 PM, Ioi Lam wrote: >>>> >>>> Hi Kim, >>>> >>>> Thanks for pointing out the problems with the shift operators. I never knew that! >>>> >>>> Since I am shifting only by 16, can change the expressions to these? >>>> >>>> jint(add_value) * 0x10000 >>> Yes. I remembered that technique shortly after hitting send. >>> Multiplying a signed value doesn't work (is UB) in the general case >>> because of overflow, but we know the value ranges here are safe from >>> that. >>> >>>> jshort(new_value / 0x10000) >>> No, because under 2s complement arithmetic, arithmetic right shift of >>> a negative number is not necessarily equivalent to division by the >>> corresponding power of 2. [See, for example, "Arithmetic shifting >>> considered harmful", Guy Steele, ACM SIGPLAN Notices, 11/1977.] >>> >>> Consider the 32bit value with all 1s in the upper 16 bits, and a >>> non-zero value in the lower 16 bits. 
If division is truncate, which >>> it is defined to be for C99/C++11 (*), that value / 0x10000 == 0, >>> rather than the desired -1. Clear the low 16bits first and then >>> divide, and I think it works for the case at hand, though I haven't >>> proved it. But pragmatically we're probably better off assuming >>> right shift works as expected, though perhaps in a wrapper to >>> help indicate we've actually thought about the issue. >>> >>> (*) For C89/C++98 the rounding of division involving negative operands >>> is implementation defined, perhaps in part to allow the "optimization" >>> of division by a power of 2 to arithmetic right shift. >>> >>>> Does C/C++ preserve signs when multiplying/dividing with a positive constant? >> Hi Kim, >> >> I looked for the use of >> in our source code: >> >> globalDefinitions.hpp: >> inline jint high(jlong value) { return jint(value >> 32); } >> >> sharedRuntimeTrans.cpp: >> static double __ieee754_log(double x) { >> double hfsq,f,s,z,R,w,t1,t2,dk; >> int k,hx,i,j; >> ... >> k += (hx>>20)-1023; >> >> So maybe we already assume that >> "does the right thing" for us? > Yes, that's what I expected to see. > >> ------------------------------- >> >> Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. globalDefinitions.hpp has these: > The suggestion of using a wrapper was a "perhaps", and not really > intended for you to deal with while addressing the problem at hand. > Sorry I was confusing about that. > > If we do something along that line (as a separate project), I suggest > we keep with the names we've already got for similar operations, > e.g. use java_shift_{left,right} to be consistent with java_add and > friends. Hi Kim, Thanks for the clarification. My RBT tests passed, so I will check in the code as is in my last webrev using the >> and << operators. I'll leave the general problem of java_shift_left/right as a future improvement. 
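The difference between truncating division and arithmetic right shift discussed in this thread can be checked with a small standalone sketch. The helper names (`high16_by_division`, `high16_by_shift`, `high16_portable`, `make_int_from_halves`) are illustrative only and are not part of globalDefinitions.hpp; `int32_t` and `int16_t` stand in for jint and jshort. `high16_by_shift` assumes that right shift of a negative value is arithmetic, which is implementation-defined before C++20 but is what mainstream compilers do.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division rounds toward zero (C99/C++11), so a negative value
// whose low 16 bits are non-zero does not yield the upper half.
inline int32_t high16_by_division(int32_t x) {
  return x / 0x10000;
}

// Right shift of a signed value; assumed to be an arithmetic shift here.
inline int32_t high16_by_shift(int32_t x) {
  return x >> 16;
}

// Variant that avoids shifting a signed value: shift as unsigned, then
// sign-extend the 16-bit result by hand.
inline int32_t high16_portable(int32_t x) {
  uint32_t hi = static_cast<uint32_t>(x) >> 16;  // 0 .. 0xffff
  return hi >= 0x8000u ? static_cast<int32_t>(hi) - 0x10000
                       : static_cast<int32_t>(hi);
}

// Combine two signed 16-bit halves without left-shifting a negative value.
inline int32_t make_int_from_halves(int16_t low, int16_t high) {
  uint32_t bits = (static_cast<uint32_t>(static_cast<uint16_t>(high)) << 16) |
                  static_cast<uint32_t>(static_cast<uint16_t>(low));
  // Convert the bit pattern back to a signed value without relying on
  // out-of-range unsigned-to-signed conversion.
  return bits <= 0x7fffffffu ? static_cast<int32_t>(bits)
                             : -static_cast<int32_t>(~bits) - 1;
}
```

For the value with all ones in the upper 16 bits and 1 in the lower 16 bits (that is, -65535), the division variant returns 0 while the shift-based variants return -1, which is the case described in the quoted text.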
Thanks - Ioi >> inline int extract_low_short_from_int(jint x) { >> return x & 0xffff; >> } >> inline int extract_high_short_from_int(jint x) { >> return (x >> 16) & 0xffff; >> } >> inline int build_int_from_shorts( jushort low, jushort high ) { >> return ((int)((unsigned int)high << 16) | (unsigned int)low); >> } >> >> I am thinking of adding: >> >> inline int extract_signed_high_short_from_int(jint x) { >> if (x >= 0) { >> return (x >> 16) & 0xffff; >> } else { >> return int(((unsigned int)x >> 16) | 0xffff0000); >> } >> } >> >> inline int build_int_from_shorts(jshort low, jshort high) { >> return build_int_from_shorts(jushort(low) & 0xffff, jushort(high) & 0xffff); >> } >> >> What do you think? >> - Ioi > From rachel.protacio at oracle.com Fri Aug 26 14:44:34 2016 From: rachel.protacio at oracle.com (Rachel Protacio) Date: Fri, 26 Aug 2016 10:44:34 -0400 Subject: RFR(XS): 8164743: Convert TestAsUtf8 to GTest In-Reply-To: References: Message-ID: Looks good to me too. Rachel On 8/24/2016 12:45 PM, Kirill Zhaldybin wrote: > Dear all, > > Could you please review this fix for 8164743? > The test was converted to GTest. > > WebRev: > http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164743/webrev.00/ > CR: https://bugs.openjdk.java.net/browse/JDK-8164743 > > Thank you. 
> > Regards, Kirill From kim.barrett at oracle.com Fri Aug 26 18:05:39 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 Aug 2016 14:05:39 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57C04549.4030401@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> <57C04549.4030401@oracle.com> Message-ID: > On Aug 26, 2016, at 9:34 AM, Ioi Lam wrote: >>> Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. globalDefinitions.hpp has these: >> The suggestion of using a wrapper was a "perhaps", and not really >> intended for you to deal with while addressing the problem at hand. >> Sorry I was confusing about that. >> >> If we do something along that line (as a separate project), I suggest >> we keep with the names we've already got for similar operations, >> e.g. use java_shift_{left,right} to be consistent with java_add and >> friends. > > Hi Kim, > > Thanks for the clarification. > > My RBT tests passed, so I will check in the code as is in my last webrev using the >> and << operators. I'll leave the general problem of java_shift_left/right as a future improvement. > > Thanks > - Ioi > It seems my attempt at clarification has led to further confusion. Recent discussion in this thread has been focused on the right shift, e.g. assuming it "does the right thing". The left shift is *broken*. 
Recent versions of gcc *will* do something other than what is being expected. We've already seen reports of (and fixed) problems encountered by folks using gcc6 for exactly this sort of thing. See, for example, JDK-8157758. In the specific case at hand, the compiler can trivially prove that undefined behavior is being invoked, because of the constant -1 being passed to an inline function where the shift occurs. What it does from there is anyone's guess; gcc6 seems to be treating such things as unreachable code and optimizing accordingly. So not fixing the left shift is just leaving a land mine for someone else to step on. Please don't do that. From frederic.parain at oracle.com Fri Aug 26 20:00:53 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 26 Aug 2016 16:00:53 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit Message-ID: Hi, Please review this fix for bug JDK-8137035 The bug is confidential but it is related to several VM crashes that occurred on the Windows 64 bits platform in stack overflow conditions. I've copied/pasted the analysis of the bug and the description of the fix below. Webrev: http://cr.openjdk.java.net/~fparain/8137035/webrev.00/ Testing: JPRT (testset hotspot) and nsk.stress Thanks, Fred --------- All these crashes related to stack overflows on Windows have presumably the same causes: - an undersized StackShadowPages parameter - the behavior of guard pages on Windows - a flaw in Yellow Pages management These three factors combined together can lead to sporadic crashes of the JVM when stack overflow conditions are encountered. All the crashes listed in this CR and in the related CR are almost impossible to reproduce, which indicates that the issue only shows up in some extreme or uncommon conditions. By design, the JVM crashes on stack overflow only if the Red Zone (the last one in the execution stack) is hit. 
Before the Red Zone, there's the Yellow Zone which is here to detect and handle stack overflows in a nicer way (throwing a StackOverflowError instead of crashing the process). If the Red Zone is hit, it means that the Yellow Zone was disabled, and there are only two cases where the Yellow Zone is disabled:

1 - when a potential stack overflow is detected in Java code; in this case the Yellow Zone is disabled during the generation of the StackOverflowError and restored during the propagation of the StackOverflowError
2 - when a stack overflow occurs either in native code or in JVM code, because there's nothing else the JVM can do.

In several crashes, the call stack doesn't show any special recursive Java calls that could suggest the JVM is in case 1. But they show relatively complex code paths inside JVM code (de-optimization or class/symbol resolution), which suggests that case 2 occurred.

The case of stack overflow in native code is straightforward: if the Yellow Zone is hit, it is disabled, but when a JavaThread returns from native code to Java code, the Yellow Zone is systematically re-enabled (this is part of the native call wrapper generated by the JVM).

The case of stack overflow in JVM code is more problematic. The JVM tries to avoid the case of stack overflow in VM code with the Shadow Pages mechanism. Whenever a Java method is invoked, the JVM tries to ensure that there's enough free stack space to execute the Java method and *any call to the JVM code (or JDK native code) that could occur during the execution of this method*. This check is performed by banging (touching) n pages ahead on the execution stack, and n is set to StackShadowPages. If the Yellow Zone is hit during the stack banging, a StackOverflowError is thrown before the execution of the first bytecode of the Java method. But this mechanism assumes that StackShadowPages pages is big enough to cover *any call to the JVM*. If this assumption is wrong, bad things happen.
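The stack banging check described above can be sketched as a toy model. This is not HotSpot code: StackModel, BangResult and bang_shadow_pages are hypothetical names, addresses are plain numbers rather than real memory, and the Red Zone is left out. The sketch only shows why the check depends on both the number of shadow pages and on the Yellow Zone protection being enabled.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model only (assumed names). The stack grows towards lower addresses
// and the Yellow Zone occupies [yellow_bottom, yellow_top).
struct StackModel {
  std::uintptr_t yellow_top;     // first address above the Yellow Zone
  std::uintptr_t yellow_bottom;  // lowest address inside the Yellow Zone
  std::size_t    page_size;
  bool           yellow_enabled; // is the zone currently protected?
};

enum class BangResult {
  Ok,                  // no banged page fell into the Yellow Zone
  StackOverflowError,  // a protected yellow page was touched
  Unprotected          // a yellow page was touched, but it was not protected
};

// Touch one address in each of the `shadow_pages` pages below `sp`,
// as a method entry would do before running its first bytecode.
inline BangResult bang_shadow_pages(const StackModel& s,
                                    std::uintptr_t sp,
                                    std::size_t shadow_pages) {
  bool touched_yellow = false;
  for (std::size_t i = 1; i <= shadow_pages; ++i) {
    std::uintptr_t addr = sp - i * s.page_size;
    if (addr >= s.yellow_bottom && addr < s.yellow_top) {
      if (s.yellow_enabled) {
        return BangResult::StackOverflowError;  // the fault is taken here
      }
      touched_yellow = true;  // no fault: the overflow goes unnoticed
    }
  }
  return touched_yellow ? BangResult::Unprotected : BangResult::Ok;
}
```

In this model a larger shadow_pages value reports the overflow earlier, while a disabled Yellow Zone turns the same banging into a silent no-op.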
I ran experiments with tests for which stack overflow related crashes were reported. I ran them with a JVM where the StackShadowPages value was decreased by only 1 compared to the usual default value. It was very easy to reproduce stack overflow crashes. By instrumenting the JVM, it appeared that some threads hit the Yellow Zone while having thread state _thread_in_vm. This means that in many cases, the margin between the stack space provided by StackShadowPages and the real stack usage while executing VM code is less than one page. And because knowing the biggest stack requirement to execute any JVM code is an undecidable problem, there's a high probability that some paths require more stack space than StackShadowPages ensures. It is important to notice that Windows is the platform with the smallest default value for StackShadowPages.

So, an undersized StackShadowPages could cause the Yellow Zone to be hit while executing JVM code. On Unices (Solaris, Linux, MacOSX), the sanction is immediate: a SIGSEGV signal is sent, but because there's no more free space on the execution stack, the signal handler cannot be executed and the JVM process is killed. It's a crash without hs_error file generation.

On Windows, the story is different. Yellow Pages are marked with the "Guard" bit. When a page with the Guard bit set is touched, the current thread receives an exception, but before the exception handler is executed, the OS removes the Guard bit from the page, so the page that triggered the fault can be used to execute the signal handler. So on Windows, when the Yellow Zone is hit while executing JVM code, the JVM doesn't die as it does on Unix systems; instead the signal handler is executed.
The logic in the signal handler looks like this (simplified version):

  if thread touches the yellow zone:
      if thread_in_java:
          disable yellow pages
          jump to code throwing StackOverflowError
          // note: yellow pages will be re-enabled
          // while unwinding the stack
      else:
          // thread_in_vm or thread_in_native
          disable yellow pages
          resume execution
  else:
      // Fatal red zone violation.
      disable red pages
      generate VM crash

So, the signal handler disables the protection of the Yellow Pages and resumes JVM code execution.

Eventually, the thread will return from the VM and will continue executing Java code. But at this point, the yellow pages are still disabled and there's no systematic check to ensure that Yellow Pages are re-enabled when returning to Java. The only places where the JVM checks whether Yellow Pages need to be re-activated are when returning from native code and in the exception propagation code (but not all paths reactivate the Yellow Zone).

Once the execution of Java code has resumed with the yellow zone disabled, the thread is not protected any more against stack overflows. The only remaining protection is the red zone, and if it is hit, the VM will generate a crash report and die. Note that having the Yellow Zone de-activated makes the stack banging of StackShadowPages ineffective. Stack banging relies on the Yellow Pages being activated, so that touching them triggers a signal. If Yellow Pages are de-activated (unprotected) no signal is sent, unless the stack banging hits the Red Page, which triggers a VM crash with hs_error file generation.

To summarize: an undersized StackShadowPages on Windows can lead to a JavaThread executing Java code with Yellow Pages disabled, which means without any stack overflow protection except the Red Zone, which is the one triggering VM crashes with hs_error file generation.

Note that the Yellow Pages can be "incidentally" re-activated by a call to native code or by throwing an exception.
This could explain why stack overflow crashes are not so frequent: the time window during which Java code is executed without stack overflow protection might be small for some applications.

Proposed fixes for this issue:
- increase StackShadowPages for the Windows platform
- add an assertion in the signal handler to detect a thread hitting the Yellow Zone while executing JVM code (to detect undersized StackShadowPages during our testing)
- ensure Yellow Pages are activated when transitioning from _thread_in_vm to _thread_in_java

From gerard.ziemski at oracle.com Fri Aug 26 21:06:06 2016
From: gerard.ziemski at oracle.com (Gerard Ziemski)
Date: Fri, 26 Aug 2016 16:06:06 -0500
Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit
In-Reply-To:
References:
Message-ID:

hi Frederic,

I have 2 general questions:

> On Aug 26, 2016, at 3:00 PM, Frederic Parain wrote:
>
> Hi,
>
> Please review this fix for bug JDK-8137035
> The bug is confidential but it is related to several VM crashes
> that occurred on the Windows 64 bits platform in stack overflow
> conditions. I've copied/pasted the analysis of the bug and the
> description of the fix below.
>
> Webrev:
> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/
>
> Testing: JPRT (testset hotspot) and nsk.stress
>
> Thanks,
>
> Fred
>
> ---------
>
> All these crashes related to stack overflows on Windows have presumably the same causes:
> - an undersized StackShadowPages parameter
> - the behavior of guard pages on Windows
> - a flaw in Yellow Pages management
>
> These three factors combined together can lead to sporadic crashes of the JVM when stack overflow conditions are encountered.
>
> All the crashes listed in this CR and in the related CR are almost impossible to reproduce, which indicates that the issue only shows up in some extreme or uncommon conditions. By design, the JVM crashes on stack overflow only if the Red Zone (the last one in the execution stack) is hit.
Before the Red Zone, there's the Yellow Zone which is here to detect and handle stack overflows in a nicer way (throwing a StackOverflowError instead of crashing the process). If the Red zone is hit, it means that the Yellow Zone was > disabled, and there's only two cases where the Yellow Zone is disabled: > > 1 - when a potential stack overflow is detected in Java code, in this case the Yellow Zone is disabled during the generation of the StackOverflowError and restored during the propagation of the StackOverflowError > 2 - when a stack overflow occurs either in native code or in JVM code, because there's anything else the JVM can do. > > In several crashes, the call stack doesn't show any special recursive Java calls that could suggest the JVM is in case 1. But they show relatively complex code paths inside JVM code (de-optimization or class/symbol resolution), which suggests that case 2 occurred. > > The case of stack overflow in native code is straight forward: if the Yellow Zone is hit, it is disabled, but when a JavaThread returns from native code to Java code, the Yellow Zone is systematically re-enabled (this is part of the native call wrapper > generated by the JVM). > > The case of stack overflow in JVM code is more problematic. The JVM tries to avoid the case of stack overflow in VM code with the Shadow Pages mechanism. Whenever a Java method is invoked, the JVM tries to ensure that there's enough free stack space to execute the Java method and *any call to the JVM code (or JDK native code) that could occur during the execution of this method*. This check is performed by banging (touching) n pages ahead on the execution stack, and n is set to StackShadowPages. If the Yellow Zone is hit during the stack banging, a StackOverflowError is thrown before the execution of the first bytecode of the Java method. But this mechanism assumes that StackShadowPages pages is big enough to cover *any call to the JVM*. If this assumption is wrong, so > bad things happen. 
> > I ran experiments with tests for which stack overflow related crashes were reported. I ran them with a JVM where the StackShadowPages value was decreased by only 1 compared the usual default value. It was very easy to reproduce stack overflow crashes. By instrumenting the JVM, it appeared that some threads hit the Yellow Zone while having thread state _thread_in_vm. Which means that in many cases, the margin between the stack space provided by StackShadowPages and the real stack usage while executing VM code is less than one page. And because knowing the biggest stack requirement to execute any JVM code is an undecidable problem, Is it really an undecidable problem? Why is that exactly? > there's a high probability that some paths require more stack space than StackShadowPages ensures. It is important to notice > that Windows is the platform with the smallest default value for StackShadowPages. > > So, an undersized StackShadowPages could cause the Yellow Zone to be hit while executing JVM code. On Unices (Solaris, Linux, MacOSX), the sanction is immediate: a SIGSEGV signal is sent, but because there's no more free space on the execution stack, the signal handler cannot be executed and the JVM process is killed. It's a crash without hs_error file generation. > > On Windows, the story is different. Yellow Pages are marked with the "Guard" bit. When a page with a Guard bit set is touched, the current thread receives an exception, but before the exception handler is executed, the OS remove the Guard bit from the page, so the page that trigger the fault can be used to execute the signal handler. So on Windows, when the Yellow Zone is hit while executing JVM code, the JVM doesn't die like on Unices systems, but the signal handler is executed. 
> > The logic in the signal handler looks like this (simplified version): > > if thread touches the yellow zone: > if thread_in_java: > disable yellow pages > jump to code throwing StackOverflowError > // note: yellow pages will be re-enabled > // while unwinding the stack > else: > // thread_in_vm or thread_in_native > disable yellow pages > resume execution > else: > // Fatal red zone violation. > disable red pages > generate VM crash > > So, the signal handler disable the protection of the Yellow Pages and resume JVM code execution. > > Eventually, the thread will return from the VM and will continue executing Java code. But at this point, the yellow pages are still disabled and there's no systematic check to ensure that Yellow Pages are re-enabled when returning to Java. The only places where the JVM checks if Yellow Pages need to be re-activated is when returning from native code or in the exception propagation code (but not all paths reactivate the Yellow Zone). > > Once the execution of Java code has resumed with the yellow zone disabled, the thread is not protected any more against stack overflows. The only remaining protection is the red zone, and if it is hit, the VM will generate a crash report and die. Note that having Yellow Zone de-activated makes the stack banging of StackShadowPages inefficient. Stack banging relies on the Yellow Pages to be activated, so touching them triggers a signal. If Yellow Pages are de-activated (unprotected) no signal is sent, unless the stack banging hits the Red Page, which triggers a VM crash with hs_error file generation. > > > To summarize: an undersized StackShadowPages on Windows can lead to a JavaThread executing Java code with Yellow Pages disabled, which means without any stack overflow protection except the Red Zone which is the one triggering VM crashes with hs_error file generation. > > Note that the Yellow Pages can be "incidentally" re-activated by a call to native code or by throwing an exception. 
Which could explain why stack overflow crashes are not so frequent, the time window during which Java code is executed without stack overflow protection might be small for some applications.
>
>
> Proposed fixes for this issue:
> - increase StackShadowPages for the Windows platform

Why are we so stingy with the size of the default shadow pages on Windows? Even with your fix, which increases it by one, it's only 7, compared to 20 on other platforms.

Why can't we have 20 pages of default shadow pages on Windows? Wouldn't that significantly decrease the chance of hitting the yellow pages, if we can't guarantee that all calls to VM fit?

> - add assertion is signal handler to detect thread hitting the Yellow Zone while executing JVM code (to detect undersized StackShadowPages during our testing)
> - ensure Yellow Pages are activated when transitioning from _thread_in_vm to _thread_in_java
>

From chris.plummer at oracle.com Fri Aug 26 21:35:34 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 26 Aug 2016 14:35:34 -0700
Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate
In-Reply-To:
References:
Message-ID:

Hi Dmitry,

Although the fix is addressing the specific issue described in the bug, what about the general issue of referencing gdata after a call to cbEarlyVMDeath(). Do more references to gdata need to be protected?

Also, is there the possibility of a multi-threading race condition here? Could gdata be cleared by another thread after it is checked?

thanks,

Chris

On 8/26/16 4:00 AM, Dmitry Samersoff wrote:
> Everybody,
>
> Please review the fix.
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/
>
> *Problem*
>
> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is received,
> jvmtiAllocate could be called after call to cbEarlyVMDeath.
>
> cbEarlyVMDeath set gdata->jvmti to NULL, so jvmtiAllocate crashes.
>
> The problem appears only once in nightly testing and I was not able to
> reproduce it locally.
> > *Solution* > > Guard added to jvmtiAllocate to get meaningful error message instead of > crash. > > These fix doesn't fix root cause - JVMTI_ERROR_WRONG_PHASE problem is > going to be addressed under JDK-8134103. > > -Dmitry > From frederic.parain at oracle.com Fri Aug 26 21:53:45 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Fri, 26 Aug 2016 17:53:45 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: References: Message-ID: <5ea72a2f-87b5-c2b7-e05b-5ea5025ccadf@oracle.com> On 08/26/2016 05:06 PM, Gerard Ziemski wrote: >> The case of stack overflow in JVM code is more problematic. The JVM tries to avoid the case of stack overflow in VM code with the Shadow Pages mechanism. Whenever a Java method is invoked, the JVM tries to ensure that there's enough free stack space to execute the Java method and *any call to the JVM code (or JDK native code) that could occur during the execution of this method*. This check is performed by banging (touching) n pages ahead on the execution stack, and n is set to StackShadowPages. If the Yellow Zone is hit during the stack banging, a StackOverflowError is thrown before the execution of the first bytecode of the Java method. But this mechanism assumes that StackShadowPages pages is big enough to cover *any call to the JVM*. If this assumption is wrong, so >> bad things happen. >> >> I ran experiments with tests for which stack overflow related crashes were reported. I ran them with a JVM where the StackShadowPages value was decreased by only 1 compared the usual default value. It was very easy to reproduce stack overflow crashes. By instrumenting the JVM, it appeared that some threads hit the Yellow Zone while having thread state _thread_in_vm. Which means that in many cases, the margin between the stack space provided by StackShadowPages and the real stack usage while executing VM code is less than one page. 
And because knowing the biggest stack requirement to execute any JVM code is an undecidable problem, > > Is it really an undecidable problem? Why is that exactly? How would you compute the max stack size for any call to the VM? Just the matrix of all VM options that could impact the stack usage is huge: several GC, several JIT compilers, JVMTI hooks, JFR. The work to be performed by the JVM can also be dependent on the application (class hierarchy, application code itself which can be optimized (and deoptimized) in many different ways according to compilation policies and application behavior). This problem is not specific to the JVM. Linux has a similar issue with its kernel stacks: they have a fixed size, but there's no way to ensure that the size is sufficient to execute any system call or perform any OS operation. >> Proposed fixes for this issue: >> - increase StackShadowPages for the Windows platform > > Why are we so stingy with the size of the default shadow pages on Windows? Even with your fix, which increases it by one, it?s only 7, compared to 20 on other platforms. > > Why can?t we have 20 pages of default shadow pages on Windows? Wouldn?t that significantly decrease the chance of hitting the yellow pages, if we can?t guarantee that all calls to VM fit? Historically, StackshadowPages was approximatively the same for all platforms. But one day, the JDK team has rewritten the native part of networking APIs for Unix platforms using stack allocated buffers instead of malloc'ed buffers. This change caused crashes due to stack overflows (either the native code hits the Yellow Zone, or it could even "jump" over the Yellow/Red Zone). So the StackShadowPages default value has been significantly increased on Unix platforms to provide stack overflow protection to the JDK networking code. The implementation of these APIs on Windows doesn't use stack allocated buffers, so the StackShadowPages default value has not been increased for this platform. 
Note that increasing the StackShadowPages has a cost: a cost in memory because more stack space is reserved for VM code, and a cost in CPU because StackShadowPages determines the number of pages to bang before executing a Java method. Fred > >> - add assertion is signal handler to detect thread hitting the Yellow Zone while executing JVM code (to detect undersized StackShadowPages during our testing) >> - ensure Yellow Pages are activated when transitioning from _thread_in_vm to _thread_in_java >> > From aph at redhat.com Sun Aug 28 20:03:48 2016 From: aph at redhat.com (Andrew Haley) Date: Sun, 28 Aug 2016 21:03:48 +0100 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> <57C04549.4030401@oracle.com> Message-ID: <5924483c-cca9-e7e3-e4bc-1c7228f115bb@redhat.com> On 26/08/16 19:05, Kim Barrett wrote: > > It seems my attempt at clarification has led to further confusion. > > Recent discussion in this thread has been focused on the right shift, > e.g. assuming it "does the right thing". > > The left shift is *broken*. Recent versions of gcc *will* do > something other than what is being expected. We've already seen > reports of (and fixed) problems encountered by folks using gcc6 for > exactly this sort of thing. See, for example, JDK-8157758. 
> > In the specific case at hand, the compiler can trivially prove that > undefined behavior is being invoked, because of the constant -1 being > passed to an inline function where the shift occurs. What it does from > there is anyone's guess; gcc6 seems to be treating such things as > unreachable code and optimizing accordingly. GCC supports shifting left negative integers: "4.5 Integers "As an extension to the C language, GCC does not use the latitude given in C99 and C11 only to treat certain aspects of signed ?< References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> <57C04549.4030401@oracle.com> Message-ID: <57C380FA.6040202@oracle.com> On 8/26/16 11:05 AM, Kim Barrett wrote: >> On Aug 26, 2016, at 9:34 AM, Ioi Lam wrote: >>>> Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. globalDefinitions.hpp has these: >>> The suggestion of using a wrapper was a "perhaps", and not really >>> intended for you to deal with while addressing the problem at hand. >>> Sorry I was confusing about that. >>> >>> If we do something along that line (as a separate project), I suggest >>> we keep with the names we've already got for similar operations, >>> e.g. use java_shift_{left,right} to be consistent with java_add and >>> friends. >> Hi Kim, >> >> Thanks for the clarification. >> >> My RBT tests passed, so I will check in the code as is in my last webrev using the >> and << operators. 
I'll leave the general problem of java_shift_left/right as a future improvement. >> >> Thanks >> - Ioi >> > It seems my attempt at clarification has led to further confusion. > > Recent discussion in this thread has been focused on the right shift, > e.g. assuming it "does the right thing". > > The left shift is *broken*. Recent versions of gcc *will* do > something other than what is being expected. We've already seen > reports of (and fixed) problems encountered by folks using gcc6 for > exactly this sort of thing. See, for example, JDK-8157758. > > In the specific case at hand, the compiler can trivially prove that > undefined behavior is being invoked, because of the constant -1 being > passed to an inline function where the shift occurs. What it does from > there is anyone's guess; gcc6 seems to be treating such things as > unreachable code and optimizing accordingly. > > So not fixing the left shift is just leaving a land mine for someone > else to step on. Please don't do that. Hi Kim, I've already pushed my changes, so I need to fix this in a separate bug ID. Will something like this work? inline jshort Atomic::add(jshort add_value, volatile jshort* dest) { - jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); + jint int_add_value = jint(juint(add_value) << 16); + jint new_value = Atomic::add(int_add_value, (volatile jint*)(dest-1)); Thanks - Ioi From david.holmes at oracle.com Mon Aug 29 01:14:19 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Aug 2016 11:14:19 +1000 Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate In-Reply-To: References: Message-ID: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com> On 27/08/2016 7:35 AM, Chris Plummer wrote: > Hi Dmitry, > > Although the fix is addressing the specific issue described in the bug, > what about the general issue of referencing gdata after a call to > cbEarlyVMDeath(). Do more references to gdata need to be protected? 
> > Also, is there the possibility of a multi-threading race condition here? > Could gdata be cleared by another thread after it is checked? Certainly. This really isn't fixing anything just adding a bailout check before the crashing code. We can still crash and not be any the wiser as to why. Not sure I really see the point of doing this instead of closing this as a dup of JDK-8134103 and fixing things properly. David > thanks, > > Chris > > On 8/26/16 4:00 AM, Dmitry Samersoff wrote: >> Everybody, >> >> Please review the fix. >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/ >> >> *Problem* >> >> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is received, >> jvmtiAllocate could be called after call to cbEarlyVMDeath. >> >> cbEarlyVMDeath set gdata->jvmti to NULL, so jvmtiAllocate crashes. >> >> The problem appears only once in nightly testing and I was not able to >> reproduce it locally. >> >> *Solution* >> >> Guard added to jvmtiAllocate to get meaningful error message instead of >> crash. >> >> These fix doesn't fix root cause - JVMTI_ERROR_WRONG_PHASE problem is >> going to be addressed under JDK-8134103. >> >> -Dmitry >> > From david.holmes at oracle.com Mon Aug 29 01:36:56 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Aug 2016 11:36:56 +1000 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: References: Message-ID: <64c09902-a570-1270-0e28-910cff34a139@oracle.com> Hi Fred, On 27/08/2016 6:00 AM, Frederic Parain wrote: > Hi, > > Please review this fix for bug JDK-8137035 > The bug is confidential but it is related to several VM crashes > that occurred on the Windows 64 bits platform in stack overflow > conditions. I've copied/pasted the analysis of the bug and the > description of the fix below. The analysis and solution all seem reasonable. 
Though I do have to wonder how the failure to reenable the yellow zone when returning to Java would not cause far more problem, on all platforms. > > Webrev: > http://cr.openjdk.java.net/~fparain/8137035/webrev.00/ src/os/windows/vm/os_windows.cpp While examining the thread state logic in the exception handler I noticed some pre-existing bugs: 2506 if (exception_code == EXCEPTION_ACCESS_VIOLATION) { 2507 JavaThread* thread = (JavaThread*) t; there is no check that t is in fact a JavaThread, or even that t is non-NULL. Such checks occur slightly later: 2523 if (t != NULL && t->is_Java_thread()) { 2524 JavaThread* thread = (JavaThread*) t; This bug seems significant: 2566 if (thread->stack_guards_enabled()) { 2567 if (_thread_in_Java) { _thread_in_Java is an enum value not a variable so we will always execute this block! This code should be testing the local in_java variable. Your changes seem fine in themselves. Thanks, David > Testing: JPRT (testset hotspot) and nsk.stress > > Thanks, > > Fred > > --------- > > All these crashes related to stack overflows on Windows have presumably > the same causes: > - an undersized StackShadowPages parameter > - the behavior of guard pages on Windows > - a flaw in Yellow Pages management > > These three factors combined together can lead to sporadic crashes of > the JVM when stack overflow conditions are encountered. > > All the crashes listed in this CR and in the related CR are almost > impossible to reproduce, which indicates that the issue only shows up in > some extreme or uncommon conditions. By design, the JVM crashes on stack > overflow only if the Red Zone (the last one in the execution stack) is > hit. Before the Red Zone, there's the Yellow Zone which is here to > detect and handle stack overflows in a nicer way (throwing a > StackOverflowError instead of crashing the process). 
If the Red zone is > hit, it means that the Yellow Zone was > disabled, and there's only two cases where the Yellow Zone is disabled: > > 1 - when a potential stack overflow is detected in Java code, in this > case the Yellow Zone is disabled during the generation of the > StackOverflowError and restored during the propagation of the > StackOverflowError > 2 - when a stack overflow occurs either in native code or in JVM code, > because there's anything else the JVM can do. > > In several crashes, the call stack doesn't show any special recursive > Java calls that could suggest the JVM is in case 1. But they show > relatively complex code paths inside JVM code (de-optimization or > class/symbol resolution), which suggests that case 2 occurred. > > The case of stack overflow in native code is straight forward: if the > Yellow Zone is hit, it is disabled, but when a JavaThread returns from > native code to Java code, the Yellow Zone is systematically re-enabled > (this is part of the native call wrapper > generated by the JVM). > > The case of stack overflow in JVM code is more problematic. The JVM > tries to avoid the case of stack overflow in VM code with the Shadow > Pages mechanism. Whenever a Java method is invoked, the JVM tries to > ensure that there's enough free stack space to execute the Java method > and *any call to the JVM code (or JDK native code) that could occur > during the execution of this method*. This check is performed by banging > (touching) n pages ahead on the execution stack, and n is set to > StackShadowPages. If the Yellow Zone is hit during the stack banging, a > StackOverflowError is thrown before the execution of the first bytecode > of the Java method. But this mechanism assumes that StackShadowPages > pages is big enough to cover *any call to the JVM*. If this assumption > is wrong, so > bad things happen. > > I ran experiments with tests for which stack overflow related crashes > were reported. 
I ran them with a JVM where the StackShadowPages value > was decreased by only 1 compared the usual default value. It was very > easy to reproduce stack overflow crashes. By instrumenting the JVM, it > appeared that some threads hit the Yellow Zone while having thread state > _thread_in_vm. Which means that in many cases, the margin between the > stack space provided by StackShadowPages and the real stack usage while > executing VM code is less than one page. And because knowing the biggest > stack requirement to execute any JVM code is an undecidable problem, > there's a high probability that some paths require more stack space than > StackShadowPages ensures. It is important to notice > that Windows is the platform with the smallest default value for > StackShadowPages. > > So, an undersized StackShadowPages could cause the Yellow Zone to be hit > while executing JVM code. On Unices (Solaris, Linux, MacOSX), the > sanction is immediate: a SIGSEGV signal is sent, but because there's no > more free space on the execution stack, the signal handler cannot be > executed and the JVM process is killed. It's a crash without hs_error > file generation. > > On Windows, the story is different. Yellow Pages are marked with the > "Guard" bit. When a page with a Guard bit set is touched, the current > thread receives an exception, but before the exception handler is > executed, the OS remove the Guard bit from the page, so the page that > trigger the fault can be used to execute the signal handler. So on > Windows, when the Yellow Zone is hit while executing JVM code, the JVM > doesn't die like on Unices systems, but the signal handler is executed. 
>
> The logic in the signal handler looks like this (simplified version):
>
>     if thread touches the yellow zone:
>         if thread_in_java:
>             disable yellow pages
>             jump to code throwing StackOverflowError
>             // note: yellow pages will be re-enabled
>             // while unwinding the stack
>         else:
>             // thread_in_vm or thread_in_native
>             disable yellow pages
>             resume execution
>     else:
>         // Fatal red zone violation.
>         disable red pages
>         generate VM crash
>
> So, the signal handler disable the protection of the Yellow Pages and
> resume JVM code execution.
>
> Eventually, the thread will return from the VM and will continue
> executing Java code. But at this point, the yellow pages are still
> disabled and there's no systematic check to ensure that Yellow Pages are
> re-enabled when returning to Java. The only places where the JVM checks
> if Yellow Pages need to be re-activated is when returning from native
> code or in the exception propagation code (but not all paths reactivate
> the Yellow Zone).
>
> Once the execution of Java code has resumed with the yellow zone
> disabled, the thread is not protected any more against stack overflows.
> The only remaining protection is the red zone, and if it is hit, the VM
> will generate a crash report and die. Note that having Yellow Zone
> de-activated makes the stack banging of StackShadowPages inefficient.
> Stack banging relies on the Yellow Pages to be activated, so touching
> them triggers a signal. If Yellow Pages are de-activated (unprotected)
> no signal is sent, unless the stack banging hits the Red Page, which
> triggers a VM crash with hs_error file generation.
>
>
> To summarize: an undersized StackShadowPages on Windows can lead to a
> JavaThread executing Java code with Yellow Pages disabled, which means
> without any stack overflow protection except the Red Zone which is the
> one triggering VM crashes with hs_error file generation.
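The handler logic quoted above can be modelled as a small decision
function. What follows is only an illustrative sketch, not HotSpot code:
ModelThread, ThreadState, Zone, Action, on_guard_fault and
transition_vm_to_java are hypothetical names, and the guard-page
protection is reduced to a boolean flag. It tries to show why a thread
that touches the yellow zone while in VM or native code resumes with the
yellow zone left disabled, and what re-enabling it on the VM-to-Java
transition (one of the fixes proposed in this thread) would change.

```cpp
#include <cassert>

// Hypothetical model of the simplified handler logic; not HotSpot code.
enum class ThreadState { InJava, InVM, InNative };
enum class Zone { Yellow, Red };
enum class Action { ThrowStackOverflowError, ResumeExecution, CrashVM };

struct ModelThread {
    ThreadState state = ThreadState::InJava;
    bool yellow_enabled = true;  // stands in for the yellow guard pages
    bool red_enabled = true;     // stands in for the red guard pages
};

// Mirrors the quoted pseudocode: decide what happens when a zone is touched.
Action on_guard_fault(ModelThread& t, Zone touched) {
    if (touched == Zone::Yellow) {
        t.yellow_enabled = false;  // "disable yellow pages" in both branches
        if (t.state == ThreadState::InJava) {
            // The yellow pages are expected to be re-enabled while the
            // StackOverflowError unwinds the stack (not modelled here).
            return Action::ThrowStackOverflowError;
        }
        // thread_in_vm or thread_in_native: just resume execution.
        return Action::ResumeExecution;
    }
    t.red_enabled = false;         // fatal red zone violation
    return Action::CrashVM;
}

// Models the proposed fix: make sure the yellow zone is active again
// whenever the thread goes from VM code back to Java code.
void transition_vm_to_java(ModelThread& t) {
    if (!t.yellow_enabled) {
        t.yellow_enabled = true;
    }
    t.state = ThreadState::InJava;
}
```

In this model, a fault taken in ThreadState::InVM returns
Action::ResumeExecution with yellow_enabled left false, which is the
unprotected window described in the analysis; calling
transition_vm_to_java() closes it.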
> > Note that the Yellow Pages can be "incidentally" re-activated by a call > to native code or by throwing an exception. Which could explain why > stack overflow crashes are not so frequent, the time window during which > Java code is executed without stack overflow protection might be small > for some applications. > > > Proposed fixes for this issue: > - increase StackShadowPages for the Windows platform > - add assertion is signal handler to detect thread hitting the Yellow > Zone while executing JVM code (to detect undersized StackShadowPages > during our testing) > - ensure Yellow Pages are activated when transitioning from > _thread_in_vm to _thread_in_java > From david.holmes at oracle.com Mon Aug 29 01:45:05 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Aug 2016 11:45:05 +1000 Subject: RFR: 8157948: UL allows same log file with multiple file= In-Reply-To: References: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> Message-ID: <48d4161a-637f-5933-472b-012c9a562238@oracle.com> Hi Marcus, On 26/08/2016 10:11 PM, Marcus Larsson wrote: > Hi David, > > Thanks for looking at this! > > New webrev: > http://cr.openjdk.java.net/~mlarsson/8157948/webrev.01/ > > Incremental: > http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00-01/ src/share/vm/logging/logConfiguration.cpp Why bother with implicit_output_prefix instead of using LogFileOutput::Prefix directly? You do use the latter in two places so I find the inconsistency strange. > See replies below. Follow up below ... > > On 08/26/2016 03:44 AM, David Holmes wrote: >> Hi Marcus, >> >> We really need a better way to specify and verify these mini-grammars >> for command-line options. :( > > Yeah, I'm all for something like that. > >> >> On 25/08/2016 7:31 PM, Marcus Larsson wrote: >>> Hi, >>> >>> Please review the following patch to fix the issue where you could have >>> the same file added twice as different log outputs in UL if it had the >>> "file=" prefix or if it was quoted. 
Log output names are now normalized >>> during log argument parsing to ensure they are always normalized when >>> finding existing or adding new outputs. >> >> So does this mean that whereas today >> >> -Xlog:gc=debug:foo >> >> assumes foo is the log file, with this fix you will get an error? > > No, the file= prefix will be assumed just like before. The parse step > will now explicitly add it in the case that it wasn't specified. So > every LogFileOutput instance created will have the prefix in its name. Ok. >> >>> Webrev: >>> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ >> >> src/share/vm/logging/logFileOutput.cpp >> >> Suggestion: >> >> const char* prefix = "file="; >> assert(strstr(name, prefix) == name, "invalid output name '%s': >> missing prefix: %s", name, prefix); >> _file_name = make_file_name(name + strlen(prefix), _pid_str, >> _vm_start_time_str); > > Fixed, see below. > >> >> --- >> >> src/share/vm/logging/logConfiguration.cpp >> >> Suggestion: >> >> static const char* prefix = "file="; > > I've refactored all "file=" literals into constants, but I made the > constant a field of LogFileOutput. I think it fits better there, let me > know if you think otherwise. Placement is fine. >> >> In normalize_output_name it is hard for me to work out what the >> possible "grammar" is, or how different cases will be handled. >> Currently -Xlog:gc=debug:"file"=foo is treated as >> -Xlog:gc=debug:file=foo. But with your changes I think the quoting >> will be handled differently. > > Actually -Xlog:gc=debug:"file"=foo should give an error, since quoting > the output types isn't supported (only the name can be quoted). This > should just be a refactoring to make sure we're always managing the > output names in a uniform manner (so that file="foo" and file=foo isn't > treated as two different log outputs). > > BTW, take care if you're testing this on the command line, as the shell > might be stripping away quotes in the arguments for you. 
Yes you are right it was stripping them away - it is an error.

Thanks,
David

>
> Thanks,
> Marcus
>
>>
>> Thanks,
>> David
>>
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8157948
>>>
>>> Testing:
>>> New unit test through JPRT
>>>
>>> Thanks,
>>> Marcus
>

From chris.plummer at oracle.com Mon Aug 29 06:43:30 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 28 Aug 2016 23:43:30 -0700
Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate
In-Reply-To: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com>
References: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com>
Message-ID:

On 8/28/16 6:14 PM, David Holmes wrote:
> On 27/08/2016 7:35 AM, Chris Plummer wrote:
>> Hi Dmitry,
>>
>> Although the fix is addressing the specific issue described in the bug,
>> what about the general issue of referencing gdata after a call to
>> cbEarlyVMDeath(). Do more references to gdata need to be protected?
>>
>> Also, is there the possibility of a multi-threading race condition here?
>> Could gdata be cleared by another thread after it is checked?
>
> Certainly. This really isn't fixing anything just adding a bailout
> check before the crashing code. We can still crash and not be any the
> wiser as to why.
>
> Not sure I really see the point of doing this instead of closing this
> as a dup of JDK-8134103 and fixing things properly.

Is it correct to say that Dmitry is fixing a bug exposed by JDK-8134103,
or that he is temporarily working around something that is not a true
bug, but is indirectly caused by JDK-8134103? I'm not sure, but the
answer will dictate the correct course of action here.

Chris

>
> David
>
>> thanks,
>>
>> Chris
>>
>> On 8/26/16 4:00 AM, Dmitry Samersoff wrote:
>>> Everybody,
>>>
>>> Please review the fix.
>>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/ >>> >>> *Problem* >>> >>> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is >>> received, >>> jvmtiAllocate could be called after call to cbEarlyVMDeath. >>> >>> cbEarlyVMDeath set gdata->jvmti to NULL, so jvmtiAllocate crashes. >>> >>> The problem appears only once in nightly testing and I was not able to >>> reproduce it locally. >>> >>> *Solution* >>> >>> Guard added to jvmtiAllocate to get meaningful error message instead of >>> crash. >>> >>> These fix doesn't fix root cause - JVMTI_ERROR_WRONG_PHASE problem is >>> going to be addressed under JDK-8134103. >>> >>> -Dmitry >>> >> From david.holmes at oracle.com Mon Aug 29 07:24:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Aug 2016 17:24:24 +1000 Subject: RFR (S): JDK-8164086: Checked JNI pending exception check should be cleared when returning to Java frame In-Reply-To: <70aacfa1-ffa8-a4a9-8092-4cc12949d027@oracle.com> References: <6b96634e-a9c0-47b2-c17b-f6b2057a5f14@oracle.com> <7b2fe986-fca9-dbd1-d20e-17934ac1bc08@oracle.com> <58f4a6da-b3d0-a1c7-9086-a34883e6b590@oracle.com> <70aacfa1-ffa8-a4a9-8092-4cc12949d027@oracle.com> Message-ID: <8082b3e4-1983-8a03-f2c8-b007c8ac91db@oracle.com> Hi David, I still do not understand why you think you need to make any changes in libjli ?? Certainly I do not think you should be printing anything about exceptions. Thanks, David H. On 26/08/2016 9:55 PM, David Simms wrote: > Hi David, > > Updated webrev: http://cr.openjdk.java.net/~dsimms/8164086/webrev2/ > > On 26/08/16 02:27, David Holmes wrote: >> Hi David, >> >> I'm missing some pieces of this puzzle I'm afraid. 
>> >> On 25/08/2016 8:05 PM, David Simms wrote: >>> >>> Updated the webrev here: >>> http://cr.openjdk.java.net/~dsimms/8164086/webrev1/ >> >> hotspot/src/share/vm/prims/whitebox.cpp >> >> First I'm not sure that Whitebox isn't a special case here that could >> be handled in the WB_START/END macros - see below. >> >> More generally you state below that the transition from native back to >> the VM doesn't have to do anything with the pending_exception_check >> flag because well behaved native code in that context will explicitly >> check for exceptions, and so the pending-exception-check will already >> be disabled before returning to Java. First, if that is the case then >> we should assert that it is so in the native->VM return transition. > > Agreed, inserted assert. > >> >> Second though, it doesn't seem to be the case in Whitebox because the >> CHECK_JNI_EXCEPTION_ macro simply calls HAS_PENDING_EXCEPTION and so >> won't touch the pending-exception-check flag. ?? > > Doh, you are correct...I mistook this for the CHECK_JNI_EXCEPTION macro > in "java.c" which does perform check... > >> >> It was a good pick up that some whitebox code was using values that >> might be NULL because an exception had occurred. There are a couple of >> changes that are unnecessary though: >> >> 1235 result = env->NewObjectArray(5, clazz, NULL); >> 1236 CHECK_JNI_EXCEPTION_(env, NULL); >> 1237 if (result == NULL) { >> 1238 return result; >> 1239 } >> >> (and similarly at 1322) >> >> result will be NULL iff there is a pending exception; and vice-versa. >> So the existing check for NULL suffices for correctness. If you want >> to check exceptions for the side-effect of clearing the >> pending-exception-check flag then lines 1237-1239 can be deleted. >> However I would suggest that if you truly do want to clear the >> pending-exception-check flag then the place to do it is the WB_END >> macro. 
That allows allows exception checks at the end of methods, eg: >> >> 1261 env->SetObjectArrayElement(result, 4, entry_point); >> 1262 CHECK_JNI_EXCEPTION_(env, NULL); >> 1263 >> 1264 return result; >> >> to be elided. >> > > Agreed, introduce StackObj with appropriate destructor, removed the > checks above. > > >> --- >> >> hotspot/src/share/vm/runtime/thread.hpp >> >> ! // which function name. Returning to a Java frame should >> implicitly clear the >> ! // need for, this is done for Native->Java transitions. >> >> Seems to be some text missing after "need for". > > Thanks for seeing that, fixed. > >> >> --- >> >> For the tests we no longer use bug numbers as part of the test names. >> Looks like some recent tests slipped by unfortunately. :( >> > > Moved to "test/runtime/jni/checked" > >> You should be able to get rid of the: >> >> * @modules java.base/jdk.internal.misc >> >> with Christian's just pushed changes to ProcessTools to isolate the >> Unsafe dependency. >> > > Done > >>> core-libs & Kumar: java launcher: are you okay with the >>> CHECK_EXCEPTION_PRINT macro, or would you rather it was silent (i.e. >>> CHECK_EXCEPTION_RETURN) ? >> >> I'm not seeing the point of this logic. Any exceptions that remain >> pending when the main thread detaches from the VM will be reported by >> the uncaught-exception handling logic. The checks you put in are in >> most cases immediately before a return so there is no need to check >> for a pending exception and do an earlier return. And in one case you >> would bypass tracing logic by doing an early return. > > Removed all the extra checks, add JNI exception check to within the > existing CHECK_NULL0 macro (make more sense there). > >> >> I had assumed this was just some debugging code you had left in by >> mistake. > > The method invocations needed to find main class needs to check for the > test to pass. 
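The "StackObj with appropriate destructor" idea mentioned above is the
usual scope-guard pattern: the flag is cleared in a destructor, so it is
cleared on every return path without a check before each return. The
sketch below is an assumption-laden illustration, not the actual webrev:
ModelThread, ClearPendingCheckOnExit and model_entry_point are
hypothetical names, and the pending-exception-check state is reduced to a
single bool.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-thread checked-JNI state; not HotSpot code.
struct ModelThread {
    bool pending_exception_check = false;
};

// Scope guard: clears the flag when the enclosing scope exits,
// whichever return statement is taken.
class ClearPendingCheckOnExit {
public:
    explicit ClearPendingCheckOnExit(ModelThread* t) : _thread(t) {}
    ~ClearPendingCheckOnExit() { _thread->pending_exception_check = false; }

    // Non-copyable, so the flag is cleared exactly once per scope.
    ClearPendingCheckOnExit(const ClearPendingCheckOnExit&) = delete;
    ClearPendingCheckOnExit& operator=(const ClearPendingCheckOnExit&) = delete;

private:
    ModelThread* _thread;
};

// Models an entry point that makes a JNI-style call (which sets the flag)
// and may return early when that call fails.
int model_entry_point(ModelThread* t, bool call_fails) {
    ClearPendingCheckOnExit guard(t);
    t->pending_exception_check = true;  // set by a call that may throw
    if (call_fails) {
        return -1;                      // early return: the guard still runs
    }
    return 0;
}
```

With a guard like this placed in an entry/exit macro pair, the explicit
end-of-method exception checks discussed above become unnecessary in the
model, because both the early and the normal return leave the flag
cleared.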
> > Cheers > /David From marcus.larsson at oracle.com Mon Aug 29 12:47:01 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Mon, 29 Aug 2016 14:47:01 +0200 Subject: RFR: 8157948: UL allows same log file with multiple file= In-Reply-To: <48d4161a-637f-5933-472b-012c9a562238@oracle.com> References: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> <48d4161a-637f-5933-472b-012c9a562238@oracle.com> Message-ID: Hi, On 08/29/2016 03:45 AM, David Holmes wrote: > Hi Marcus, > > On 26/08/2016 10:11 PM, Marcus Larsson wrote: >> Hi David, >> >> Thanks for looking at this! >> >> New webrev: >> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.01/ >> >> Incremental: >> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00-01/ > > src/share/vm/logging/logConfiguration.cpp > > Why bother with implicit_output_prefix instead of using > LogFileOutput::Prefix directly? You do use the latter in two places so > I find the inconsistency strange. The two instances of direct usage of LogFileOutput::Prefix are not related to the implicit prefix, which is why I don't use the constant there. I wanted the implicit_output_prefix constant to improve readability and make it easy to switch the default, should we ever want to. I'm fine with removing it if you think that's better. > >> See replies below. > > Follow up below ... > >> >> On 08/26/2016 03:44 AM, David Holmes wrote: >>> Hi Marcus, >>> >>> We really need a better way to specify and verify these mini-grammars >>> for command-line options. :( >> >> Yeah, I'm all for something like that. >> >>> >>> On 25/08/2016 7:31 PM, Marcus Larsson wrote: >>>> Hi, >>>> >>>> Please review the following patch to fix the issue where you could >>>> have >>>> the same file added twice as different log outputs in UL if it had the >>>> "file=" prefix or if it was quoted. Log output names are now >>>> normalized >>>> during log argument parsing to ensure they are always normalized when >>>> finding existing or adding new outputs. 
>>> >>> So does this mean that whereas today >>> >>> -Xlog:gc=debug:foo >>> >>> assumes foo is the log file, with this fix you will get an error? >> >> No, the file= prefix will be assumed just like before. The parse step >> will now explicitly add it in the case that it wasn't specified. So >> every LogFileOutput instance created will have the prefix in its name. > > Ok. > >>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ >>> >>> src/share/vm/logging/logFileOutput.cpp >>> >>> Suggestion: >>> >>> const char* prefix = "file="; >>> assert(strstr(name, prefix) == name, "invalid output name '%s': >>> missing prefix: %s", name, prefix); >>> _file_name = make_file_name(name + strlen(prefix), _pid_str, >>> _vm_start_time_str); >> >> Fixed, see below. >> >>> >>> --- >>> >>> src/share/vm/logging/logConfiguration.cpp >>> >>> Suggestion: >>> >>> static const char* prefix = "file="; >> >> I've refactored all "file=" literals into constants, but I made the >> constant a field of LogFileOutput. I think it fits better there, let me >> know if you think otherwise. > > Placement is fine. > >>> >>> In normalize_output_name it is hard for me to work out what the >>> possible "grammar" is, or how different cases will be handled. >>> Currently -Xlog:gc=debug:"file"=foo is treated as >>> -Xlog:gc=debug:file=foo. But with your changes I think the quoting >>> will be handled differently. >> >> Actually -Xlog:gc=debug:"file"=foo should give an error, since quoting >> the output types isn't supported (only the name can be quoted). This >> should just be a refactoring to make sure we're always managing the >> output names in a uniform manner (so that file="foo" and file=foo isn't >> treated as two different log outputs). >> >> BTW, take care if you're testing this on the command line, as the shell >> might be stripping away quotes in the arguments for you. > > Yes you are right it was stripping them away - it is an error. Great! 
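The normalization discussed in this thread can be sketched roughly as
follows. This is a minimal illustration under stated assumptions, not the
code in logConfiguration.cpp: normalize_output_name and kFilePrefix are
hypothetical names, std::string is used for brevity, only file outputs
are handled (other output types such as stdout are ignored), and the
exact error cases of the real parser may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of output-name normalization; not the HotSpot code.
static const std::string kFilePrefix = "file=";

// Writes the normalized name to 'out' and returns true, or returns false
// if the name is malformed (for example a quoted output type).
bool normalize_output_name(const std::string& in, std::string& out) {
    std::string name = in;

    // Remove an explicit "file=" prefix if present; it is re-added below,
    // so "foo" and "file=foo" end up with the same normalized name.
    if (name.compare(0, kFilePrefix.size(), kFilePrefix) == 0) {
        name = name.substr(kFilePrefix.size());
    }

    // Only the name part may be quoted: strip one pair of double quotes.
    if (!name.empty() && name.front() == '"') {
        if (name.size() < 2 || name.back() != '"') {
            return false;  // e.g. "file"=foo, where the type is quoted
        }
        name = name.substr(1, name.size() - 2);
    }

    if (name.empty()) {
        return false;
    }
    out = kFilePrefix + name;
    return true;
}
```

In this sketch foo, file=foo, file="foo" and "foo" all normalize to
file=foo, so a lookup by normalized name finds a single output, while
"file"=foo is rejected.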
Thanks,
Marcus

>
> Thanks,
> David
>
>>
>> Thanks,
>> Marcus
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8157948
>>>>
>>>> Testing:
>>>> New unit test through JPRT
>>>>
>>>> Thanks,
>>>> Marcus
>>

From dmitry.samersoff at oracle.com Mon Aug 29 13:12:12 2016
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Mon, 29 Aug 2016 16:12:12 +0300
Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate
In-Reply-To:
References: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com>
Message-ID: <3a4bb157-72da-61d6-5952-ed807a130344@oracle.com>

Chris & David,

The JVMTI_ERROR_WRONG_PHASE problem is complicated and requires
significant work, probably on both the JDWP and JVMTI sides. Serguei
plans to do it as part of JDK-8134103 and not for JDK 9.

So yes, we can close this one as a dup of JDK-8134103 - it has the same
root cause and should be addressed as part of JDK-8134103 (in
particular, we have to clean up the ignore_vm_death logic).

But the crash was observed only once in a nightly, so my intention is to
save us a bit of time the next time this situation happens.

i.e. before the changes we get a JVMTI_ERROR_WRONG_PHASE message and a
*crash*; after the changes we get a JVMTI_ERROR_WRONG_PHASE message and
an AGENT_ERROR_INTERNAL message.

-Dmitry

On 2016-08-29 09:43, Chris Plummer wrote:
> On 8/28/16 6:14 PM, David Holmes wrote:
>> On 27/08/2016 7:35 AM, Chris Plummer wrote:
>>> Hi Dmitry,
>>>
>>> Although the fix is addressing the specific issue described in the bug,
>>> what about the general issue of referencing gdata after a call to
>>> cbEarlyVMDeath(). Do more references to gdata need to be protected?
>>>
>>> Also, is there the possibility of a multi-threading race condition here?
>>> Could gdata be cleared by another thread after it is checked?
>>
>> Certainly. This really isn't fixing anything just adding a bailout
>> check before the crashing code. We can still crash and not be any the
>> wiser as to why.
>> >> Not sure I really see the point of doing this instead of closing this >> as a dup of JDK-8134103 and fixing things properly. > It it correct to say that Dmitry is fixing a bug exposed by JDK-8134103, > or that he is temporarily working around something that is not a true > bug, but is indirectly caused by JDK-8134103. I'm not sure, but the > answer will dictate the correct course of action here. > > Chris >> >> David >> >>> thanks, >>> >>> Chris >>> >>> On 8/26/16 4:00 AM, Dmitry Samersoff wrote: >>>> Everybody, >>>> >>>> Please review the fix. >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/ >>>> >>>> *Problem* >>>> >>>> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is >>>> received, >>>> jvmtiAllocate could be called after call to cbEarlyVMDeath. >>>> >>>> cbEarlyVMDeath set gdata->jvmti to NULL, so jvmtiAllocate crashes. >>>> >>>> The problem appears only once in nightly testing and I was not able to >>>> reproduce it locally. >>>> >>>> *Solution* >>>> >>>> Guard added to jvmtiAllocate to get meaningful error message instead of >>>> crash. >>>> >>>> These fix doesn't fix root cause - JVMTI_ERROR_WRONG_PHASE problem is >>>> going to be addressed under JDK-8134103. >>>> >>>> -Dmitry >>>> >>> > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From coleen.phillimore at oracle.com Mon Aug 29 13:25:40 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 29 Aug 2016 09:25:40 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: References: Message-ID: <6f801fc0-4070-af01-42a9-8015830eec14@oracle.com> Hi Fred, This is the clearest writeup of stack overflow handling that I've seen so far, and your fix seems good. 
I'm a bit worried about the assertion in http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/os/windows/vm/os_windows.cpp.udiff.html But I think it would be better to hit this in our testing rather than spurious stack overflow exceptions. http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/cpu/x86/vm/globals_x86.hpp.udiff.html Should windows 32 bit stack shadow pages be increased also? Have we seen these on 32 bits also? http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/share/vm/runtime/interfaceSupport.hpp.udiff.html Lastly, I really prefer this code here rather than buried in the stack overflow throwing logic in the interpreter and compiler. As we talked about, I think this code should be cleaned up. I don't know why it was done this way other than that's where it was historically and we were afraid to change it without the extensive analysis that you've done. Maybe someone from the compiler team remembers? (cc'ed) Thanks, Coleen On 8/26/16 4:00 PM, Frederic Parain wrote: > Hi, > > Please review this fix for bug JDK-8137035 > The bug is confidential but it is related to several VM crashes > that occurred on the Windows 64 bits platform in stack overflow > conditions. I've copied/pasted the analysis of the bug and the > description of the fix below. > > Webrev: > http://cr.openjdk.java.net/~fparain/8137035/webrev.00/ > > Testing: JPRT (testset hotspot) and nsk.stress > > Thanks, > > Fred > > --------- > > All these crashes related to stack overflows on Windows have > presumably the same causes: > - an undersized StackShadowPages parameter > - the behavior of guard pages on Windows > - a flaw in Yellow Pages management > > These three factors combined together can lead to sporadic crashes of > the JVM when stack overflow conditions are encountered. > > All the crashes listed in this CR and in the related CR are almost > impossible to reproduce, which indicates that the issue only shows up > in some extreme or uncommon conditions. 
By design, the JVM crashes on > stack overflow only if the Red Zone (the last one in the execution > stack) is hit. Before the Red Zone, there's the Yellow Zone which is > here to detect and handle stack overflows in a nicer way (throwing a > StackOverflowError instead of crashing the process). If the Red zone > is hit, it means that the Yellow Zone was > disabled, and there's only two cases where the Yellow Zone is disabled: > > 1 - when a potential stack overflow is detected in Java code, in > this case the Yellow Zone is disabled during the generation of the > StackOverflowError and restored during the propagation of the > StackOverflowError > 2 - when a stack overflow occurs either in native code or in JVM > code, because there's anything else the JVM can do. > > In several crashes, the call stack doesn't show any special recursive > Java calls that could suggest the JVM is in case 1. But they show > relatively complex code paths inside JVM code (de-optimization or > class/symbol resolution), which suggests that case 2 occurred. > > The case of stack overflow in native code is straight forward: if the > Yellow Zone is hit, it is disabled, but when a JavaThread returns from > native code to Java code, the Yellow Zone is systematically re-enabled > (this is part of the native call wrapper > generated by the JVM). > > The case of stack overflow in JVM code is more problematic. The JVM > tries to avoid the case of stack overflow in VM code with the Shadow > Pages mechanism. Whenever a Java method is invoked, the JVM tries to > ensure that there's enough free stack space to execute the Java method > and *any call to the JVM code (or JDK native code) that could occur > during the execution of this method*. This check is performed by > banging (touching) n pages ahead on the execution stack, and n is set > to StackShadowPages. 
If the Yellow Zone is hit during the stack > banging, a StackOverflowError is thrown before the execution of the > first bytecode of the Java method. But this mechanism assumes that > StackShadowPages pages is big enough to cover *any call to the JVM*. > If this assumption is wrong, so > bad things happen. > > I ran experiments with tests for which stack overflow related crashes > were reported. I ran them with a JVM where the StackShadowPages value > was decreased by only 1 compared the usual default value. It was very > easy to reproduce stack overflow crashes. By instrumenting the JVM, it > appeared that some threads hit the Yellow Zone while having thread > state _thread_in_vm. Which means that in many cases, the margin > between the stack space provided by StackShadowPages and the real > stack usage while executing VM code is less than one page. And because > knowing the biggest stack requirement to execute any JVM code is an > undecidable problem, there's a high probability that some paths > require more stack space than StackShadowPages ensures. It is > important to notice > that Windows is the platform with the smallest default value for > StackShadowPages. > > So, an undersized StackShadowPages could cause the Yellow Zone to be > hit while executing JVM code. On Unices (Solaris, Linux, MacOSX), the > sanction is immediate: a SIGSEGV signal is sent, but because there's > no more free space on the execution stack, the signal handler cannot > be executed and the JVM process is killed. It's a crash without > hs_error file generation. > > On Windows, the story is different. Yellow Pages are marked with the > "Guard" bit. When a page with a Guard bit set is touched, the current > thread receives an exception, but before the exception handler is > executed, the OS remove the Guard bit from the page, so the page that > trigger the fault can be used to execute the signal handler. 
So on > Windows, when the Yellow Zone is hit while executing JVM code, the JVM > doesn't die like on Unices systems, but the signal handler is executed. > > The logic in the signal handler looks like this (simplified version): > > if thread touches the yellow zone: > if thread_in_java: > disable yellow pages > jump to code throwing StackOverflowError > // note: yellow pages will be re-enabled > // while unwinding the stack > else: > // thread_in_vm or thread_in_native > disable yellow pages > resume execution > else: > // Fatal red zone violation. > disable red pages > generate VM crash > > So, the signal handler disable the protection of the Yellow Pages and > resume JVM code execution. > > Eventually, the thread will return from the VM and will continue > executing Java code. But at this point, the yellow pages are still > disabled and there's no systematic check to ensure that Yellow Pages > are re-enabled when returning to Java. The only places where the JVM > checks if Yellow Pages need to be re-activated is when returning from > native code or in the exception propagation code (but not all paths > reactivate the Yellow Zone). > > Once the execution of Java code has resumed with the yellow zone > disabled, the thread is not protected any more against stack > overflows. The only remaining protection is the red zone, and if it is > hit, the VM will generate a crash report and die. Note that having > Yellow Zone de-activated makes the stack banging of StackShadowPages > inefficient. Stack banging relies on the Yellow Pages to be activated, > so touching them triggers a signal. If Yellow Pages are de-activated > (unprotected) no signal is sent, unless the stack banging hits the Red > Page, which triggers a VM crash with hs_error file generation. 
> > > To summarize: an undersized StackShadowPages on Windows can lead to a > JavaThread executing Java code with Yellow Pages disabled, which means > without any stack overflow protection except the Red Zone which is the > one triggering VM crashes with hs_error file generation. > > Note that the Yellow Pages can be "incidentally" re-activated by a > call to native code or by throwing an exception. Which could explain > why stack overflow crashes are not so frequent, the time window during > which Java code is executed without stack overflow protection might be > small for some applications. > > > Proposed fixes for this issue: > - increase StackShadowPages for the Windows platform > - add assertion is signal handler to detect thread hitting the > Yellow Zone while executing JVM code (to detect undersized > StackShadowPages during our testing) > - ensure Yellow Pages are activated when transitioning from > _thread_in_vm to _thread_in_java > From frederic.parain at oracle.com Mon Aug 29 14:37:12 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Mon, 29 Aug 2016 10:37:12 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: <64c09902-a570-1270-0e28-910cff34a139@oracle.com> References: <64c09902-a570-1270-0e28-910cff34a139@oracle.com> Message-ID: <1137f725-3f0e-5298-4d7c-1c3bd573b832@oracle.com> Hi David, Thank you for the review. A few comments in-lined below. On 08/28/2016 09:36 PM, David Holmes wrote: > Hi Fred, > > On 27/08/2016 6:00 AM, Frederic Parain wrote: >> Hi, >> >> Please review this fix for bug JDK-8137035 >> The bug is confidential but it is related to several VM crashes >> that occurred on the Windows 64 bits platform in stack overflow >> conditions. I've copied/pasted the analysis of the bug and the >> description of the fix below. > > The analysis and solution all seem reasonable. 
Though I do have to
> wonder how the failure to reenable the yellow zone when returning to
> Java would not cause far more problems, on all platforms.

Running with Yellow Pages disabled clearly opens the door to random
crashes. Making the mechanism simpler and more robust would benefit
all platforms.

>>
>> Webrev:
>> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/
>
> src/os/windows/vm/os_windows.cpp
>
> While examining the thread state logic in the exception handler I
> noticed some pre-existing bugs:
>
> 2506 if (exception_code == EXCEPTION_ACCESS_VIOLATION) {
> 2507   JavaThread* thread = (JavaThread*) t;
>
> there is no check that t is in fact a JavaThread, or even that t is
> non-NULL. Such checks occur slightly later:

I've investigated this issue, and it is currently harmless. The cast
pointer is only used to call a method requiring a JavaThread* pointer,
and the only use that method makes of its argument is a NULL check.
Unfortunately, fixing this issue would require modifying the prototype
of os::is_memory_serialize_page() and propagating the change across all
platforms using it. It's a wider scope fix than JDK-8137035. I've added
a comment at the unsafe cast in os_windows.cpp, highlighting the fact
that it is unsafe and explaining why it is currently harmless.

>
> 2523 if (t != NULL && t->is_Java_thread()) {
> 2524   JavaThread* thread = (JavaThread*) t;
>
> This bug seems significant:
>
> 2566 if (thread->stack_guards_enabled()) {
> 2567   if (_thread_in_Java) {
>
> _thread_in_Java is an enum value not a variable so we will always
> execute this block! This code should be testing the local in_java variable.

Good catch! Fixed.

Updated webrev:
http://cr.openjdk.java.net/~fparain/8137035/webrev.01/index.html

Thank you,

Fred

> Your changes seem fine in themselves.
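The `if (_thread_in_Java)` problem is easy to reproduce in isolation. The sketch below uses a made-up enum; the names and numeric values are illustrative assumptions, not copied from HotSpot. It shows why testing a nonzero enumerator always takes the branch, and what testing a local flag looks like instead.

```cpp
#include <cassert>

// Hypothetical stand-in for a thread-state enum; values are illustrative.
enum ThreadState {
  thread_in_native = 4,
  thread_in_vm     = 6,
  thread_in_Java   = 8
};

// Buggy shape: the condition tests the enumerator itself. Any nonzero
// enumerator converts to true, so the actual state is never consulted.
inline bool takes_java_branch_buggy(ThreadState /*state*/) {
  if (thread_in_Java) {
    return true;
  }
  return false;
}

// Fixed shape: compare the thread's state first, then test the result.
inline bool takes_java_branch_fixed(ThreadState state) {
  const bool in_java = (state == thread_in_Java);
  if (in_java) {
    return true;
  }
  return false;
}
```

With these definitions, the buggy variant takes the Java branch even for thread_in_vm, while the fixed variant does so only for thread_in_Java.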
> > Thanks, > David > > >> Testing: JPRT (testset hotspot) and nsk.stress >> >> Thanks, >> >> Fred >> >> --------- >> >> All these crashes related to stack overflows on Windows have presumably >> the same causes: >> - an undersized StackShadowPages parameter >> - the behavior of guard pages on Windows >> - a flaw in Yellow Pages management >> >> These three factors combined together can lead to sporadic crashes of >> the JVM when stack overflow conditions are encountered. >> >> All the crashes listed in this CR and in the related CR are almost >> impossible to reproduce, which indicates that the issue only shows up in >> some extreme or uncommon conditions. By design, the JVM crashes on stack >> overflow only if the Red Zone (the last one in the execution stack) is >> hit. Before the Red Zone, there's the Yellow Zone which is here to >> detect and handle stack overflows in a nicer way (throwing a >> StackOverflowError instead of crashing the process). If the Red zone is >> hit, it means that the Yellow Zone was >> disabled, and there's only two cases where the Yellow Zone is disabled: >> >> 1 - when a potential stack overflow is detected in Java code, in this >> case the Yellow Zone is disabled during the generation of the >> StackOverflowError and restored during the propagation of the >> StackOverflowError >> 2 - when a stack overflow occurs either in native code or in JVM code, >> because there's anything else the JVM can do. >> >> In several crashes, the call stack doesn't show any special recursive >> Java calls that could suggest the JVM is in case 1. But they show >> relatively complex code paths inside JVM code (de-optimization or >> class/symbol resolution), which suggests that case 2 occurred. 
>> >> The case of stack overflow in native code is straight forward: if the >> Yellow Zone is hit, it is disabled, but when a JavaThread returns from >> native code to Java code, the Yellow Zone is systematically re-enabled >> (this is part of the native call wrapper >> generated by the JVM). >> >> The case of stack overflow in JVM code is more problematic. The JVM >> tries to avoid the case of stack overflow in VM code with the Shadow >> Pages mechanism. Whenever a Java method is invoked, the JVM tries to >> ensure that there's enough free stack space to execute the Java method >> and *any call to the JVM code (or JDK native code) that could occur >> during the execution of this method*. This check is performed by banging >> (touching) n pages ahead on the execution stack, and n is set to >> StackShadowPages. If the Yellow Zone is hit during the stack banging, a >> StackOverflowError is thrown before the execution of the first bytecode >> of the Java method. But this mechanism assumes that StackShadowPages >> pages is big enough to cover *any call to the JVM*. If this assumption >> is wrong, so >> bad things happen. >> >> I ran experiments with tests for which stack overflow related crashes >> were reported. I ran them with a JVM where the StackShadowPages value >> was decreased by only 1 compared the usual default value. It was very >> easy to reproduce stack overflow crashes. By instrumenting the JVM, it >> appeared that some threads hit the Yellow Zone while having thread state >> _thread_in_vm. Which means that in many cases, the margin between the >> stack space provided by StackShadowPages and the real stack usage while >> executing VM code is less than one page. And because knowing the biggest >> stack requirement to execute any JVM code is an undecidable problem, >> there's a high probability that some paths require more stack space than >> StackShadowPages ensures. 
It is important to notice >> that Windows is the platform with the smallest default value for >> StackShadowPages. >> >> So, an undersized StackShadowPages could cause the Yellow Zone to be hit >> while executing JVM code. On Unices (Solaris, Linux, MacOSX), the >> sanction is immediate: a SIGSEGV signal is sent, but because there's no >> more free space on the execution stack, the signal handler cannot be >> executed and the JVM process is killed. It's a crash without hs_error >> file generation. >> >> On Windows, the story is different. Yellow Pages are marked with the >> "Guard" bit. When a page with a Guard bit set is touched, the current >> thread receives an exception, but before the exception handler is >> executed, the OS remove the Guard bit from the page, so the page that >> trigger the fault can be used to execute the signal handler. So on >> Windows, when the Yellow Zone is hit while executing JVM code, the JVM >> doesn't die like on Unices systems, but the signal handler is executed. >> >> The logic in the signal handler looks like this (simplified version): >> >> if thread touches the yellow zone: >> if thread_in_java: >> disable yellow pages >> jump to code throwing StackOverflowError >> // note: yellow pages will be re-enabled >> // while unwinding the stack >> else: >> // thread_in_vm or thread_in_native >> disable yellow pages >> resume execution >> else: >> // Fatal red zone violation. >> disable red pages >> generate VM crash >> >> So, the signal handler disable the protection of the Yellow Pages and >> resume JVM code execution. >> >> Eventually, the thread will return from the VM and will continue >> executing Java code. But at this point, the yellow pages are still >> disabled and there's no systematic check to ensure that Yellow Pages are >> re-enabled when returning to Java. 
The only places where the JVM checks >> if Yellow Pages need to be re-activated is when returning from native >> code or in the exception propagation code (but not all paths reactivate >> the Yellow Zone). >> >> Once the execution of Java code has resumed with the yellow zone >> disabled, the thread is not protected any more against stack overflows. >> The only remaining protection is the red zone, and if it is hit, the VM >> will generate a crash report and die. Note that having Yellow Zone >> de-activated makes the stack banging of StackShadowPages inefficient. >> Stack banging relies on the Yellow Pages to be activated, so touching >> them triggers a signal. If Yellow Pages are de-activated (unprotected) >> no signal is sent, unless the stack banging hits the Red Page, which >> triggers a VM crash with hs_error file generation. >> >> >> To summarize: an undersized StackShadowPages on Windows can lead to a >> JavaThread executing Java code with Yellow Pages disabled, which means >> without any stack overflow protection except the Red Zone which is the >> one triggering VM crashes with hs_error file generation. >> >> Note that the Yellow Pages can be "incidentally" re-activated by a call >> to native code or by throwing an exception. Which could explain why >> stack overflow crashes are not so frequent, the time window during which >> Java code is executed without stack overflow protection might be small >> for some applications. 
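The unprotected window described above closes only if the guard state is checked on the way back to Java. The following toy model contrasts a transition that skips the check with one that performs it. ModelThread and its members are hypothetical names, a boolean stands in for the page protection, and a real implementation would presumably also have to verify that the stack pointer is back outside the guard zone before re-protecting the pages; that part is omitted here.

```cpp
#include <cassert>

// Hypothetical model; these names do not exist in HotSpot.
enum class ThreadState { InJava, InVM };

struct ModelThread {
  ThreadState state = ThreadState::InJava;
  bool yellow_enabled = true;

  void enter_vm() { state = ThreadState::InVM; }

  // Yellow zone hit while running VM code: protection is dropped and
  // execution simply resumes.
  void hit_yellow_zone_in_vm() { yellow_enabled = false; }

  // Transition back to Java without looking at the guard state:
  // Java code then runs with no yellow zone.
  void return_to_java_unchecked() { state = ThreadState::InJava; }

  // Transition back to Java that restores the guard state first.
  void return_to_java_checked() {
    if (!yellow_enabled) {
      yellow_enabled = true;  // assumption: re-protection always succeeds
    }
    state = ThreadState::InJava;
  }
};
```

In this model, a thread that calls enter_vm, hit_yellow_zone_in_vm and then return_to_java_unchecked ends up in Java with yellow_enabled still false; the checked variant leaves it true.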
>> >> >> Proposed fixes for this issue: >> - increase StackShadowPages for the Windows platform >> - add assertion is signal handler to detect thread hitting the Yellow >> Zone while executing JVM code (to detect undersized StackShadowPages >> during our testing) >> - ensure Yellow Pages are activated when transitioning from >> _thread_in_vm to _thread_in_java >> From frederic.parain at oracle.com Mon Aug 29 14:45:47 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Mon, 29 Aug 2016 10:45:47 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: <6f801fc0-4070-af01-42a9-8015830eec14@oracle.com> References: <6f801fc0-4070-af01-42a9-8015830eec14@oracle.com> Message-ID: <9cf99e6f-ebc5-d7f6-3f82-3cb985790b32@oracle.com> Hi Coleen, Thank you for your review, my answers are in-lined below. On 08/29/2016 09:25 AM, Coleen Phillimore wrote: > > Hi Fred, > > This is the clearest writeup of stack overflow handling that I've seen > so far, and your fix seems good. I'm a bit worried about the assertion in > > http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/os/windows/vm/os_windows.cpp.udiff.html > > > But I think it would be better to hit this in our testing rather than > spurious stack overflow exceptions. Yes, the goal is really to catch undersized StackShadowPages values during testing. But note that the behavior on product builds is unchanged. > > http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/cpu/x86/vm/globals_x86.hpp.udiff.html > > > Should windows 32 bit stack shadow pages be increased also? Have we > seen these on 32 bits also? There's a significant number of bug reports about stack overflow related crashes on Windows (referring to more or less directly to JDK-8137035). All the crashes I've been able to find so far were on Windows 64 bits. So I've conservatively increase StackShadowPages on WIN64 only. If it makes you more comfortable, I can also increase on WIN32. 
> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/share/vm/runtime/interfaceSupport.hpp.udiff.html > > > Lastly, I really prefer this code here rather than buried in the stack > overflow throwing logic in the interpreter and compiler. As we talked > about, I think this code should be cleaned up. I don't know why it was > done this way other than that's where it was historically and we were > afraid to change it without the extensive analysis that you've done. > Maybe someone from the compiler team remembers? (cc'ed) This fix aims to make the VM more robust for JDK9. The clean up of the stack overflow throwing logic is likely to be much more intrusive, so it seems preferable to delay it to JDK10. Thank you, Fred > On 8/26/16 4:00 PM, Frederic Parain wrote: >> Hi, >> >> Please review this fix for bug JDK-8137035 >> The bug is confidential but it is related to several VM crashes >> that occurred on the Windows 64 bits platform in stack overflow >> conditions. I've copied/pasted the analysis of the bug and the >> description of the fix below. >> >> Webrev: >> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/ >> >> Testing: JPRT (testset hotspot) and nsk.stress >> >> Thanks, >> >> Fred >> >> --------- >> >> All these crashes related to stack overflows on Windows have >> presumably the same causes: >> - an undersized StackShadowPages parameter >> - the behavior of guard pages on Windows >> - a flaw in Yellow Pages management >> >> These three factors combined together can lead to sporadic crashes of >> the JVM when stack overflow conditions are encountered. >> >> All the crashes listed in this CR and in the related CR are almost >> impossible to reproduce, which indicates that the issue only shows up >> in some extreme or uncommon conditions. By design, the JVM crashes on >> stack overflow only if the Red Zone (the last one in the execution >> stack) is hit. 
Before the Red Zone, there's the Yellow Zone which is >> here to detect and handle stack overflows in a nicer way (throwing a >> StackOverflowError instead of crashing the process). If the Red zone >> is hit, it means that the Yellow Zone was >> disabled, and there's only two cases where the Yellow Zone is disabled: >> >> 1 - when a potential stack overflow is detected in Java code, in >> this case the Yellow Zone is disabled during the generation of the >> StackOverflowError and restored during the propagation of the >> StackOverflowError >> 2 - when a stack overflow occurs either in native code or in JVM >> code, because there's anything else the JVM can do. >> >> In several crashes, the call stack doesn't show any special recursive >> Java calls that could suggest the JVM is in case 1. But they show >> relatively complex code paths inside JVM code (de-optimization or >> class/symbol resolution), which suggests that case 2 occurred. >> >> The case of stack overflow in native code is straight forward: if the >> Yellow Zone is hit, it is disabled, but when a JavaThread returns from >> native code to Java code, the Yellow Zone is systematically re-enabled >> (this is part of the native call wrapper >> generated by the JVM). >> >> The case of stack overflow in JVM code is more problematic. The JVM >> tries to avoid the case of stack overflow in VM code with the Shadow >> Pages mechanism. Whenever a Java method is invoked, the JVM tries to >> ensure that there's enough free stack space to execute the Java method >> and *any call to the JVM code (or JDK native code) that could occur >> during the execution of this method*. This check is performed by >> banging (touching) n pages ahead on the execution stack, and n is set >> to StackShadowPages. If the Yellow Zone is hit during the stack >> banging, a StackOverflowError is thrown before the execution of the >> first bytecode of the Java method. 
But this mechanism assumes that >> StackShadowPages pages is big enough to cover *any call to the JVM*. >> If this assumption is wrong, so >> bad things happen. >> >> I ran experiments with tests for which stack overflow related crashes >> were reported. I ran them with a JVM where the StackShadowPages value >> was decreased by only 1 compared the usual default value. It was very >> easy to reproduce stack overflow crashes. By instrumenting the JVM, it >> appeared that some threads hit the Yellow Zone while having thread >> state _thread_in_vm. Which means that in many cases, the margin >> between the stack space provided by StackShadowPages and the real >> stack usage while executing VM code is less than one page. And because >> knowing the biggest stack requirement to execute any JVM code is an >> undecidable problem, there's a high probability that some paths >> require more stack space than StackShadowPages ensures. It is >> important to notice >> that Windows is the platform with the smallest default value for >> StackShadowPages. >> >> So, an undersized StackShadowPages could cause the Yellow Zone to be >> hit while executing JVM code. On Unices (Solaris, Linux, MacOSX), the >> sanction is immediate: a SIGSEGV signal is sent, but because there's >> no more free space on the execution stack, the signal handler cannot >> be executed and the JVM process is killed. It's a crash without >> hs_error file generation. >> >> On Windows, the story is different. Yellow Pages are marked with the >> "Guard" bit. When a page with a Guard bit set is touched, the current >> thread receives an exception, but before the exception handler is >> executed, the OS remove the Guard bit from the page, so the page that >> trigger the fault can be used to execute the signal handler. So on >> Windows, when the Yellow Zone is hit while executing JVM code, the JVM >> doesn't die like on Unices systems, but the signal handler is executed. 
>> >> The logic in the signal handler looks like this (simplified version): >> >> if thread touches the yellow zone: >> if thread_in_java: >> disable yellow pages >> jump to code throwing StackOverflowError >> // note: yellow pages will be re-enabled >> // while unwinding the stack >> else: >> // thread_in_vm or thread_in_native >> disable yellow pages >> resume execution >> else: >> // Fatal red zone violation. >> disable red pages >> generate VM crash >> >> So, the signal handler disable the protection of the Yellow Pages and >> resume JVM code execution. >> >> Eventually, the thread will return from the VM and will continue >> executing Java code. But at this point, the yellow pages are still >> disabled and there's no systematic check to ensure that Yellow Pages >> are re-enabled when returning to Java. The only places where the JVM >> checks if Yellow Pages need to be re-activated is when returning from >> native code or in the exception propagation code (but not all paths >> reactivate the Yellow Zone). >> >> Once the execution of Java code has resumed with the yellow zone >> disabled, the thread is not protected any more against stack >> overflows. The only remaining protection is the red zone, and if it is >> hit, the VM will generate a crash report and die. Note that having >> Yellow Zone de-activated makes the stack banging of StackShadowPages >> inefficient. Stack banging relies on the Yellow Pages to be activated, >> so touching them triggers a signal. If Yellow Pages are de-activated >> (unprotected) no signal is sent, unless the stack banging hits the Red >> Page, which triggers a VM crash with hs_error file generation. >> >> >> To summarize: an undersized StackShadowPages on Windows can lead to a >> JavaThread executing Java code with Yellow Pages disabled, which means >> without any stack overflow protection except the Red Zone which is the >> one triggering VM crashes with hs_error file generation. 
>> >> Note that the Yellow Pages can be "incidentally" re-activated by a >> call to native code or by throwing an exception. Which could explain >> why stack overflow crashes are not so frequent, the time window during >> which Java code is executed without stack overflow protection might be >> small for some applications. >> >> >> Proposed fixes for this issue: >> - increase StackShadowPages for the Windows platform >> - add assertion is signal handler to detect thread hitting the >> Yellow Zone while executing JVM code (to detect undersized >> StackShadowPages during our testing) >> - ensure Yellow Pages are activated when transitioning from >> _thread_in_vm to _thread_in_java >> > From kirill.zhaldybin at oracle.com Mon Aug 29 15:02:56 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 29 Aug 2016 18:02:56 +0300 Subject: RFR(XS): 8164743: Convert TestAsUtf8 to GTest In-Reply-To: <85c5d756-e502-72b3-b247-9dd9a507e136@oracle.com> References: <85c5d756-e502-72b3-b247-9dd9a507e136@oracle.com> Message-ID: David, Thank you for review! Regards, Kirill On 26.08.2016 06:39, David Holmes wrote: > Looks fine. > > Thanks, > David > ----- > > On 25/08/2016 2:45 AM, Kirill Zhaldybin wrote: >> Dear all, >> >> Could you please review this fix for 8164743? >> The test was converted to GTest. >> >> WebRev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164743/webrev.00/ >> CR: https://bugs.openjdk.java.net/browse/JDK-8164743 >> >> Thank you. >> >> Regards, Kirill From kirill.zhaldybin at oracle.com Mon Aug 29 15:06:54 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 29 Aug 2016 18:06:54 +0300 Subject: RFR(S): 8164738: Convert AltHashing_test to GTest In-Reply-To: References: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> Message-ID: <6f652dc9-a022-fbf3-f92e-545fadc0dd77@oracle.com> David, Thank you for review! 
On 26.08.2016 06:38, David Holmes wrote:
> On 25/08/2016 1:47 AM, Kirill Zhaldybin wrote:
>> Dear all,
>>
>> Could you please review this fix for 8164738?
>
> Seems okay.
>
>> To convert the test I added a new friend class to the AltHashing class
>> so we could access the private member function
>> static juint murmur3_32(const int* data, int len).
>> There are also a few formatting fixes.
>
> Any reason all the murmur functions shouldn't be public? I'm not a fan
> of friends. No big deal either way.

Well, I am not the author, so I can only speculate: since

static juint murmur3_32(const int* data, int len);
static juint murmur3_32(juint seed, const int* data, int len);

are used only from the AltHashing class, they were made private
following the general "the less visible the better" rule.

Regards, Kirill

>
> Thanks,
> David
>
>>
>> WebRev:
>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164738/webrev.00/
>> CR: https://bugs.openjdk.java.net/browse/JDK-8164738
>>
>> Regards, Kirill

From gerard.ziemski at oracle.com Mon Aug 29 15:52:47 2016
From: gerard.ziemski at oracle.com (Gerard Ziemski)
Date: Mon, 29 Aug 2016 10:52:47 -0500
Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit
In-Reply-To: <5ea72a2f-87b5-c2b7-e05b-5ea5025ccadf@oracle.com>
References: <5ea72a2f-87b5-c2b7-e05b-5ea5025ccadf@oracle.com>
Message-ID:

hi Fred,

> On Aug 26, 2016, at 4:53 PM, Frederic Parain wrote:
>
>> Is it really an undecidable problem? Why is that exactly?
>
> How would you compute the max stack size for any call to the VM?
> Just the matrix of all VM options that could impact the stack usage
> is huge: several GC, several JIT compilers, JVMTI hooks, JFR.
> The work to be performed by the JVM can also be dependent on the
> application (class hierarchy, application code itself which can
> be optimized (and deoptimized) in many different ways according to
> compilation policies and application behavior).
>
> This problem is not specific to the JVM.
Linux has a similar issue
> with its kernel stacks: they have a fixed size, but there's no way
> to ensure that the size is sufficient to execute any system call or
> perform any OS operation.

Absolutely, however, in light of the issue, now that we determined we
need to increase the number of shadow pages, it seems to me that maybe
we could take this opportunity and try to evaluate (somehow) how many
we actually need under some hypothetical load condition with all the
common options turned on, as an alternative to conservatively
increasing them by 1. After all, like you said, when the networking
code was changed, we had to find a new default value somehow, so it has
been done before. I don't know when we set the pages size last, but if
it has been a while, then all the new features we probably added since
then (again, as you said, JFR, GC strategies, etc.) mean we should
probably re-evaluate this every now and then? It's just that increasing
the pages by 1 and hoping (admittedly backed up by testing) that it's
good enough seems to me not quite good enough? Should we at least have
a follow-up issue to address this?

cheers

From frederic.parain at oracle.com Mon Aug 29 16:20:46 2016
From: frederic.parain at oracle.com (Frederic Parain)
Date: Mon, 29 Aug 2016 12:20:46 -0400
Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit
In-Reply-To:
References: <5ea72a2f-87b5-c2b7-e05b-5ea5025ccadf@oracle.com>
Message-ID: <2df15ea5-675e-48f9-030b-d08512ac72cb@oracle.com>

On 08/29/2016 11:52 AM, Gerard Ziemski wrote:
> hi Fred,
>
>> On Aug 26, 2016, at 4:53 PM, Frederic Parain
>> wrote:
>>
>>> Is it really an undecidable problem? Why is that exactly?
>>
>> How would you compute the max stack size for any call to the VM?
>> Just the matrix of all VM options that could impact the stack
>> usage is huge: several GC, several JIT compilers, JVMTI hooks,
>> JFR.
The work to be performed by the JVM can also be dependent on
>> the application (class hierarchy, application code itself which
>> can be optimized (and deoptimized) in many different ways according
>> to compilation policies and application behavior).
>>
>> This problem is not specific to the JVM. Linux has a similar issue
>> with its kernel stacks: they have a fixed size, but there's no way
>> to ensure that the size is sufficient to execute any system call
>> or perform any OS operation.
>
> Absolutely, however, in light of the issue, now that we determined we
> need to increase the number of shadow pages, it seems to me that
> maybe we could take this opportunity and try to evaluate (somehow)
> how many we actually need under some hypothetical load condition
> with all the common options turned on, as an alternative to
> conservatively increasing them by 1. After all, like you said, when
> the networking code was changed, we had to find a new default value
> somehow, so it has been done before. I don't know when we set the
> pages size last, but if it has been a while, then all the new
> features we probably added since then (again, as you said, JFR, GC
> strategies, etc.) mean we should probably re-evaluate this every now
> and then? It's just that increasing the pages by 1 and hoping
> (admittedly backed up by testing) that it's good enough seems to me
> not quite good enough? Should we at least have a follow-up issue to
> address this?

This is what part of the fix does. Undersized stack shadow pages are
easily detected on Unices because they cause crashes as soon as JVM
code hits the yellow zone. The Windows platform was more sensitive to
this issue because 1) the stack shadow zone was smaller 2) stack
overflows could happen silently when executing JVM code. With the new
assert to detect stack overflows in JVM code on Windows, we will be
able to detect during our testing when the default shadow zone is too
small.

Trying to determine a set of tests and configurations to use to test
the deepest stack usage looks like a waste of time to me (any code
change could introduce a deeper stack usage). However, the new assert
will be checked on every test run with a debug build. I expect this to
provide the wide coverage we need to estimate the JVM code stack
requirements.

There's a side discussion about adding a mechanism to measure stack
consumption during every VM call, but so far, proposed designs are both
complex and brittle. Adding such code to the main baseline would be a
high risk compared to the issue it tries to solve.

The sizing of the different special zones of the execution stacks is
currently done with a trial and error method. I agree this is not an
ideal solution, especially with all the new features being continuously
added to the JVM, but we don't have a better solution to propose in the
short term.

Regards,

Fred

From gerard.ziemski at oracle.com Mon Aug 29 16:37:58 2016
From: gerard.ziemski at oracle.com (Gerard Ziemski)
Date: Mon, 29 Aug 2016 11:37:58 -0500
Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit
In-Reply-To: <2df15ea5-675e-48f9-030b-d08512ac72cb@oracle.com>
References: <5ea72a2f-87b5-c2b7-e05b-5ea5025ccadf@oracle.com> <2df15ea5-675e-48f9-030b-d08512ac72cb@oracle.com>
Message-ID:

> On Aug 29, 2016, at 11:20 AM, Frederic Parain wrote:
>
>
>
> On 08/29/2016 11:52 AM, Gerard Ziemski wrote:
>> hi Fred,
>>
>>> On Aug 26, 2016, at 4:53 PM, Frederic Parain
>>> wrote:
>>>
>>>> Is it really an undecidable problem? Why is that exactly?
>>>
>>> How would you compute the max stack size for any call to the VM?
>>> Just the matrix of all VM options that could impact the stack
>>> usage is huge: several GC, several JIT compilers, JVMTI hooks,
>>> JFR.
The work to be performed by the JVM can also be dependent on
>>> the application (class hierarchy, application code itself which
>>> can be optimized (and deoptimized) in many different ways according
>>> to compilation policies and application behavior).
>>>
>>> This problem is not specific to the JVM. Linux has a similar issue
>>> with its kernel stacks: they have a fixed size, but there's no way
>>> to ensure that the size is sufficient to execute any system call
>>> or perform any OS operation.
>>
>> Absolutely, however, in light of the issue, now that we determined we
>> need to increase the number of shadow pages, it seems to me that
>> maybe we could take this opportunity and try to evaluate (somehow)
>> how many we actually need under some hypothetical load condition
>> with all the common options turned on, as an alternative to
>> conservatively increasing them by 1. After all, like you said, when
>> the networking code was changed, we had to find a new default value
>> somehow, so it has been done before. I don't know when we set the
>> pages size last, but if it has been a while, then all the new
>> features we probably added since then (again, as you said, JFR, GC
>> strategies, etc.) mean we should probably re-evaluate this every now
>> and then? It's just that increasing the pages by 1 and hoping
>> (admittedly backed up by testing) that it's good enough seems to me
>> not quite good enough? Should we at least have a follow-up issue to
>> address this?
>
>
> This is what part of the fix does. Undersized stack shadow pages are
> easily detected on Unices because they cause crashes as soon as
> JVM code hits the yellow zone. The Windows platform was more
> sensitive to this issue because 1) the stack shadow zone was smaller
> 2) stack overflows could happen silently when executing JVM code.
> With the new assert to detect stack overflows in JVM code on > Windows, we will be able to detect during our testing when the > default shadow zone is too small. > > Trying to determine a set of tests and configurations to use > to test the deepest stack usage looks a waste of time to me > (any code change could introduce a deeper stack usage). Right, but it has to be tracked somehow, and like you say next, we are attempting to do this. > However, the new assert will be checked on every test ran > with a debug build. I expect this to provide the wide > coverage we need to estimate the JVM code stack requirements. Good. > > There's a side discussion about adding a mechanism to measure > stack consumption during every VM call, but so far, proposed > designs are both complex and brittle. Adding such code to > the main baseline would be a high risk compared to the > issue it tries to solve. > > The sizing of the different special zones of the execution > stacks is currently done with a trial and error method. > I agree this is not an ideal solution, especially with all > the new features being continuously added to the JVM, but > we haven't a better solution to propose on the short term. If there is an existing issue or a document tracking this, would you mind adding it to this discussion as a reference for any future discussions? Thank you for answering my questions. cheers From kirill.zhaldybin at oracle.com Mon Aug 29 16:54:04 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 29 Aug 2016 19:54:04 +0300 Subject: RFR(XS): 8164743: Convert TestAsUtf8 to GTest In-Reply-To: References: Message-ID: Rachel, Thank you for review! Regards, Kirill On 26.08.2016 17:44, Rachel Protacio wrote: > Looks good to me too. > > Rachel > > > On 8/24/2016 12:45 PM, Kirill Zhaldybin wrote: >> Dear all, >> >> Could you please review this fix for 8164743? >> The test was converted to GTest. 
>> >> WebRev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164743/webrev.00/ >> CR: https://bugs.openjdk.java.net/browse/JDK-8164743 >> >> Thank you. >> >> Regards, Kirill > From kim.barrett at oracle.com Mon Aug 29 17:53:58 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 29 Aug 2016 13:53:58 -0400 Subject: RFR (XS) 8161280 - assert failed: reference count underflow for symbol In-Reply-To: <57C380FA.6040202@oracle.com> References: <57BC1F10.2000104@oracle.com> <5487e45a-2123-9e83-9c36-fb46ad704fb3@oracle.com> <57BCF24B.3000407@oracle.com> <57BCF91D.8050900@oracle.com> <57BD056D.1020303@oracle.com> <682c54cc-ca7f-3560-2498-21ea646c0362@oracle.com> <57BD181D.30108@oracle.com> <8a7258d0-a834-ff08-3245-7a8703c528c4@oracle.com> <57BD4649.6000804@oracle.com> <4CF2AFA7-1705-4565-9430-6D91AC4E0235@oracle.com> <7B2BB977-9A59-4790-B060-3AF97D5F3927@oracle.com> <57BE3FF3.9000408@oracle.com> <3468A67D-34C8-44B6-8071-DD5C4BFDFD9A@oracle.com> <57BEABAB.1030503@oracle.com> <57C04549.4030401@oracle.com> <57C380FA.6040202@oracle.com> Message-ID: > On Aug 28, 2016, at 8:25 PM, Ioi Lam wrote: > > > > On 8/26/16 11:05 AM, Kim Barrett wrote: >>> On Aug 26, 2016, at 9:34 AM, Ioi Lam wrote: >>>>> Since I am doing something very specific (setting/extracting the top 16 bits of a jint), I am a bit hesitant to add routines for general shifting. globalDefinitions.hpp has these: >>>> The suggestion of using a wrapper was a "perhaps", and not really >>>> intended for you to deal with while addressing the problem at hand. >>>> Sorry I was confusing about that. >>>> >>>> If we do something along that line (as a separate project), I suggest >>>> we keep with the names we've already got for similar operations, >>>> e.g. use java_shift_{left,right} to be consistent with java_add and >>>> friends. >>> Hi Kim, >>> >>> Thanks for the clarification. >>> >>> My RBT tests passed, so I will check in the code as is in my last webrev using the >> and << operators. 
I'll leave the general problem of java_shift_left/right as a future improvement. >>> >>> Thanks >>> - Ioi >>> >> It seems my attempt at clarification has led to further confusion. >> >> Recent discussion in this thread has been focused on the right shift, >> e.g. assuming it "does the right thing". >> >> The left shift is *broken*. Recent versions of gcc *will* do >> something other than what is being expected. We've already seen >> reports of (and fixed) problems encountered by folks using gcc6 for >> exactly this sort of thing. See, for example, JDK-8157758. >> >> In the specific case at hand, the compiler can trivially prove that >> undefined behavior is being invoked, because of the constant -1 being >> passed to an inline function where the shift occurs. What it does from >> there is anyone's guess; gcc6 seems to be treating such things as >> unreachable code and optimizing accordingly. >> >> So not fixing the left shift is just leaving a land mine for someone >> else to step on. Please don't do that. > > Hi Kim, > > I've already pushed my changes, so I need to fix this in a separate bug ID. > Will something like this work? > > inline jshort Atomic::add(jshort add_value, volatile jshort* dest) { > - jint new_value = Atomic::add(add_value << 16, (volatile jint*)(dest-1)); > + jint int_add_value = jint(juint(add_value) << 16); No, that's just a different path to undefined behavior. The large *positive* value resulting from juint(add_value) << 16 exceeds the positive jint range. The way to twiddle the bit representation is the reinterpret_cast of a reference trick used in the java_xxx functions in globalDefinitions.hpp. But simpler in this case, since we know the value ranges, is to just multiply by 1 << 16 rather than left-shifting by 16. And as Andrew pointed out, it looks like this might not be as urgent as I thought, since it seems gcc (even recent versions) is only treating left shift of negative in a problematic way in constant expression contexts.
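The multiply-instead-of-shift suggestion can be sketched as follows. This is a minimal illustration, assuming a 16-bit jshort and a 32-bit jint; the typedefs and the helper name `to_high_half` are hypothetical stand-ins, not the code that was pushed or any later fix.

```cpp
#include <cstdint>

// Hypothetical stand-ins for HotSpot's jshort and jint typedefs.
typedef int16_t jshort_sketch;
typedef int32_t jint_sketch;

// Place a signed 16-bit value in the upper half of a signed 32-bit value.
// Multiplying by 65536 avoids left-shifting a negative operand. The product
// always fits: -32768 * 65536 == INT32_MIN and 32767 * 65536 < INT32_MAX.
inline jint_sketch to_high_half(jshort_sketch v) {
  return static_cast<jint_sketch>(v) * (static_cast<jint_sketch>(1) << 16);
}
```

The only shift left in the sketch is of the positive constant 1, which is well defined; the possibly negative operand is only ever multiplied.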
> + jint new_value = Atomic::add(int_add_value, (volatile jint*)(dest-1)); From serguei.spitsyn at oracle.com Mon Aug 29 18:46:05 2016 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 29 Aug 2016 11:46:05 -0700 Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate In-Reply-To: <3a4bb157-72da-61d6-5952-ed807a130344@oracle.com> References: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com> <3a4bb157-72da-61d6-5952-ed807a130344@oracle.com> Message-ID: <57C482ED.2050804@oracle.com> Chris and David, We had a private discussion about this bug with Dmitry last week. I initially suggested to close it as a dup of JDK-8134103 but then agreed with a fix replacing crash symptom with AGENT_ERROR_INTERNAL. I still have some doubt if it makes sense, as it does not look as important. Now, it seems you also prefer to close this bug as a dup. But let's check your opinion on the Dmitry's reasoning below. Thanks, Serguei On 8/29/16 06:12, Dmitry Samersoff wrote: > Chris & David, > > JVMTI_ERROR_WRONG_PHASE problem is complicated and requires significant > work probably on both JDWP and JVMTI side. Serguei plan to do it as a > part of JDK-8134103 and not for JDK 9. > > So yes, we can close this one as a dup of JDK-8134103 - it has the same > root cause and should be addressed as the part of JDK-8134103 > (particularly, we have to cleanup ignore_vm_death logic) > > But the crash is observed only once in a nightly, so my intention is to > save us a bit of time next time when this situation happens. > > i.e. before the changes we get JVMTI_ERROR_WRONG_PHASE message and > *crash*, after the changes we get JVMTI_ERROR_WRONG_PHASE message > and AGENT_ERROR_INTERNAL message. 
> > > -Dmitry > > > > On 2016-08-29 09:43, Chris Plummer wrote: >> On 8/28/16 6:14 PM, David Holmes wrote: >>> On 27/08/2016 7:35 AM, Chris Plummer wrote: >>>> Hi Dmitry, >>>> >>>> Although the fix is addressing the specific issue described in the bug, >>>> what about the general issue of referencing gdata after a call to >>>> cbEarlyVMDeath(). Do more references to gdata need to be protected? >>>> >>>> Also, is there the possibility of a multi-threading race condition here? >>>> Could gdata be cleared by another thread after it is checked? >>> Certainly. This really isn't fixing anything just adding a bailout >>> check before the crashing code. We can still crash and not be any the >>> wiser as to why. >>> >>> Not sure I really see the point of doing this instead of closing this >>> as a dup of JDK-8134103 and fixing things properly. >> Is it correct to say that Dmitry is fixing a bug exposed by JDK-8134103, >> or that he is temporarily working around something that is not a true >> bug, but is indirectly caused by JDK-8134103? I'm not sure, but the >> answer will dictate the correct course of action here. >> >> Chris >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/26/16 4:00 AM, Dmitry Samersoff wrote: >>>>> Everybody, >>>>> >>>>> Please review the fix. >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/ >>>>> >>>>> *Problem* >>>>> >>>>> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is >>>>> received, >>>>> jvmtiAllocate could be called after call to cbEarlyVMDeath. >>>>> >>>>> cbEarlyVMDeath sets gdata->jvmti to NULL, so jvmtiAllocate crashes. >>>>> >>>>> The problem appears only once in nightly testing and I was not able to >>>>> reproduce it locally. >>>>> >>>>> *Solution* >>>>> >>>>> Guard added to jvmtiAllocate to get meaningful error message instead of >>>>> crash. >>>>> >>>>> This fix doesn't fix the root cause - JVMTI_ERROR_WRONG_PHASE problem is >>>>> going to be addressed under JDK-8134103.
>>>>> >>>>> -Dmitry >>>>> > From daniel.daugherty at oracle.com Mon Aug 29 20:41:57 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 29 Aug 2016 14:41:57 -0600 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <9563ba26-ac59-a1ff-09ba-e48bc6b3690b@redhat.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> <31c7aab6-2082-db8d-6598-e137ec4dbaf0@oracle.com> <9563ba26-ac59-a1ff-09ba-e48bc6b3690b@redhat.com> Message-ID: <8b1a0035-5a31-9a3e-d23d-20ec0f762c20@oracle.com> Looks good to me also. Dan On 8/23/16 6:04 AM, Zhengyu Gu wrote: > Thanks. Look good to me. > > > -Zhengyu > > > On 08/23/2016 03:48 AM, David Holmes wrote: >> Hi Zhengyu, >> >> On 22/08/2016 11:13 PM, Zhengyu Gu wrote: >>> Hi David, >>> >>> The changes look good to me. >> >> Thanks for the review! >> >>> Just a minor comment: >>> >>> I saw you made InstanceKlass::_array_klasses pointer "volatile", but >>> not >>> some other places. I know that probably it has not effort, but >>> should we >>> make all these pointers "volatile" just for consistency? >> >> Yes. I missed _methods_jmethod_ids in instanceKLass.hpp, and _next >> in classLoader.hpp - now fixed. >> >> Webrev updating in place. >> >> Thanks, >> David >> >>> Thanks, >>> >>> -Zhengyu >>> >>> >>> >>> On 08/22/2016 12:16 AM, David Holmes wrote: >>>> I went to push this and realized I hadn't hg add'ed the new >>>> >>>> src/share/vm/oops/arrayKlass.inline.hpp >>>> >>>> which is also missing from the webrev (but now updated in place). >>>> >>>> Thanks, >>>> David >>>> >>>> On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: >>>>> On 8/17/16 8:50 PM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ >>>>> >>>>> src/share/vm/classfile/classLoader.hpp >>>>> No comments. 
>>>>> >>>>> src/share/vm/classfile/verifier.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/arrayKlass.hpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/instanceKlass.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/instanceKlass.hpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/instanceKlass.inline.hpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/objArrayKlass.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/oops/typeArrayKlass.cpp >>>>> No comments. >>>>> >>>>> src/share/vm/runtime/vmStructs.cpp >>>>> No comments. >>>>> >>>>> Thumbs up. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> >>>>>> Generally speaking release_store should be paired with >>>>>> load_acquire to >>>>>> ensure correct memory visibility and ordering in lock-free code >>>>>> (often >>>>>> the read path is what is lock-free). So based on some observations >>>>>> from earlier bug fixes this bug was intended to examine the use of >>>>>> release_store and see if we have the appropriate load_acquire as >>>>>> well. >>>>>> The bug report lists all of the cases that were examined - some >>>>>> clear >>>>>> cut correct, some complex correct, some fixed here and some split >>>>>> out >>>>>> into separate issues. >>>>>> >>>>>> Here's a summary of the actual changes in the webrev: >>>>>> >>>>>> src/share/vm/classfile/classLoader.hpp >>>>>> >>>>>> - next() accessor needs to use load_acquire. 
>>>>>> >>>>>> --- >>>>>> >>>>>> src/share/vm/classfile/verifier.cpp >>>>>> >>>>>> - load of _verify_byte_codes_fn needs to load_acquire to pair >>>>>> with use >>>>>> of release_store >>>>>> - release_store of _is_new_verify_byte_codes_fn is not needed >>>>>> >>>>>> --- >>>>>> >>>>>> src/share/vm/oops/arrayKlass.hpp >>>>>> src/share/vm/oops/instanceKlass.cpp >>>>>> src/share/vm/oops/instanceKlass.hpp >>>>>> src/share/vm/oops/instanceKlass.inline.hpp >>>>>> src/share/vm/oops/objArrayKlass.cpp >>>>>> src/share/vm/oops/typeArrayKlass.cpp >>>>>> >>>>>> The logic for storing dimensions values was using a storeStore >>>>>> barrier >>>>>> between the lower and higher dimensions. This is converted to use a >>>>>> release-store setter for higher-dimension, with paired load-acquire >>>>>> accessor. Plus the accessed fields are declared volatile. >>>>>> >>>>>> The methods_jmethod_ids_acquire() and its paired >>>>>> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >>>>>> where they belong. >>>>>> >>>>>> --- >>>>>> >>>>>> src/share/vm/runtime/vmStructs.cpp >>>>>> >>>>>> Updated declaration for _array_klasses now it is volatile. >>>>>> >>>>>> --- >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>> > From coleen.phillimore at oracle.com Mon Aug 29 20:44:30 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 29 Aug 2016 16:44:30 -0400 Subject: RFR(S): 8164738: Convert AltHashing_test to GTest In-Reply-To: <6f652dc9-a022-fbf3-f92e-545fadc0dd77@oracle.com> References: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> <6f652dc9-a022-fbf3-f92e-545fadc0dd77@oracle.com> Message-ID: <2c1122cd-f186-8afd-9d99-f25dc2216484@oracle.com> This test conversion looks good. On 8/29/16 11:06 AM, Kirill Zhaldybin wrote: > David, > > Thank you for review! > > > On 26.08.2016 06:38, David Holmes wrote: >> On 25/08/2016 1:47 AM, Kirill Zhaldybin wrote: >>> Dear all, >>> >>> Could you please review this fix for 8164738? >> >> Seems okay. 
>> >>> To convert the test I added new friend class to AltHashing class so we >>> could access private member function static juint murmur3_32(const int* >>> data, int len). There are also few formating fixes. >> >> Any reason all the murmur functions shouldn't be public? I'm not a >> fan of friends. No big deal either way. > Well, I am not an author so I could only speculate that if > static juint murmur3_32(const int* data, int len); > static juint murmur3_32(juint seed, const int* data, int len); > > are used only from AltHashing class according to general "the less > visible the better" rule they were made private. > Yes, that's why the functions are private. In general, the tests should probably be made friends if they're going to use private functions rather than making the functions public for the rest of the JVM to use. thanks, Coleen > Regards, Kirill >> >> Thanks, >> David >> >>> >>> WebRev: >>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164738/webrev.00/ >>> CR: https://bugs.openjdk.java.net/browse/JDK-8164738 >>> >>> Regards, Kirill > From coleen.phillimore at oracle.com Mon Aug 29 20:47:56 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 29 Aug 2016 16:47:56 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: <9cf99e6f-ebc5-d7f6-3f82-3cb985790b32@oracle.com> References: <6f801fc0-4070-af01-42a9-8015830eec14@oracle.com> <9cf99e6f-ebc5-d7f6-3f82-3cb985790b32@oracle.com> Message-ID: On 8/29/16 10:45 AM, Frederic Parain wrote: > Hi Coleen, > > Thank you for your review, my answers are in-lined below. > > On 08/29/2016 09:25 AM, Coleen Phillimore wrote: >> >> Hi Fred, >> >> This is the clearest writeup of stack overflow handling that I've seen >> so far, and your fix seems good. 
I'm a bit worried about the >> assertion in >> >> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/os/windows/vm/os_windows.cpp.udiff.html >> >> >> >> But I think it would be better to hit this in our testing rather than >> spurious stack overflow exceptions. > > Yes, the goal is really to catch undersized StackShadowPages values > during testing. But note that the behavior on product builds is > unchanged. > >> >> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/cpu/x86/vm/globals_x86.hpp.udiff.html >> >> >> >> Should windows 32 bit stack shadow pages be increased also? Have we >> seen these on 32 bits also? > > There's a significant number of bug reports about stack overflow related > crashes on Windows (referring to more or less directly to JDK-8137035). > All the crashes I've been able to find so far were on Windows 64 bits. > So I've conservatively increase StackShadowPages on WIN64 only. > If it makes you more comfortable, I can also increase on WIN32. I agree that you should leave 32 bits alone if there were no reported problems on windows 32 bit. The cost of an extra stack page might be noticeable there, especially if unneeded. > >> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/src/share/vm/runtime/interfaceSupport.hpp.udiff.html >> >> >> >> Lastly, I really prefer this code here rather than buried in the stack >> overflow throwing logic in the interpreter and compiler. As we talked >> about, I think this code should be cleaned up. I don't know why it was >> done this way other than that's where it was historically and we were >> afraid to change it without the extensive analysis that you've done. >> Maybe someone from the compiler team remembers? (cc'ed) > > This fix aims to make the VM more robust for JDK9. > The clean up of the stack overflow throwing logic is likely to be > much more intrusive, so it seems preferable to delay it to JDK10. > Yes. 
Can you file an RFE to clean this up to save us another reinvestigation of this code. thanks, Coleen > Thank you, > > Fred > >> On 8/26/16 4:00 PM, Frederic Parain wrote: >>> Hi, >>> >>> Please review this fix for bug JDK-8137035 >>> The bug is confidential but it is related to several VM crashes >>> that occurred on the Windows 64 bits platform in stack overflow >>> conditions. I've copied/pasted the analysis of the bug and the >>> description of the fix below. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/ >>> >>> Testing: JPRT (testset hotspot) and nsk.stress >>> >>> Thanks, >>> >>> Fred >>> >>> --------- >>> >>> All these crashes related to stack overflows on Windows have >>> presumably the same causes: >>> - an undersized StackShadowPages parameter >>> - the behavior of guard pages on Windows >>> - a flaw in Yellow Pages management >>> >>> These three factors combined together can lead to sporadic crashes of >>> the JVM when stack overflow conditions are encountered. >>> >>> All the crashes listed in this CR and in the related CR are almost >>> impossible to reproduce, which indicates that the issue only shows up >>> in some extreme or uncommon conditions. By design, the JVM crashes on >>> stack overflow only if the Red Zone (the last one in the execution >>> stack) is hit. Before the Red Zone, there's the Yellow Zone which is >>> here to detect and handle stack overflows in a nicer way (throwing a >>> StackOverflowError instead of crashing the process). 
If the Red zone >>> is hit, it means that the Yellow Zone was >>> disabled, and there's only two cases where the Yellow Zone is disabled: >>> >>> 1 - when a potential stack overflow is detected in Java code, in >>> this case the Yellow Zone is disabled during the generation of the >>> StackOverflowError and restored during the propagation of the >>> StackOverflowError >>> 2 - when a stack overflow occurs either in native code or in JVM >>> code, because there's anything else the JVM can do. >>> >>> In several crashes, the call stack doesn't show any special recursive >>> Java calls that could suggest the JVM is in case 1. But they show >>> relatively complex code paths inside JVM code (de-optimization or >>> class/symbol resolution), which suggests that case 2 occurred. >>> >>> The case of stack overflow in native code is straight forward: if the >>> Yellow Zone is hit, it is disabled, but when a JavaThread returns from >>> native code to Java code, the Yellow Zone is systematically re-enabled >>> (this is part of the native call wrapper >>> generated by the JVM). >>> >>> The case of stack overflow in JVM code is more problematic. The JVM >>> tries to avoid the case of stack overflow in VM code with the Shadow >>> Pages mechanism. Whenever a Java method is invoked, the JVM tries to >>> ensure that there's enough free stack space to execute the Java method >>> and *any call to the JVM code (or JDK native code) that could occur >>> during the execution of this method*. This check is performed by >>> banging (touching) n pages ahead on the execution stack, and n is set >>> to StackShadowPages. If the Yellow Zone is hit during the stack >>> banging, a StackOverflowError is thrown before the execution of the >>> first bytecode of the Java method. But this mechanism assumes that >>> StackShadowPages pages is big enough to cover *any call to the JVM*. >>> If this assumption is wrong, so >>> bad things happen. 
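The shadow-page check described above can be modelled in a few lines. This is a toy sketch, not HotSpot code: the stack is reduced to page indices, and `StackModel`, `bang_hits_yellow_zone` and every number used with them are hypothetical.

```cpp
#include <cstddef>

// Toy model: page 0 is the lowest address and the stack grows toward page 0.
struct StackModel {
  std::size_t red_pages;     // pages [0, red_pages) form the red zone
  std::size_t yellow_pages;  // the next yellow_pages pages form the yellow zone
  std::size_t shadow_pages;  // plays the role of StackShadowPages
};

// Returns true if touching shadow_pages pages below the page that holds the
// stack pointer reaches a guard page, i.e. the banging done on method entry
// would report a stack overflow before the first bytecode runs.
inline bool bang_hits_yellow_zone(const StackModel& s, std::size_t sp_page) {
  const std::size_t first_usable = s.red_pages + s.yellow_pages;
  for (std::size_t i = 1; i <= s.shadow_pages; ++i) {
    // Page (sp_page - i) is a guard page when it lies below first_usable;
    // the comparison is written without subtraction to avoid underflow.
    if (sp_page < first_usable + i) {
      return true;
    }
  }
  return false;
}
```

In this model a method entry passes the check whenever `shadow_pages` free pages remain. If a later call into the VM needs more pages than that, the guard pages are first touched while the thread is in VM code, which is the situation the analysis is concerned with.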
>>> >>> I ran experiments with tests for which stack overflow related crashes >>> were reported. I ran them with a JVM where the StackShadowPages value >>> was decreased by only 1 compared the usual default value. It was very >>> easy to reproduce stack overflow crashes. By instrumenting the JVM, it >>> appeared that some threads hit the Yellow Zone while having thread >>> state _thread_in_vm. Which means that in many cases, the margin >>> between the stack space provided by StackShadowPages and the real >>> stack usage while executing VM code is less than one page. And because >>> knowing the biggest stack requirement to execute any JVM code is an >>> undecidable problem, there's a high probability that some paths >>> require more stack space than StackShadowPages ensures. It is >>> important to notice >>> that Windows is the platform with the smallest default value for >>> StackShadowPages. >>> >>> So, an undersized StackShadowPages could cause the Yellow Zone to be >>> hit while executing JVM code. On Unices (Solaris, Linux, MacOSX), the >>> sanction is immediate: a SIGSEGV signal is sent, but because there's >>> no more free space on the execution stack, the signal handler cannot >>> be executed and the JVM process is killed. It's a crash without >>> hs_error file generation. >>> >>> On Windows, the story is different. Yellow Pages are marked with the >>> "Guard" bit. When a page with a Guard bit set is touched, the current >>> thread receives an exception, but before the exception handler is >>> executed, the OS remove the Guard bit from the page, so the page that >>> trigger the fault can be used to execute the signal handler. So on >>> Windows, when the Yellow Zone is hit while executing JVM code, the JVM >>> doesn't die like on Unices systems, but the signal handler is executed. 
>>> >>> The logic in the signal handler looks like this (simplified version): >>> >>> if thread touches the yellow zone: >>> if thread_in_java: >>> disable yellow pages >>> jump to code throwing StackOverflowError >>> // note: yellow pages will be re-enabled >>> // while unwinding the stack >>> else: >>> // thread_in_vm or thread_in_native >>> disable yellow pages >>> resume execution >>> else: >>> // Fatal red zone violation. >>> disable red pages >>> generate VM crash >>> >>> So, the signal handler disable the protection of the Yellow Pages and >>> resume JVM code execution. >>> >>> Eventually, the thread will return from the VM and will continue >>> executing Java code. But at this point, the yellow pages are still >>> disabled and there's no systematic check to ensure that Yellow Pages >>> are re-enabled when returning to Java. The only places where the JVM >>> checks if Yellow Pages need to be re-activated is when returning from >>> native code or in the exception propagation code (but not all paths >>> reactivate the Yellow Zone). >>> >>> Once the execution of Java code has resumed with the yellow zone >>> disabled, the thread is not protected any more against stack >>> overflows. The only remaining protection is the red zone, and if it is >>> hit, the VM will generate a crash report and die. Note that having >>> Yellow Zone de-activated makes the stack banging of StackShadowPages >>> inefficient. Stack banging relies on the Yellow Pages to be activated, >>> so touching them triggers a signal. If Yellow Pages are de-activated >>> (unprotected) no signal is sent, unless the stack banging hits the Red >>> Page, which triggers a VM crash with hs_error file generation. >>> >>> >>> To summarize: an undersized StackShadowPages on Windows can lead to a >>> JavaThread executing Java code with Yellow Pages disabled, which means >>> without any stack overflow protection except the Red Zone which is the >>> one triggering VM crashes with hs_error file generation. 
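The summary above can be illustrated with a small state model. It is a sketch under stated assumptions only: `ThreadModel`, its fields and both functions are hypothetical names, and nothing here is taken from the webrev.

```cpp
// Hypothetical thread states, loosely named after the ones in the analysis.
enum ThreadState { in_java, in_vm, in_native };

struct ThreadModel {
  ThreadState state;
  bool yellow_enabled;  // whether the yellow pages are currently protected
};

// Yellow-zone branch of the handler, following the simplified pseudocode:
// the zone is disabled in every case, and only a thread running Java code
// goes on to throw StackOverflowError (unwinding then re-enables the zone).
// Returns true if StackOverflowError would be thrown.
inline bool on_yellow_zone_hit(ThreadModel& t) {
  t.yellow_enabled = false;
  return t.state == in_java;
}

// Return from VM code to Java code. Without a check on this path the zone
// stays disabled; reenable_on_transition models adding such a check.
inline void return_to_java(ThreadModel& t, bool reenable_on_transition) {
  t.state = in_java;
  if (reenable_on_transition) {
    t.yellow_enabled = true;
  }
}
```

A thread that hits the yellow zone while in the VM and then returns with `reenable_on_transition` set to false ends up running Java code with `yellow_enabled` still false, which is the unprotected window described in the summary.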
>>> >>> Note that the Yellow Pages can be "incidentally" re-activated by a >>> call to native code or by throwing an exception. Which could explain >>> why stack overflow crashes are not so frequent, the time window during >>> which Java code is executed without stack overflow protection might be >>> small for some applications. >>> >>> >>> Proposed fixes for this issue: >>> - increase StackShadowPages for the Windows platform >>> - add assertion is signal handler to detect thread hitting the >>> Yellow Zone while executing JVM code (to detect undersized >>> StackShadowPages during our testing) >>> - ensure Yellow Pages are activated when transitioning from >>> _thread_in_vm to _thread_in_java >>> >> From daniel.daugherty at oracle.com Mon Aug 29 22:26:25 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 29 Aug 2016 16:26:25 -0600 Subject: (S) RFR: 8157904: Atomic::cmpxchg for jbyte is missing a fence on initial failure In-Reply-To: <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> References: <663c892a-adda-772c-4d89-6d4af4edb3ec@redhat.com> <4ea48968-2410-c3c6-324b-faafa61621dd@oracle.com> <8aa322f0-9f0f-0d63-6156-c4110fc9bbdc@oracle.com> <1981f29f-b96b-c00a-2aab-280937d275f4@oracle.com> Message-ID: <45c5d86b-7e44-7bda-7e99-3ac3105a57e4@oracle.com> On 8/23/16 11:21 PM, David Holmes wrote: > Hi Kim, > > Thanks for looking at this. > > Webrev updated in-place. Comments inline. Yes, I know this already pushed. Sorry for the late review. > http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ src/share/vm/runtime/atomic.hpp No comments. src/share/vm/utilities/globalDefinitions.hpp No comments. Thumbs up! Thanks for adding more comments to make it more clear what this code is doing. Dan > > On 24/08/2016 6:25 AM, Kim Barrett wrote: >>> On Aug 23, 2016, at 4:55 AM, David Holmes >>> wrote: >>> >>> Hi Volker, Andrew, >>> >>> On 23/08/2016 12:27 AM, Volker Simonis wrote: >>>> Hi, >>>> >>>> I don't particularly like the const_casts as well. 
>>> >>> I would have thought this was exactly the kind of thing const_cast >>> was good for - avoiding the need to define multiple overloads to >>> deal with volatile, non-volatile, const etc. >>> >>>> Why not change pointer_delta to accept pointers to volatiles as well: >>>> >>>> pointer_delta(const volatile void* left, const volatile void* right, >>> >>> I can do that. I also have to make a similar change to >>> align_ptr_down. Now should I also change align_ptr_up for >>> consistency (though I note they are already inconsistent in that one >>> takes void* and one takes const void*) ? >>> >>> Alternative webrev at: >>> >>> http://cr.openjdk.java.net/~dholmes/8157904/webrev.v2/ >> >> ------------------------------------------------------------------------------ >> >> src/share/vm/runtime/atomic.hpp >> 155 assert(sizeof(jbyte) == 1, "assumption"); >> >> STATIC_ASSERT would be better here. > > Changed. > >> ------------------------------------------------------------------------------ >> >> src/share/vm/utilities/globalDefinitions.hpp >> 524 inline void* align_ptr_down(volatile void* ptr, size_t alignment) { >> 525 return (void*)align_size_down((intptr_t)ptr, >> (intptr_t)alignment); >> 526 } >> >> I think implicitly (to the caller of align_ptr_down) casting away >> volatile like this is a mistake. I disagree with the rationale for >> this change; stripping off volatile (or const) *should* be annoyingly >> in your face with a const_cast. > > Yep my bad - volatile in, volatile out: > > inline volatile void* align_ptr_down(volatile void* ptr, size_t > alignment) { > return (volatile void*)align_size_down((intptr_t)ptr, > (intptr_t)alignment); > } > > This also leads to a change to the static_cast to be "volatile jint*". > >> The addition of volatile to pointer_delta is not the same sort of >> thing. I think that change is good, except I agree with Volker that >> only the one version is needed. > > Fixed. I hadn't appreciated what Volker was saying about one version. 
> >> ------------------------------------------------------------------------------ >> >> >> Otherwise looks good to me. >> >> Regarding: >> >> Now should I also change align_ptr_up for consistency (though I note >> they are already inconsistent in that one takes void* and one takes >> const void*) ? >> >> I think there should be two overloads of each of these, one with const >> qualified argument and result, and one without const qualification for >> either. That way the result has the same const-ness as the argument. >> We could double the number of overloads by similarly dealing with >> volatile, but I doubt there are enough relevant callers for that to be >> worthwhile; just use const_cast to deal with volatile at the call >> sites. But this is all a different issue... > > Agreed - separate issue if when it becomes an issue. > > Thanks, > David > >> Another option would be to make the argument and result >> const-qualified, and make callers deal with the result, but there are >> probably enough call sites to make the second overload worthwhile. >> >> > From david.holmes at oracle.com Mon Aug 29 23:44:00 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Aug 2016 09:44:00 +1000 Subject: RFR: 8158854: Ensure release_store is paired with load_acquire in lock-free code In-Reply-To: <8b1a0035-5a31-9a3e-d23d-20ec0f762c20@oracle.com> References: <18d42b2b-f724-304d-f61e-9967c1730da8@oracle.com> <36113982-11b5-2852-8f9d-f466119e2308@oracle.com> <31c7aab6-2082-db8d-6598-e137ec4dbaf0@oracle.com> <9563ba26-ac59-a1ff-09ba-e48bc6b3690b@redhat.com> <8b1a0035-5a31-9a3e-d23d-20ec0f762c20@oracle.com> Message-ID: <39880c8f-7338-6168-7103-ffcefce6ab65@oracle.com> Thanks Dan! David On 30/08/2016 6:41 AM, Daniel D. Daugherty wrote: > Looks good to me also. > > Dan > > > On 8/23/16 6:04 AM, Zhengyu Gu wrote: >> Thanks. Look good to me. 
>> >> >> -Zhengyu >> >> >> On 08/23/2016 03:48 AM, David Holmes wrote: >>> Hi Zhengyu, >>> >>> On 22/08/2016 11:13 PM, Zhengyu Gu wrote: >>>> Hi David, >>>> >>>> The changes look good to me. >>> >>> Thanks for the review! >>> >>>> Just a minor comment: >>>> >>>> I saw you made InstanceKlass::_array_klasses pointer "volatile", but >>>> not >>>> some other places. I know that probably it has not effort, but >>>> should we >>>> make all these pointers "volatile" just for consistency? >>> >>> Yes. I missed _methods_jmethod_ids in instanceKLass.hpp, and _next >>> in classLoader.hpp - now fixed. >>> >>> Webrev updating in place. >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> >>>> >>>> On 08/22/2016 12:16 AM, David Holmes wrote: >>>>> I went to push this and realized I hadn't hg add'ed the new >>>>> >>>>> src/share/vm/oops/arrayKlass.inline.hpp >>>>> >>>>> which is also missing from the webrev (but now updated in place). >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 19/08/2016 12:02 PM, Daniel D. Daugherty wrote: >>>>>> On 8/17/16 8:50 PM, David Holmes wrote: >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8158854 >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8158854/webrev/ >>>>>> >>>>>> src/share/vm/classfile/classLoader.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/classfile/verifier.cpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/arrayKlass.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/instanceKlass.cpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/instanceKlass.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/instanceKlass.inline.hpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/objArrayKlass.cpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/oops/typeArrayKlass.cpp >>>>>> No comments. >>>>>> >>>>>> src/share/vm/runtime/vmStructs.cpp >>>>>> No comments. >>>>>> >>>>>> Thumbs up. 
>>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Generally speaking release_store should be paired with >>>>>>> load_acquire to >>>>>>> ensure correct memory visibility and ordering in lock-free code >>>>>>> (often >>>>>>> the read path is what is lock-free). So based on some observations >>>>>>> from earlier bug fixes this bug was intended to examine the use of >>>>>>> release_store and see if we have the appropriate load_acquire as >>>>>>> well. >>>>>>> The bug report lists all of the cases that were examined - some >>>>>>> clear >>>>>>> cut correct, some complex correct, some fixed here and some split >>>>>>> out >>>>>>> into separate issues. >>>>>>> >>>>>>> Here's a summary of the actual changes in the webrev: >>>>>>> >>>>>>> src/share/vm/classfile/classLoader.hpp >>>>>>> >>>>>>> - next() accessor needs to use load_acquire. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/share/vm/classfile/verifier.cpp >>>>>>> >>>>>>> - load of _verify_byte_codes_fn needs to load_acquire to pair >>>>>>> with use >>>>>>> of release_store >>>>>>> - release_store of _is_new_verify_byte_codes_fn is not needed >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/share/vm/oops/arrayKlass.hpp >>>>>>> src/share/vm/oops/instanceKlass.cpp >>>>>>> src/share/vm/oops/instanceKlass.hpp >>>>>>> src/share/vm/oops/instanceKlass.inline.hpp >>>>>>> src/share/vm/oops/objArrayKlass.cpp >>>>>>> src/share/vm/oops/typeArrayKlass.cpp >>>>>>> >>>>>>> The logic for storing dimensions values was using a storeStore >>>>>>> barrier >>>>>>> between the lower and higher dimensions. This is converted to use a >>>>>>> release-store setter for higher-dimension, with paired load-acquire >>>>>>> accessor. Plus the accessed fields are declared volatile. >>>>>>> >>>>>>> The methods_jmethod_ids_acquire() and its paired >>>>>>> release_set_methods_jmethod_ids(), are moved to the .inline.hpp file >>>>>>> where they belong. 
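The release-store/load-acquire pairing discussed in this summary can be sketched with standard C++11 atomics. This is an analogy only: HotSpot uses its own release_store and load_acquire primitives rather than `std::atomic`, and `Node`, `g_head`, `publish` and `read_published` are hypothetical names.

```cpp
#include <atomic>

struct Node {
  int payload;
};

// Shared pointer published by a writer and read lock-free by readers.
static std::atomic<Node*> g_head(nullptr);

// Writer: initialize the object fully, then publish it with a release store
// so the payload write cannot be reordered after the pointer becomes visible.
inline void publish(Node* n, int value) {
  n->payload = value;
  g_head.store(n, std::memory_order_release);
}

// Reader: the acquire load pairs with the release store above, so a non-null
// result guarantees that the payload written before publication is visible.
inline Node* read_published() {
  return g_head.load(std::memory_order_acquire);
}
```

If the reader used a plain load instead, nothing would order its later read of `payload` after the pointer load on weakly ordered hardware, which is the kind of unpaired access the review was looking for.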
>>>>>>> >>>>>>> --- >>>>>>> >>>>>>> src/share/vm/runtime/vmStructs.cpp >>>>>>> >>>>>>> Updated declaration for _array_klasses now it is volatile. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>> >>>> >> > From david.holmes at oracle.com Tue Aug 30 03:05:06 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Aug 2016 13:05:06 +1000 Subject: RFR(S): 8163994: Nightly test crashed in jvmtiAllocate In-Reply-To: <57C482ED.2050804@oracle.com> References: <290c1d74-97ee-95d0-1b20-01d5aebdab48@oracle.com> <3a4bb157-72da-61d6-5952-ed807a130344@oracle.com> <57C482ED.2050804@oracle.com> Message-ID: <0a0a1dce-5ec0-05ca-4918-5ad0ae108938@oracle.com> Hi Serguei, On 30/08/2016 4:46 AM, serguei.spitsyn at oracle.com wrote: > Chris and David, > > We had a private discussion about this bug with Dmitry last week. > I initially suggested to close it as a dup of JDK-8134103 but then > agreed with a fix replacing crash symptom with AGENT_ERROR_INTERNAL. > I still have some doubt if it makes sense, as it does not look as > important. > > Now, it seems you also prefer to close this bug as a dup. > But let's check your opinion on the Dmitry's reasoning below. The problem is that the "fix" still doesn't guarantee that we will get the more informative AGENT_ERROR_INTERNAL. The whole situation is racy. But I won't block it and don't want to waste time arguing over it. So if Dmitry wants to proceed then you can count this as Reviewed. Thanks, David ----- > Thanks, > Serguei > > > On 8/29/16 06:12, Dmitry Samersoff wrote: >> Chris & David, >> >> JVMTI_ERROR_WRONG_PHASE problem is complicated and requires significant >> work probably on both JDWP and JVMTI side. Serguei plan to do it as a >> part of JDK-8134103 and not for JDK 9. 
>> >> So yes, we can close this one as a dup of JDK-8134103 - it has the same >> root cause and should be addressed as the part of JDK-8134103 >> (particularly, we have to cleanup ignore_vm_death logic) >> >> But the crash is observed only once in a nightly, so my intention is to >> save us a bit of time next time when this situation happens. >> >> i.e. before the changes we get JVMTI_ERROR_WRONG_PHASE message and >> *crash*, after the changes we get JVMTI_ERROR_WRONG_PHASE message >> and AGENT_ERROR_INTERNAL message. >> >> >> -Dmitry >> >> >> >> On 2016-08-29 09:43, Chris Plummer wrote: >>> On 8/28/16 6:14 PM, David Holmes wrote: >>>> On 27/08/2016 7:35 AM, Chris Plummer wrote: >>>>> Hi Dmitry, >>>>> >>>>> Although the fix is addressing the specific issue described in the >>>>> bug, >>>>> what about the general issue of referencing gdata after a call to >>>>> cbEarlyVMDeath(). Do more references to gdata need to be protected? >>>>> >>>>> Also, is there the possibility of a multi-threading race condition >>>>> here? >>>>> Could gdata be cleared by another thread after it is checked? >>>> Certainly. This really isn't fixing anything just adding a bailout >>>> check before the crashing code. We can still crash and not be any the >>>> wiser as to why. >>>> >>>> Not sure I really see the point of doing this instead of closing this >>>> as a dup of JDK-8134103 and fixing things properly. >>> It it correct to say that Dmitry is fixing a bug exposed by JDK-8134103, >>> or that he is temporarily working around something that is not a true >>> bug, but is indirectly caused by JDK-8134103. I'm not sure, but the >>> answer will dictate the correct course of action here. >>> >>> Chris >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 8/26/16 4:00 AM, Dmitry Samersoff wrote: >>>>>> Everybody, >>>>>> >>>>>> Please review the fix. 
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8163994/webrev.02/
>>>>>>
>>>>>> *Problem*
>>>>>>
>>>>>> Under some circumstances, when JVMTI_ERROR_WRONG_PHASE(112) is
>>>>>> received, jvmtiAllocate could be called after a call to
>>>>>> cbEarlyVMDeath.
>>>>>>
>>>>>> cbEarlyVMDeath sets gdata->jvmti to NULL, so jvmtiAllocate crashes.
>>>>>>
>>>>>> The problem appeared only once in nightly testing and I was not
>>>>>> able to reproduce it locally.
>>>>>>
>>>>>> *Solution*
>>>>>>
>>>>>> Guard added to jvmtiAllocate to get a meaningful error message
>>>>>> instead of a crash.
>>>>>>
>>>>>> This fix doesn't address the root cause - the
>>>>>> JVMTI_ERROR_WRONG_PHASE problem is going to be addressed under
>>>>>> JDK-8134103.
>>>>>>
>>>>>> -Dmitry
>>>>>>
>>
>
From david.holmes at oracle.com Tue Aug 30 03:10:01 2016
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 30 Aug 2016 13:10:01 +1000
Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit
In-Reply-To: <1137f725-3f0e-5298-4d7c-1c3bd573b832@oracle.com>
References: <64c09902-a570-1270-0e28-910cff34a139@oracle.com> <1137f725-3f0e-5298-4d7c-1c3bd573b832@oracle.com>
Message-ID:

Hi Fred,

On 30/08/2016 12:37 AM, Frederic Parain wrote:
> Hi David,
>
> Thank you for the review.
>
> A few comments in-lined below.
>
> On 08/28/2016 09:36 PM, David Holmes wrote:
>> Hi Fred,
>>
>> On 27/08/2016 6:00 AM, Frederic Parain wrote:
>>> Hi,
>>>
>>> Please review this fix for bug JDK-8137035
>>> The bug is confidential but it is related to several VM crashes
>>> that occurred on the 64-bit Windows platform in stack overflow
>>> conditions. I've copied/pasted the analysis of the bug and the
>>> description of the fix below.
>>
>> The analysis and solution all seem reasonable. Though I do have to
>> wonder how the failure to reenable the yellow zone when returning to
>> Java would not cause far more problems, on all platforms.
>
> Running with Yellow Pages disabled clearly opens the door to random
> crashes. Making the mechanism simpler and more robust would benefit
> all platforms.
>
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~fparain/8137035/webrev.00/
>>
>> src/os/windows/vm/os_windows.cpp
>>
>> While examining the thread state logic in the exception handler I
>> noticed some pre-existing bugs:
>>
>> 2506 if (exception_code == EXCEPTION_ACCESS_VIOLATION) {
>> 2507 JavaThread* thread = (JavaThread*) t;
>>
>> there is no check that t is in fact a JavaThread, or even that t is
>> non-NULL. Such checks occur slightly later:
>
> I've investigated this issue, and it is currently harmless.
> The cast pointer is only used to call a method requiring
> a JavaThread* pointer, and the only use of its argument is
> a NULL check. Unfortunately, fixing this issue would require
> modifying the prototype of os::is_memory_serialize_page()
> and propagating the change across all platforms using it.
> It's a wider scope fix than JDK-8137035.

I was only expecting you to move (if the scoping allows it, else copy)
the later:

if (t != NULL && t->is_Java_thread()) {

check. ;)

Thanks,
David

> I've added a comment on the unsafe cast in the os_windows.cpp file,
> highlighting the fact that it is unsafe, and explaining why it
> is currently harmless.
>
>>
>> 2523 if (t != NULL && t->is_Java_thread()) {
>> 2524 JavaThread* thread = (JavaThread*) t;
>>
>> This bug seems significant:
>>
>> 2566 if (thread->stack_guards_enabled()) {
>> 2567 if (_thread_in_Java) {
>>
>> _thread_in_Java is an enum value not a variable so we will always
>> execute this block! This code should be testing the local in_java
>> variable.
>
> Good catch! Fixed.
>
> Updated webrev:
> http://cr.openjdk.java.net/~fparain/8137035/webrev.01/index.html
>
> Thank you,
>
> Fred
>
>> Your changes seem fine in themselves.
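The _thread_in_Java problem noted above can be reproduced in isolation. The C++ sketch below is hypothetical: the enum, its numeric values and the function names are made up for illustration (the only property that matters is that the constant is non-zero), and it is not the os_windows.cpp code.

```cpp
#include <cassert>

// Hypothetical stand-in for a thread-state enum. The values are assumptions,
// chosen only so that the Java-state constant is non-zero.
enum ThreadStateSketch {
  sketch_thread_in_vm   = 6,
  sketch_thread_in_Java = 8
};

// Buggy shape: the condition tests the enum constant itself. A non-zero
// constant converts to true, so the branch is taken whatever the thread's
// actual state is.
inline bool takes_java_branch_buggy(ThreadStateSketch /*actual_state*/) {
  if (sketch_thread_in_Java) {
    return true;
  }
  return false;
}

// Intended shape: compute a local flag from the thread's actual state and
// test that flag instead.
inline bool takes_java_branch_fixed(ThreadStateSketch actual_state) {
  bool in_java = (actual_state == sketch_thread_in_Java);
  return in_java;
}
```

With these definitions the buggy variant reports the Java branch even for a thread in VM state, while the fixed variant only does so for a thread actually in Java state.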
>> >> Thanks, >> David >> >> >>> Testing: JPRT (testset hotspot) and nsk.stress >>> >>> Thanks, >>> >>> Fred >>> >>> --------- >>> >>> All these crashes related to stack overflows on Windows have presumably >>> the same causes: >>> - an undersized StackShadowPages parameter >>> - the behavior of guard pages on Windows >>> - a flaw in Yellow Pages management >>> >>> These three factors combined together can lead to sporadic crashes of >>> the JVM when stack overflow conditions are encountered. >>> >>> All the crashes listed in this CR and in the related CR are almost >>> impossible to reproduce, which indicates that the issue only shows up in >>> some extreme or uncommon conditions. By design, the JVM crashes on stack >>> overflow only if the Red Zone (the last one in the execution stack) is >>> hit. Before the Red Zone, there's the Yellow Zone which is here to >>> detect and handle stack overflows in a nicer way (throwing a >>> StackOverflowError instead of crashing the process). If the Red zone is >>> hit, it means that the Yellow Zone was >>> disabled, and there's only two cases where the Yellow Zone is disabled: >>> >>> 1 - when a potential stack overflow is detected in Java code, in this >>> case the Yellow Zone is disabled during the generation of the >>> StackOverflowError and restored during the propagation of the >>> StackOverflowError >>> 2 - when a stack overflow occurs either in native code or in JVM code, >>> because there's anything else the JVM can do. >>> >>> In several crashes, the call stack doesn't show any special recursive >>> Java calls that could suggest the JVM is in case 1. But they show >>> relatively complex code paths inside JVM code (de-optimization or >>> class/symbol resolution), which suggests that case 2 occurred. 
>>> >>> The case of stack overflow in native code is straight forward: if the >>> Yellow Zone is hit, it is disabled, but when a JavaThread returns from >>> native code to Java code, the Yellow Zone is systematically re-enabled >>> (this is part of the native call wrapper >>> generated by the JVM). >>> >>> The case of stack overflow in JVM code is more problematic. The JVM >>> tries to avoid the case of stack overflow in VM code with the Shadow >>> Pages mechanism. Whenever a Java method is invoked, the JVM tries to >>> ensure that there's enough free stack space to execute the Java method >>> and *any call to the JVM code (or JDK native code) that could occur >>> during the execution of this method*. This check is performed by banging >>> (touching) n pages ahead on the execution stack, and n is set to >>> StackShadowPages. If the Yellow Zone is hit during the stack banging, a >>> StackOverflowError is thrown before the execution of the first bytecode >>> of the Java method. But this mechanism assumes that StackShadowPages >>> pages is big enough to cover *any call to the JVM*. If this assumption >>> is wrong, so >>> bad things happen. >>> >>> I ran experiments with tests for which stack overflow related crashes >>> were reported. I ran them with a JVM where the StackShadowPages value >>> was decreased by only 1 compared the usual default value. It was very >>> easy to reproduce stack overflow crashes. By instrumenting the JVM, it >>> appeared that some threads hit the Yellow Zone while having thread state >>> _thread_in_vm. Which means that in many cases, the margin between the >>> stack space provided by StackShadowPages and the real stack usage while >>> executing VM code is less than one page. And because knowing the biggest >>> stack requirement to execute any JVM code is an undecidable problem, >>> there's a high probability that some paths require more stack space than >>> StackShadowPages ensures. 
>>> It is important to note that Windows is the platform with the
>>> smallest default value for StackShadowPages.
>>>
>>> So, an undersized StackShadowPages could cause the Yellow Zone to be hit
>>> while executing JVM code. On Unices (Solaris, Linux, MacOSX), the
>>> sanction is immediate: a SIGSEGV signal is sent, but because there's no
>>> more free space on the execution stack, the signal handler cannot be
>>> executed and the JVM process is killed. It's a crash without hs_error
>>> file generation.
>>>
>>> On Windows, the story is different. Yellow Pages are marked with the
>>> "Guard" bit. When a page with the Guard bit set is touched, the current
>>> thread receives an exception, but before the exception handler is
>>> executed, the OS removes the Guard bit from the page, so the page that
>>> triggered the fault can be used to execute the signal handler. So on
>>> Windows, when the Yellow Zone is hit while executing JVM code, the JVM
>>> doesn't die as it does on Unix systems; instead the signal handler is
>>> executed.
>>>
>>> The logic in the signal handler looks like this (simplified version):
>>>
>>> if thread touches the yellow zone:
>>>     if thread_in_java:
>>>         disable yellow pages
>>>         jump to code throwing StackOverflowError
>>>         // note: yellow pages will be re-enabled
>>>         // while unwinding the stack
>>>     else:
>>>         // thread_in_vm or thread_in_native
>>>         disable yellow pages
>>>         resume execution
>>> else:
>>>     // Fatal red zone violation.
>>>     disable red pages
>>>     generate VM crash
>>>
>>> So, the signal handler disables the protection of the Yellow Pages and
>>> resumes JVM code execution.
>>>
>>> Eventually, the thread will return from the VM and will continue
>>> executing Java code. But at this point, the yellow pages are still
>>> disabled and there's no systematic check to ensure that Yellow Pages are
>>> re-enabled when returning to Java.
The only places where the JVM checks >>> if Yellow Pages need to be re-activated is when returning from native >>> code or in the exception propagation code (but not all paths reactivate >>> the Yellow Zone). >>> >>> Once the execution of Java code has resumed with the yellow zone >>> disabled, the thread is not protected any more against stack overflows. >>> The only remaining protection is the red zone, and if it is hit, the VM >>> will generate a crash report and die. Note that having Yellow Zone >>> de-activated makes the stack banging of StackShadowPages inefficient. >>> Stack banging relies on the Yellow Pages to be activated, so touching >>> them triggers a signal. If Yellow Pages are de-activated (unprotected) >>> no signal is sent, unless the stack banging hits the Red Page, which >>> triggers a VM crash with hs_error file generation. >>> >>> >>> To summarize: an undersized StackShadowPages on Windows can lead to a >>> JavaThread executing Java code with Yellow Pages disabled, which means >>> without any stack overflow protection except the Red Zone which is the >>> one triggering VM crashes with hs_error file generation. >>> >>> Note that the Yellow Pages can be "incidentally" re-activated by a call >>> to native code or by throwing an exception. Which could explain why >>> stack overflow crashes are not so frequent, the time window during which >>> Java code is executed without stack overflow protection might be small >>> for some applications. 
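To make the failure mode in the analysis above easier to follow, here is a small C++ model of the simplified handler logic. Everything in it is hypothetical (the Zone, ThreadState, Action and GuardState types and both functions are invented for this sketch); it models only the decisions described in the analysis and is not the actual os_windows.cpp handler.

```cpp
#include <cassert>

// Hypothetical model types; none of these names exist in HotSpot.
enum class Zone { Yellow, Red };
enum class ThreadState { InJava, InVM, InNative };
enum class Action { ThrowStackOverflowError, ResumeExecution, CrashVM };

struct GuardState {
  bool yellow_enabled = true;
  bool red_enabled = true;
};

// Models the simplified handler: which action is taken and which guard
// zone ends up disabled afterwards.
inline Action on_guard_page_hit(Zone zone, ThreadState state, GuardState& guards) {
  if (zone == Zone::Yellow) {
    guards.yellow_enabled = false;  // disabled on both yellow-zone paths
    if (state == ThreadState::InJava) {
      // The zone is re-enabled later, while the error unwinds the stack.
      return Action::ThrowStackOverflowError;
    }
    // In VM or native code: execution just resumes, zone still disabled.
    return Action::ResumeExecution;
  }
  // Fatal red zone violation.
  guards.red_enabled = false;
  return Action::CrashVM;
}

// Models the check the analysis says is missing: re-arm the yellow zone
// when a thread goes back from VM code to Java code.
inline void on_transition_vm_to_java(GuardState& guards) {
  if (!guards.yellow_enabled) {
    guards.yellow_enabled = true;
  }
}
```

In this model, a yellow-zone hit while in VM code leaves yellow_enabled false until something like on_transition_vm_to_java() runs, which is the unprotected window the analysis describes.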
>>> >>> >>> Proposed fixes for this issue: >>> - increase StackShadowPages for the Windows platform >>> - add assertion is signal handler to detect thread hitting the Yellow >>> Zone while executing JVM code (to detect undersized StackShadowPages >>> during our testing) >>> - ensure Yellow Pages are activated when transitioning from >>> _thread_in_vm to _thread_in_java >>> From david.holmes at oracle.com Tue Aug 30 03:50:43 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Aug 2016 13:50:43 +1000 Subject: RFR: 8157948: UL allows same log file with multiple file= In-Reply-To: References: <98d3f2fe-5937-07b7-e591-c1c7a0ea7a2f@oracle.com> <48d4161a-637f-5933-472b-012c9a562238@oracle.com> Message-ID: On 29/08/2016 10:47 PM, Marcus Larsson wrote: > Hi, > > > On 08/29/2016 03:45 AM, David Holmes wrote: >> Hi Marcus, >> >> On 26/08/2016 10:11 PM, Marcus Larsson wrote: >>> Hi David, >>> >>> Thanks for looking at this! >>> >>> New webrev: >>> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.01/ >>> >>> Incremental: >>> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00-01/ >> >> src/share/vm/logging/logConfiguration.cpp >> >> Why bother with implicit_output_prefix instead of using >> LogFileOutput::Prefix directly? You do use the latter in two places so >> I find the inconsistency strange. > > The two instances of direct usage of LogFileOutput::Prefix are not > related to the implicit prefix, which is why I don't use the constant > there. I wanted the implicit_output_prefix constant to improve > readability and make it easy to switch the default, should we ever want > to. I'm fine with removing it if you think that's better. It seems unlikely we would ever change this, but I'm fine either way. Thanks, David > >> >>> See replies below. >> >> Follow up below ... >> >>> >>> On 08/26/2016 03:44 AM, David Holmes wrote: >>>> Hi Marcus, >>>> >>>> We really need a better way to specify and verify these mini-grammars >>>> for command-line options. 
:( >>> >>> Yeah, I'm all for something like that. >>> >>>> >>>> On 25/08/2016 7:31 PM, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> Please review the following patch to fix the issue where you could >>>>> have >>>>> the same file added twice as different log outputs in UL if it had the >>>>> "file=" prefix or if it was quoted. Log output names are now >>>>> normalized >>>>> during log argument parsing to ensure they are always normalized when >>>>> finding existing or adding new outputs. >>>> >>>> So does this mean that whereas today >>>> >>>> -Xlog:gc=debug:foo >>>> >>>> assumes foo is the log file, with this fix you will get an error? >>> >>> No, the file= prefix will be assumed just like before. The parse step >>> will now explicitly add it in the case that it wasn't specified. So >>> every LogFileOutput instance created will have the prefix in its name. >> >> Ok. >> >>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mlarsson/8157948/webrev.00/ >>>> >>>> src/share/vm/logging/logFileOutput.cpp >>>> >>>> Suggestion: >>>> >>>> const char* prefix = "file="; >>>> assert(strstr(name, prefix) == name, "invalid output name '%s': >>>> missing prefix: %s", name, prefix); >>>> _file_name = make_file_name(name + strlen(prefix), _pid_str, >>>> _vm_start_time_str); >>> >>> Fixed, see below. >>> >>>> >>>> --- >>>> >>>> src/share/vm/logging/logConfiguration.cpp >>>> >>>> Suggestion: >>>> >>>> static const char* prefix = "file="; >>> >>> I've refactored all "file=" literals into constants, but I made the >>> constant a field of LogFileOutput. I think it fits better there, let me >>> know if you think otherwise. >> >> Placement is fine. >> >>>> >>>> In normalize_output_name it is hard for me to work out what the >>>> possible "grammar" is, or how different cases will be handled. >>>> Currently -Xlog:gc=debug:"file"=foo is treated as >>>> -Xlog:gc=debug:file=foo. But with your changes I think the quoting >>>> will be handled differently. 
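The normalization idea described in the review request (an implicit "file=" prefix, and quoted and unquoted names mapping to the same output) could look roughly like the C++ sketch below. The function name normalize_output_name_sketch is hypothetical and this is not the logConfiguration.cpp implementation; it deliberately ignores other output types such as stdout and stderr, and does no error handling.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps different spellings of the same log file output
// to one canonical name of the form file=<name>.
inline std::string normalize_output_name_sketch(const std::string& raw) {
  static const std::string prefix = "file=";
  std::string name = raw;

  // Drop an explicit prefix so that it can be re-added uniformly below.
  if (name.compare(0, prefix.size(), prefix) == 0) {
    name.erase(0, prefix.size());
  }

  // Drop one pair of surrounding double quotes around the file name.
  if (name.size() >= 2 && name.front() == '"' && name.back() == '"') {
    name = name.substr(1, name.size() - 2);
  }

  return prefix + name;
}
```

Looking up existing outputs by the normalized name is what would prevent foo, file=foo and file="foo" from being registered as three separate outputs for the same file.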
>>> >>> Actually -Xlog:gc=debug:"file"=foo should give an error, since quoting >>> the output types isn't supported (only the name can be quoted). This >>> should just be a refactoring to make sure we're always managing the >>> output names in a uniform manner (so that file="foo" and file=foo isn't >>> treated as two different log outputs). >>> >>> BTW, take care if you're testing this on the command line, as the shell >>> might be stripping away quotes in the arguments for you. >> >> Yes you are right it was stripping them away - it is an error. > > Great! > > Thanks, > Marcus > >> >> Thanks, >> David >> >>> >>> Thanks, >>> Marcus >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8157948 >>>>> >>>>> Testing: >>>>> New unit test through JPRT >>>>> >>>>> Thanks, >>>>> Marcus >>> > From tobias.hartmann at oracle.com Tue Aug 30 06:50:07 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 Aug 2016 08:50:07 +0200 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> Message-ID: <57C52C9F.40301@oracle.com> Hi, this problem showed up in PIT: https://bugs.openjdk.java.net/browse/JDK-8164968 I assigned it to runtime - feel free to re-assign. Best regards, Tobias On 30.10.2015 12:52, Doerr, Martin wrote: > Hi Aleksey, > > we have seen JVM crashes when running the following test on SPARC: > org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest > > Maybe it is not supposed to run on platforms which don't support unaligned accesses? > > I see 2 problems: > > 1. The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. 
The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF).
>
> 2. The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms.
>
> With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The stand-alone test program below can do it.
>
> Hope this is interesting for you.
> Best regards,
> Martin
>
>
>
> import sun.misc.Unsafe;
> import sun.reflect.ReflectionFactory;
> import java.lang.reflect.Constructor;
> import java.lang.reflect.Field;
> import java.lang.reflect.Modifier;
>
> public class TestUnsafe{
>
>   public static Unsafe getUnsafe() throws Exception {
>     Constructor<Unsafe> unsafeConstructor = Unsafe.class.getDeclaredConstructor();
>     unsafeConstructor.setAccessible(true);
>     return unsafeConstructor.newInstance();
>   }
>
>   private void test_unsafe(Unsafe u) {
>     long addr_raw = u.allocateMemory(1024);
>     long addr_misaligned = ((addr_raw + 512) & ~255) - 2;
>     //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN
>     u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed)
>   }
>
>   public static void main(String args[]){
>     TestUnsafe xyz = new TestUnsafe();
>     Unsafe u;
>     try {
>       u = getUnsafe();
>     } catch (Exception e) {
>       e.printStackTrace();
>       return;
>     }
>     try {
>       xyz.test_unsafe(u);
>       System.gc();
>     } catch (Error e) {
>       e.printStackTrace(); // did we catch an async exception reported by Unsafe?
>     }
>   }
>
> }
>
From david.holmes at oracle.com Tue Aug 30 07:24:29 2016
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 30 Aug 2016 17:24:29 +1000
Subject: UnsafeAtomicityTest crashes on SPARC
In-Reply-To: <56337951.6020008@oracle.com>
References: <56337951.6020008@oracle.com>
Message-ID:

Aleksey,

It seems there was no follow-up to this, neither from the runtime side nor in terms of the test being fixed.

Thanks,
David

Aleksey Shipilev aleksey.shipilev at oracle.com
Fri Oct 30 14:06:09 UTC 2015

> I see.
>
> Yes, I would think intercepting SIGBUS and failing with InternalError is
> a marginally better JVM behavior, especially if it says something like
> "Unrecoverable misaligned unsafe access, bye".
>
> Let's see what runtime devs think about this?
>
> -Aleksey
>
> On 10/30/2015 04:02 PM, Doerr, Martin wrote:
>> Hi Aleksey,
>>
>> exactly, the putInt(offset, 0xFFFFFFFF) improves the situation. It uses DEFINE_GETSETNATIVE which calls set_doing_unsafe_access before the access while the other one uses DEFINE_GETSETOOP which doesn't do that (see unsafe.cpp).
>>
>> The JVM will usually not run much longer after the SIGBUS was caught because set_pending_unsafe_access_error() gets called afterwards which should eventually lead to JVM exit with the asynchronous java.lang.InternalError exception (unless one catches it which is rather uncommon).
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
>> Sent: Friday, 30 October 2015 13:06
>> To: Doerr, Martin
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: UnsafeAtomicityTest crashes on SPARC
>>
>> * PGP Signed by an unknown key
>>
>> Hi Martin,
>>
>> Thanks for a heads-up.
>> >> On 10/30/2015 02:52 PM, Doerr, Martin wrote: >>> we have seen JVM crashes when running the following test on SPARC: >>> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest >>> >>> Maybe it is not supposed to run on platforms which don't support >>> unaligned accesses? >> >> Yes, unaligned Unsafe access might crash on platforms that do not >> support unaligned accesses. The test should have checked >> Unsafe.unalignedAccess() and/or used Unsafe.putIntUnaligned. Both APIs >> are not available in JDK 8, though. >> >>> I see 2 problems: >>> >>> 1. The current implementation uses the version of >>> UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is >>> designed to access object fields. Seems like the JVM is allowed to crash >>> with SIGBUS if it is misused for unaligned accesses. The JVM is designed >>> to catch SIGBUS only in the other version which only takes the address >>> UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). >> >> This is an odd difference. So, nominally, making the test to use >> putInt(offset, 0xFFFFFFFF) avoids the issue? >> >> >>> 2. The signal handler in os_solaris_sparc needs a fix to catch >>> BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the >>> condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && >>> thread->doing_unsafe_access())" should get removed as it was done on >>> other platforms. >> >>> With the problems fixed, it may be possible to catch the asynchronous >>> exception which may get generated by the Unsafe access. The following >>> stand-alone test program below can do it. >> >> Yes, I think runtime folks might consider bullet-proofing this. Although >> I sometimes see the SIGBUS as a viable alternative for a creeping >> performance problem, at least in testing. (IIRC, some kernels are known >> to silently fix up this as well). 
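The effect of dropping the si_code test from the handler condition quoted above can be shown with two small predicates. This is a hypothetical C++ sketch: the function names are invented, the doing_unsafe_access flag is passed in as a plain bool instead of being read from a thread object, and only the shape of the condition is taken from the thread. SIGBUS, BUS_ADRALN and BUS_OBJERR are the standard POSIX constants from <signal.h>.

```cpp
#include <cassert>
#include <signal.h>

// Condition as quoted in the thread: only object-specific hardware errors
// (BUS_OBJERR) during an Unsafe access are treated as recoverable.
inline bool handles_unsafe_sigbus_strict(int sig, int si_code,
                                         bool doing_unsafe_access) {
  return sig == SIGBUS && si_code == BUS_OBJERR && doing_unsafe_access;
}

// Condition after the suggested change: any SIGBUS raised during an Unsafe
// access is treated as recoverable, including alignment faults (BUS_ADRALN).
inline bool handles_unsafe_sigbus_relaxed(int sig, int /*si_code*/,
                                          bool doing_unsafe_access) {
  return sig == SIGBUS && doing_unsafe_access;
}
```

Under the strict form a misaligned Unsafe access that raises SIGBUS with BUS_ADRALN falls through to the generic crash path; under the relaxed form it is handled like the BUS_OBJERR case.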
>> >> Thanks, >> -Aleksey >> >> >> * Unknown Key >> * 0x62A119A7 >> From martin.doerr at sap.com Tue Aug 30 08:15:47 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 30 Aug 2016 08:15:47 +0000 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <57C52C9F.40301@oracle.com> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> Message-ID: <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> Hi Tobias, I think the problem is that JVM_handle_solaris_signal only catches SIGBUS with si_code BUS_OBJERR: if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access()) We have removed "&& info->si_code == BUS_OBJERR" in our JVM. It should also catch other types like BUS_ADRALN. Best regards, Martin -----Original Message----- From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] Sent: Dienstag, 30. August 2016 08:50 To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: UnsafeAtomicityTest crashes on SPARC Hi, this problem showed up in PIT: https://bugs.openjdk.java.net/browse/JDK-8164968 I assigned it to runtime - feel free to re-assign. Best regards, Tobias On 30.10.2015 12:52, Doerr, Martin wrote: > Hi Aleksey, > > we have seen JVM crashes when running the following test on SPARC: > org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest > > Maybe it is not supposed to run on platforms which don't support unaligned accesses? > > I see 2 problems: > > 1. The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). > > 2. 
The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms. > > With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The following stand-alone test program below can do it. > > Hope this is interesting for you. > Best regards, > Martin > > > > import sun.misc.Unsafe; > import sun.reflect.ReflectionFactory; > import java.lang.reflect.Constructor; > import java.lang.reflect.Field; > import java.lang.reflect.Modifier; > > public class TestUnsafe{ > > public static Unsafe getUnsafe() throws Exception { > Constructor unsafeConstructor = Unsafe.class.getDeclaredConstructor(); > unsafeConstructor.setAccessible(true); > return unsafeConstructor.newInstance(); > } > > private void test_unsafe(Unsafe u) { > long addr_raw = u.allocateMemory(1024); > long addr_misaligned = ((addr_raw + 512) & ~255) - 2; > //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN > u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed) > } > > public static void main(String args[]){ > TestUnsafe xyz = new TestUnsafe(); > Unsafe u; > try { > u = getUnsafe(); > } catch (Exception e) { > e.printStackTrace(); > return; > } > try { > xyz.test_unsafe(u); > System.gc(); > } catch (Error e) { > e.printStackTrace(); // did we catch an async exception reported by Unsafe? 
> } > } > > } > From tobias.hartmann at oracle.com Tue Aug 30 08:52:05 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 Aug 2016 10:52:05 +0200 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> Message-ID: <57C54935.1060802@oracle.com> Hi Martin, On 30.08.2016 10:15, Doerr, Martin wrote: > Hi Tobias, > > I think the problem is that JVM_handle_solaris_signal only catches SIGBUS with si_code BUS_OBJERR: > if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access()) > > We have removed "&& info->si_code == BUS_OBJERR" in our JVM. It should also catch other types like BUS_ADRALN. That seems reasonable to me but the test would still need to catch the java.lang.InternalError, right? I leave this to the runtime team to decide. Thanks, Tobias > Best regards, > Martin > > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Dienstag, 30. August 2016 08:50 > To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: UnsafeAtomicityTest crashes on SPARC > > Hi, > > this problem showed up in PIT: > https://bugs.openjdk.java.net/browse/JDK-8164968 > > I assigned it to runtime - feel free to re-assign. > > Best regards, > Tobias > > On 30.10.2015 12:52, Doerr, Martin wrote: >> Hi Aleksey, >> >> we have seen JVM crashes when running the following test on SPARC: >> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest >> >> Maybe it is not supposed to run on platforms which don't support unaligned accesses? >> >> I see 2 problems: >> >> 1. 
The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). >> >> 2. The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms. >> >> With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The following stand-alone test program below can do it. >> >> Hope this is interesting for you. >> Best regards, >> Martin >> >> >> >> import sun.misc.Unsafe; >> import sun.reflect.ReflectionFactory; >> import java.lang.reflect.Constructor; >> import java.lang.reflect.Field; >> import java.lang.reflect.Modifier; >> >> public class TestUnsafe{ >> >> public static Unsafe getUnsafe() throws Exception { >> Constructor unsafeConstructor = Unsafe.class.getDeclaredConstructor(); >> unsafeConstructor.setAccessible(true); >> return unsafeConstructor.newInstance(); >> } >> >> private void test_unsafe(Unsafe u) { >> long addr_raw = u.allocateMemory(1024); >> long addr_misaligned = ((addr_raw + 512) & ~255) - 2; >> //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN >> u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed) >> } >> >> public static void main(String args[]){ >> TestUnsafe xyz = new TestUnsafe(); >> Unsafe u; >> try { >> u = getUnsafe(); >> } catch (Exception e) { >> e.printStackTrace(); >> return; >> } >> try { >> xyz.test_unsafe(u); >> System.gc(); >> } catch (Error e) { >> e.printStackTrace(); // 
did we catch an async exception reported by Unsafe? >> } >> } >> >> } >> From martin.doerr at sap.com Tue Aug 30 09:10:11 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 30 Aug 2016 09:10:11 +0000 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <57C54935.1060802@oracle.com> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> <57C54935.1060802@oracle.com> Message-ID: Hi Tobias, correct, the test will either need to get changed or disabled on SPARC. It was originally written for x86 and the code was illegal for platforms which can't access unaligned memory (as explained in my original email below under 1.). In the meantime, the unsafe implementation has changed and I think that the "GuardUnsafeAccess" in "put(T x)" should help. So yes, if the test is supposed to run on SPARC, the remaining part to fix is to catch the java.lang.InternalError. I think the signal handler part is a real bug which should get fixed. Best regards, Martin -----Original Message----- From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] Sent: Dienstag, 30. August 2016 10:52 To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: UnsafeAtomicityTest crashes on SPARC Hi Martin, On 30.08.2016 10:15, Doerr, Martin wrote: > Hi Tobias, > > I think the problem is that JVM_handle_solaris_signal only catches SIGBUS with si_code BUS_OBJERR: > if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access()) > > We have removed "&& info->si_code == BUS_OBJERR" in our JVM. It should also catch other types like BUS_ADRALN. That seems reasonable to me but the test would still need to catch the java.lang.InternalError, right? I leave this to the runtime team to decide. 
Thanks, Tobias > Best regards, > Martin > > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Dienstag, 30. August 2016 08:50 > To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: UnsafeAtomicityTest crashes on SPARC > > Hi, > > this problem showed up in PIT: > https://bugs.openjdk.java.net/browse/JDK-8164968 > > I assigned it to runtime - feel free to re-assign. > > Best regards, > Tobias > > On 30.10.2015 12:52, Doerr, Martin wrote: >> Hi Aleksey, >> >> we have seen JVM crashes when running the following test on SPARC: >> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest >> >> Maybe it is not supposed to run on platforms which don't support unaligned accesses? >> >> I see 2 problems: >> >> 1. The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). >> >> 2. The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms. >> >> With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The following stand-alone test program below can do it. >> >> Hope this is interesting for you. 
>> Best regards, >> Martin >> >> >> >> import sun.misc.Unsafe; >> import sun.reflect.ReflectionFactory; >> import java.lang.reflect.Constructor; >> import java.lang.reflect.Field; >> import java.lang.reflect.Modifier; >> >> public class TestUnsafe{ >> >> public static Unsafe getUnsafe() throws Exception { >> Constructor unsafeConstructor = Unsafe.class.getDeclaredConstructor(); >> unsafeConstructor.setAccessible(true); >> return unsafeConstructor.newInstance(); >> } >> >> private void test_unsafe(Unsafe u) { >> long addr_raw = u.allocateMemory(1024); >> long addr_misaligned = ((addr_raw + 512) & ~255) - 2; >> //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN >> u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed) >> } >> >> public static void main(String args[]){ >> TestUnsafe xyz = new TestUnsafe(); >> Unsafe u; >> try { >> u = getUnsafe(); >> } catch (Exception e) { >> e.printStackTrace(); >> return; >> } >> try { >> xyz.test_unsafe(u); >> System.gc(); >> } catch (Error e) { >> e.printStackTrace(); // did we catch an async exception reported by Unsafe? >> } >> } >> >> } >> From tobias.hartmann at oracle.com Tue Aug 30 09:32:40 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 Aug 2016 11:32:40 +0200 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> <57C54935.1060802@oracle.com> Message-ID: <57C552B8.70304@oracle.com> Hi Martin, On 30.08.2016 11:10, Doerr, Martin wrote: > Hi Tobias, > > correct, the test will either need to get changed or disabled on SPARC. It was originally written for x86 and the code was illegal for platforms which can't access unaligned memory (as explained in my original email below under 1.). 
> In the meantime, the unsafe implementation has changed and I think that the "GuardUnsafeAccess" in "put(T x)" should help. So yes, if the test is supposed to run on SPARC, the remaining part to fix is to catch the java.lang.InternalError. > > I think the signal handler part is a real bug which should get fixed. Okay, thanks for the clarification! I filed a bug to keep track of this: https://bugs.openjdk.java.net/browse/JDK-8165014 Best regards, Tobias > > Best regards, > Martin > > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Dienstag, 30. August 2016 10:52 > To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: UnsafeAtomicityTest crashes on SPARC > > Hi Martin, > > On 30.08.2016 10:15, Doerr, Martin wrote: >> Hi Tobias, >> >> I think the problem is that JVM_handle_solaris_signal only catches SIGBUS with si_code BUS_OBJERR: >> if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access()) >> >> We have removed "&& info->si_code == BUS_OBJERR" in our JVM. It should also catch other types like BUS_ADRALN. > > That seems reasonable to me but the test would still need to catch the java.lang.InternalError, right? > > I leave this to the runtime team to decide. > > Thanks, > Tobias > >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >> Sent: Dienstag, 30. August 2016 08:50 >> To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) >> Cc: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: UnsafeAtomicityTest crashes on SPARC >> >> Hi, >> >> this problem showed up in PIT: >> https://bugs.openjdk.java.net/browse/JDK-8164968 >> >> I assigned it to runtime - feel free to re-assign. 
>> >> Best regards, >> Tobias >> >> On 30.10.2015 12:52, Doerr, Martin wrote: >>> Hi Aleksey, >>> >>> we have seen JVM crashes when running the following test on SPARC: >>> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest >>> >>> Maybe it is not supposed to run on platforms which don't support unaligned accesses? >>> >>> I see 2 problems: >>> >>> 1. The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). >>> >>> 2. The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms. >>> >>> With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The following stand-alone test program below can do it. >>> >>> Hope this is interesting for you. 
>>> Best regards, >>> Martin >>> >>> >>> >>> import sun.misc.Unsafe; >>> import sun.reflect.ReflectionFactory; >>> import java.lang.reflect.Constructor; >>> import java.lang.reflect.Field; >>> import java.lang.reflect.Modifier; >>> >>> public class TestUnsafe{ >>> >>> public static Unsafe getUnsafe() throws Exception { >>> Constructor unsafeConstructor = Unsafe.class.getDeclaredConstructor(); >>> unsafeConstructor.setAccessible(true); >>> return unsafeConstructor.newInstance(); >>> } >>> >>> private void test_unsafe(Unsafe u) { >>> long addr_raw = u.allocateMemory(1024); >>> long addr_misaligned = ((addr_raw + 512) & ~255) - 2; >>> //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN >>> u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed) >>> } >>> >>> public static void main(String args[]){ >>> TestUnsafe xyz = new TestUnsafe(); >>> Unsafe u; >>> try { >>> u = getUnsafe(); >>> } catch (Exception e) { >>> e.printStackTrace(); >>> return; >>> } >>> try { >>> xyz.test_unsafe(u); >>> System.gc(); >>> } catch (Error e) { >>> e.printStackTrace(); // did we catch an async exception reported by Unsafe? >>> } >>> } >>> >>> } >>> From david.holmes at oracle.com Tue Aug 30 10:38:18 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Aug 2016 20:38:18 +1000 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> Message-ID: On 30/08/2016 6:15 PM, Doerr, Martin wrote: > Hi Tobias, > > I think the problem is that JVM_handle_solaris_signal only catches SIGBUS with si_code BUS_OBJERR: > if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access()) > > We have removed "&& info->si_code == BUS_OBJERR" in our JVM. 
It should also catch other types like BUS_ADRALN. I'm not yet convinced that we should be anticipating/supporting unaligned access from Unsafe. But I'll look into this more closely later this week. Thanks, David > Best regards, > Martin > > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Dienstag, 30. August 2016 08:50 > To: Doerr, Martin ; Aleksey Shipilev (aleksey.shipilev at oracle.com) > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: UnsafeAtomicityTest crashes on SPARC > > Hi, > > this problem showed up in PIT: > https://bugs.openjdk.java.net/browse/JDK-8164968 > > I assigned it to runtime - feel free to re-assign. > > Best regards, > Tobias > > On 30.10.2015 12:52, Doerr, Martin wrote: >> Hi Aleksey, >> >> we have seen JVM crashes when running the following test on SPARC: >> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest >> >> Maybe it is not supposed to run on platforms which don't support unaligned accesses? >> >> I see 2 problems: >> >> 1. The current implementation uses the version of UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is designed to access object fields. Seems like the JVM is allowed to crash with SIGBUS if it is misused for unaligned accesses. The JVM is designed to catch SIGBUS only in the other version which only takes the address UnsafeHolder.U.putInt(offset, 0xFFFFFFFF). >> >> 2. The signal handler in os_solaris_sparc needs a fix to catch BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR" of the condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR && thread->doing_unsafe_access())" should get removed as it was done on other platforms. >> >> With the problems fixed, it may be possible to catch the asynchronous exception which may get generated by the Unsafe access. The following stand-alone test program below can do it. >> >> Hope this is interesting for you. 
>> Best regards,
>> Martin
>>
>>
>>
>> import sun.misc.Unsafe;
>> import sun.reflect.ReflectionFactory;
>> import java.lang.reflect.Constructor;
>> import java.lang.reflect.Field;
>> import java.lang.reflect.Modifier;
>>
>> public class TestUnsafe{
>>
>>   public static Unsafe getUnsafe() throws Exception {
>>     Constructor<Unsafe> unsafeConstructor = Unsafe.class.getDeclaredConstructor();
>>     unsafeConstructor.setAccessible(true);
>>     return unsafeConstructor.newInstance();
>>   }
>>
>>   private void test_unsafe(Unsafe u) {
>>     long addr_raw = u.allocateMemory(1024);
>>     long addr_misaligned = ((addr_raw + 512) & ~255) - 2;
>>     //u.putInt(null, addr_misaligned, 0xFFFFFFFF); // crashes on some platforms with SIGBUS ADRALN
>>     u.putInt(addr_misaligned, 0xFFFFFFFF); // should work on all platforms (catch SIGBUS if needed)
>>   }
>>
>>   public static void main(String args[]){
>>     TestUnsafe xyz = new TestUnsafe();
>>     Unsafe u;
>>     try {
>>       u = getUnsafe();
>>     } catch (Exception e) {
>>       e.printStackTrace();
>>       return;
>>     }
>>     try {
>>       xyz.test_unsafe(u);
>>       System.gc();
>>     } catch (Error e) {
>>       e.printStackTrace(); // did we catch an async exception reported by Unsafe?
>>     }
>>   }
>>
>> }
>>

From martin.doerr at sap.com Tue Aug 30 10:42:37 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 30 Aug 2016 10:42:37 +0000
Subject: RFR(XXS): 8165014: Unaligned unsafe access should throw InternalError on Solaris
Message-ID: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap>

Hi,

as discussed in http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-August/020888.html,
the signal handler on Solaris SPARC should throw a java.lang.InternalError when Unsafe accesses unaligned addresses.
This is currently broken because the signal handler only accepts BUS_OBJERR which is not sufficient.

Proposed fix:
http://cr.openjdk.java.net/~mdoerr/8165014_sparcSIGBUS/webrev.00/

Please review. I will also need a sponsor.
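
For illustration only, the effect of the condition change can be sketched stand-alone as follows. This is not the actual os_solaris_sparc code: the function names are made up, and a plain bool stands in for thread->doing_unsafe_access(). Only the two conditions themselves are taken from this thread.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical stand-in for the check as it is today: only a SIGBUS with
// si_code BUS_OBJERR is treated as a faulting Unsafe access.
bool handled_before_fix(int sig, int si_code, bool doing_unsafe_access) {
  return sig == SIGBUS && si_code == BUS_OBJERR && doing_unsafe_access;
}

// Hypothetical stand-in for the proposed check: the si_code test is dropped,
// so BUS_ADRALN (misaligned address) is accepted as well.
bool handled_after_fix(int sig, int si_code, bool doing_unsafe_access) {
  (void)si_code;  // intentionally ignored after the fix
  return sig == SIGBUS && doing_unsafe_access;
}

// Small self-check of the sketch (assumed helper, not part of HotSpot).
void check_sigbus_sketch() {
  // A misaligned Unsafe access is missed by the old condition ...
  assert(!handled_before_fix(SIGBUS, BUS_ADRALN, true));
  // ... and caught by the new one.
  assert(handled_after_fix(SIGBUS, BUS_ADRALN, true));
  // BUS_OBJERR keeps working, and unrelated faults are still not claimed.
  assert(handled_after_fix(SIGBUS, BUS_OBJERR, true));
  assert(!handled_after_fix(SIGBUS, BUS_ADRALN, false));
  assert(!handled_after_fix(SIGSEGV, BUS_ADRALN, true));
}
```

With the old condition the misaligned case falls through to the generic crash path; with the new one it reaches the code that reports the fault for the Unsafe access.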

Best regards,
Martin

From martin.doerr at sap.com Tue Aug 30 10:51:27 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 30 Aug 2016 10:51:27 +0000
Subject: RFR(XXS): 8165018: Missing memory barrier for PPC64 in Unsafe_GetObjectVolatile
Message-ID:

Hi,

we found that a memory barrier for PPC64 is missing in the current Unsafe implementation.
get_volatile already contains the memory barrier for "support_IRIW_for_not_multiple_copy_atomic_cpu". The same is needed in Unsafe_GetObjectVolatile.

Here's my webrev:
http://cr.openjdk.java.net/~mdoerr/8165018_UnsafePPC64/webrev.00/

And while looking at it I wonder why Unsafe_GetObjectVolatile does not contain a G1 barrier like Unsafe_GetObject.
Is it not possible to use the Volatile version to access the referent field of a Reference?

Please review. As it is shared code, I will need a sponsor, please.

Best regards,
Martin

From kirill.zhaldybin at oracle.com Tue Aug 30 12:13:23 2016
From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin)
Date: Tue, 30 Aug 2016 15:13:23 +0300
Subject: RFR(S): 8164738: Convert AltHashing_test to GTest
In-Reply-To: <2c1122cd-f186-8afd-9d99-f25dc2216484@oracle.com>
References: <94b3c417-47ed-d158-c166-84dc8536f199@oracle.com> <6f652dc9-a022-fbf3-f92e-545fadc0dd77@oracle.com> <2c1122cd-f186-8afd-9d99-f25dc2216484@oracle.com>
Message-ID: <55146529-b86d-5964-7f4c-b4d7c16a0844@oracle.com>

Coleen,

Thank you for review!

Regards, Kirill

On 29.08.2016 23:44, Coleen Phillimore wrote:
>
> This test conversion looks good.
>
> On 8/29/16 11:06 AM, Kirill Zhaldybin wrote:
>> David,
>>
>> Thank you for review!
>>
>>
>> On 26.08.2016 06:38, David Holmes wrote:
>>> On 25/08/2016 1:47 AM, Kirill Zhaldybin wrote:
>>>> Dear all,
>>>>
>>>> Could you please review this fix for 8164738?
>>>
>>> Seems okay.
>>>
>>>> To convert the test I added new friend class to AltHashing class so we
>>>> could access private member function static juint murmur3_32(const
>>>> int*
>>>> data, int len).
There are also few formatting fixes.
>>>
>>> Any reason all the murmur functions shouldn't be public? I'm not a
>>> fan of friends. No big deal either way.
>> Well, I am not an author so I could only speculate that if
>> static juint murmur3_32(const int* data, int len);
>> static juint murmur3_32(juint seed, const int* data, int len);
>>
>> are used only from AltHashing class according to general "the less
>> visible the better" rule they were made private.
>>
>
> Yes, that's why the functions are private. In general, the tests
> should probably be made friends if they're going to use private
> functions rather than making the functions public for the rest of the
> JVM to use.
>
> thanks,
> Coleen
>
>> Regards, Kirill
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>> WebRev:
>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8164738/webrev.00/
>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8164738
>>>>
>>>> Regards, Kirill
>>
>

From harold.seigel at oracle.com Tue Aug 30 17:47:40 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Tue, 30 Aug 2016 13:47:40 -0400
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
Message-ID: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>

Hi,

Please review this fix for JDK-8162412. This fix allows user properties that start with "-Djdk.module." unless they match any of the seven reserved system properties as follows:

The JVM will ignore -D<prop>, -D<prop>.[*], and
-D<prop>=[*] where <prop> is any one of these seven:

    jdk.module.addmods
    jdk.module.limitmods
    jdk.module.addexports
    jdk.module.addreads
    jdk.module.patch
    jdk.module.path
    jdk.module.upgrade.path

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412

Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/

The fix was tested with the JCK Lang and VM tests, the hotspot, and java/lang, java/util and other JTreg tests, and the NSK non-colocated quick tests.
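
The matching rule described above can be sketched roughly as follows. This is a minimal stand-alone illustration, not the code in the webrev: the function names, the suffix table, and the assumption that the string is passed without the leading "-D" (and may still carry "=value") are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The seven reserved names, without the common "jdk.module." prefix.
static const char* const kReservedSuffixes[] = {
  "addmods", "limitmods", "addexports", "addreads",
  "patch", "path", "upgrade.path"
};

// True if 'rest' is exactly 'name', or 'name' followed by '.' or '='.
// (Hypothetical helper for this sketch.)
bool matches_reserved_suffix(const char* rest, const char* name) {
  const std::size_t len = std::strlen(name);
  if (std::strncmp(rest, name, len) != 0) {
    return false;
  }
  const char next = rest[len];
  return next == '\0' || next == '.' || next == '=';
}

// True if 'property' (given without the leading "-D") is one of the reserved
// jdk.module.* properties and would therefore be ignored.
bool is_reserved_module_property(const char* property) {
  static const char prefix[] = "jdk.module.";
  const std::size_t prefix_len = sizeof(prefix) - 1;
  if (std::strncmp(property, prefix, prefix_len) != 0) {
    return false;
  }
  const char* rest = property + prefix_len;
  for (const char* name : kReservedSuffixes) {
    if (matches_reserved_suffix(rest, name)) {
      return true;
    }
  }
  return false;
}

// Small self-check of the sketch (assumed helper).
void check_module_property_sketch() {
  assert(is_reserved_module_property("jdk.module.path"));
  assert(is_reserved_module_property("jdk.module.patch.0=/tmp/x"));
  assert(!is_reserved_module_property("jdk.module.pathological"));
}
```

Checking the character after the matched name is what lets a user property such as jdk.module.pathological through while jdk.module.path, jdk.module.path.x and jdk.module.path=x are all treated as reserved.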

Thanks, Harold

From gerard.ziemski at oracle.com Tue Aug 30 19:05:22 2016
From: gerard.ziemski at oracle.com (Gerard Ziemski)
Date: Tue, 30 Aug 2016 14:05:22 -0500
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
Message-ID: <24E3C73E-DFDA-4ADC-9E58-A05504F96FED@oracle.com>

hi Harold,

The code looks fine, I just have a question and tiny quibble:

#1. This comment:

 189 // Return true if property starts with "jdk.module." and its ensuing chars match
 190 // any of the reserved module properties.
 191 // property should be passed without the leading "-D".
 192 bool Arguments::is_internal_module_property(const char* property) {

about "-D" refers to the property expected by "Arguments::is_internal_module_property" method, not how the user passes it to the VM?


#2. The only tiny quibble would be to put:

 200 matches_property_suffix(property_suffix, PATCH, PATCH_LEN) ||

right after

 197 matches_property_suffix(property_suffix, ADDREADS, ADDREADS_LEN) ||

to match their declaration order:

 170 #define ADDREADS "addreads"
 171 #define ADDREADS_LEN 8
 172 #define PATCH "patch"
 173 #define PATCH_LEN 5


cheers

> On Aug 30, 2016, at 12:47 PM, harold seigel wrote:
>
> Hi,
>
> Please review this fix for JDK-8162412. This fix allows user properties that start with "-Djdk.module." unless they match any of the seven reserved system properties as follows:
>
> The JVM will ignore -D<prop>, -D<prop>.[*], and
> -D<prop>=[*] where <prop> is any one of these seven:
>
> jdk.module.addmods
> jdk.module.limitmods
> jdk.module.addexports
> jdk.module.addreads
> jdk.module.patch
> jdk.module.path
> jdk.module.upgrade.path
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412
>
> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/
>
> The fix was tested with the JCK Lang and VM tests, the hotspot, and java/lang, java/util and other JTreg tests, and the NSK non-colocated quick tests.
>
> Thanks, Harold

From harold.seigel at oracle.com Tue Aug 30 19:14:13 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Tue, 30 Aug 2016 15:14:13 -0400
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To: <24E3C73E-DFDA-4ADC-9E58-A05504F96FED@oracle.com>
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> <24E3C73E-DFDA-4ADC-9E58-A05504F96FED@oracle.com>
Message-ID:

Hi Gerard,

Thanks for the review.

The "-D" comment refers to the property expected by the "Arguments::is_internal_module_property" method.

I can switch the order of PATCH and ADDREADS.

Thanks! Harold

On 8/30/2016 3:05 PM, Gerard Ziemski wrote:
> hi Harold,
>
> The code looks fine, I just have a question and tiny quibble:
>
> #1. This comment:
>
> 189 // Return true if property starts with "jdk.module." and its ensuing chars match
> 190 // any of the reserved module properties.
> 191 // property should be passed without the leading "-D".
> 192 bool Arguments::is_internal_module_property(const char* property) {
>
> about "-D" refers to the property expected by "Arguments::is_internal_module_property" method, not how the user passes it to the VM?
>
>
> #2.
The only tiny quibble would be to put:
>
> 200 matches_property_suffix(property_suffix, PATCH, PATCH_LEN) ||
>
> right after
>
> 197 matches_property_suffix(property_suffix, ADDREADS, ADDREADS_LEN) ||
>
> to match their declaration order:
>
> 170 #define ADDREADS "addreads"
> 171 #define ADDREADS_LEN 8
> 172 #define PATCH "patch"
> 173 #define PATCH_LEN 5
>
>
> cheers
>
>> On Aug 30, 2016, at 12:47 PM, harold seigel wrote:
>>
>> Hi,
>>
>> Please review this fix for JDK-8162412. This fix allows user properties that start with "-Djdk.module." unless they match any of the seven reserved system properties as follows:
>>
>> The JVM will ignore -D<prop>, -D<prop>.[*], and
>> -D<prop>=[*] where <prop> is any one of these seven:
>>
>> jdk.module.addmods
>> jdk.module.limitmods
>> jdk.module.addexports
>> jdk.module.addreads
>> jdk.module.patch
>> jdk.module.path
>> jdk.module.upgrade.path
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412
>>
>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/
>>
>> The fix was tested with the JCK Lang and VM tests, the hotspot, and java/lang, java/util and other JTreg tests, and the NSK non-colocated quick tests.
>>
>> Thanks, Harold

From dean.long at oracle.com Tue Aug 30 19:24:59 2016
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 30 Aug 2016 12:24:59 -0700
Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links
Message-ID: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com>

http://cr.openjdk.java.net/~dlong/8156137/webrev/

https://bugs.openjdk.java.net/browse/JDK-8156137

The problem: JVMTI RedefineClasses creates scratch classes to hold the old methods until they can be freed (they are no longer active in a thread stack). G1ConcurrentMark sees these scratch classes, but G1MarkSweep does not.

The details:

G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to iterate over *all* classes and calls clean_weak_instanceklass_links to clean.

G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to iterate over *live, non-scratch* classes, and calls clean_weak_instanceklass_links to clean.

Now the problem scenario is:

0: scratch class S has MethodData that references a class U that is going to be unloaded.

1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, because scratch classes are not added to the class hierarchy tree. The full GC then frees the metadata for class U. Now the MethodData for S contains stale metadata.

2. When a later G1ConcurrentMark calls clean_weak_instanceklass_links on S, it will crash on the stale metadata.

Solution: have Klass::clean_weak_klass_links() process these scratch classes that can be found on the "previous versions" list.

Tested with bigapps/Kitchensink.

dl

From coleen.phillimore at oracle.com Tue Aug 30 20:57:29 2016
From: coleen.phillimore at oracle.com (Coleen Phillimore)
Date: Tue, 30 Aug 2016 16:57:29 -0400
Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links
In-Reply-To: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com>
References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com>
Message-ID:

This change looks good. Thank you for doing the analysis and figuring this out.
Coleen

On 8/30/16 3:24 PM, dean.long at oracle.com wrote:
> http://cr.openjdk.java.net/~dlong/8156137/webrev/
>
> https://bugs.openjdk.java.net/browse/JDK-8156137
>
> The problem: JVMTI RedefineClasses creates scratch classes to hold the
> old methods until they can be freed (they are no longer active in a
> thread stack). G1ConcurrentMark sees these scratch classes, but
> G1MarkSweep does not.
>
> The details:
>
> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to
> iterate over *all* classes and calls clean_weak_instanceklass_links to
> clean.
>
> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to
> iterate over *live, non-scratch* classes, and calls
> clean_weak_instanceklass_links to clean.
> > Now the problem scenario is: > > 0: scratch class S has MethodData that references a class U that is > going to be unloaded. > > 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, because > scratch classes are not added to the class hierarchy tree. The full GC > then frees the metadata for class U. Now the MethodData for S contains > stale metadata. > > 2. When a later G1ConcurrentMark calls clean_weak_instanceklass_links > on S, it will crash on the stale metadata. > > Solution: have Klass::clean_weak_klass_links() process these scratch > classes that can be found on the "previous versions" list. > > Tested with bigapps/Kitchensink. > > dl > From dean.long at oracle.com Tue Aug 30 21:40:57 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 30 Aug 2016 14:40:57 -0700 Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links In-Reply-To: References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> Message-ID: Thanks for the review. Do I need another review before pushing? dl On 8/30/16 1:57 PM, Coleen Phillimore wrote: > > This change looks good. Thank you for doing the analysis and figuring > this out. > Coleen > > On 8/30/16 3:24 PM, dean.long at oracle.com wrote: >> http://cr.openjdk.java.net/~dlong/8156137/webrev/ >> >> https://bugs.openjdk.java.net/browse/JDK-8156137 >> >> The problem: JVMTI RedefineClasses creates scratch classes to hold >> the old methods until they can be freed (they are no longer active in >> a thread stack). G1ConcurrentMark sees these scratch classes, but >> G1MarkSweep does not. >> >> The details: >> >> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >> iterate over *all* classes and calls clean_weak_instanceklass_links >> to clean. >> >> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >> iterate over *live, non-scratch* classes, and calls >> clean_weak_instanceklass_links to clean. 
>> >> Now the problem scenario is: >> >> 0: scratch class S has MethodData that references a class U that is >> going to be unloaded. >> >> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >> because scratch classes are not added to the class hierarchy tree. >> The full GC then frees the metadata for class U. Now the MethodData >> for S contains stale metadata. >> >> 2. When a later G1ConcurrentMark calls clean_weak_instanceklass_links >> on S, it will crash on the stale metadata. >> >> Solution: have Klass::clean_weak_klass_links() process these scratch >> classes that can be found on the "previous versions" list. >> >> Tested with bigapps/Kitchensink. >> >> dl >> > From coleen.phillimore at oracle.com Tue Aug 30 21:49:55 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 30 Aug 2016 17:49:55 -0400 Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links In-Reply-To: References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> Message-ID: <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com> On 8/30/16 5:40 PM, dean.long at oracle.com wrote: > Thanks for the review. Do I need another review before pushing? Maybe Dan or Serguei could look at it too. Coleen > > dl > > > On 8/30/16 1:57 PM, Coleen Phillimore wrote: >> >> This change looks good. Thank you for doing the analysis and >> figuring this out. >> Coleen >> >> On 8/30/16 3:24 PM, dean.long at oracle.com wrote: >>> http://cr.openjdk.java.net/~dlong/8156137/webrev/ >>> >>> https://bugs.openjdk.java.net/browse/JDK-8156137 >>> >>> The problem: JVMTI RedefineClasses creates scratch classes to hold >>> the old methods until they can be freed (they are no longer active >>> in a thread stack). G1ConcurrentMark sees these scratch classes, >>> but G1MarkSweep does not. >>> >>> The details: >>> >>> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >>> iterate over *all* classes and calls clean_weak_instanceklass_links >>> to clean. 
>>> >>> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >>> iterate over *live, non-scratch* classes, and calls >>> clean_weak_instanceklass_links to clean. >>> >>> Now the problem scenario is: >>> >>> 0: scratch class S has MethodData that references a class U that is >>> going to be unloaded. >>> >>> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >>> because scratch classes are not added to the class hierarchy tree. >>> The full GC then frees the metadata for class U. Now the MethodData >>> for S contains stale metadata. >>> >>> 2. When a later G1ConcurrentMark calls >>> clean_weak_instanceklass_links on S, it will crash on the stale >>> metadata. >>> >>> Solution: have Klass::clean_weak_klass_links() process these scratch >>> classes that can be found on the "previous versions" list. >>> >>> Tested with bigapps/Kitchensink. >>> >>> dl >>> >> > From frederic.parain at oracle.com Tue Aug 30 23:04:45 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Tue, 30 Aug 2016 19:04:45 -0400 Subject: RFR(S): JDK-8137035 tests got EXCEPTION_STACK_OVERFLOW on Windows 64 bit In-Reply-To: References: <64c09902-a570-1270-0e28-910cff34a139@oracle.com> <1137f725-3f0e-5298-4d7c-1c3bd573b832@oracle.com> Message-ID: <207a6f0b-330d-2034-12b1-b22f20eb01b4@oracle.com> Hi David, On 08/29/2016 11:10 PM, David Holmes wrote: > Hi Fred, >>> While examining the thread state logic in the exception handler I >>> noticed some pre-existing bugs: >>> >>> 2506 if (exception_code == EXCEPTION_ACCESS_VIOLATION) { >>> 2507 JavaThread* thread = (JavaThread*) t; >>> >>> there is no check that t is in fact a JavaThread, or even that t is >>> non-NULL. Such checks occur slightly later: >> >> I've investigated this issue, and it is currently harmless. >> The casted pointer is only used to call a method requiring >> a JavaThread* pointer and the only usage of its argument it's >> a NULL check. 
Unfortunately, fixing this issue would require
>> to modify the prototype of os::is_memory_serialize_page()
>> and propagate the change across all platforms using it.
>> It's a wider scope fix than JDK-8137035.
>
> I was only expecting you to move (if the scoping allows it, else copy)
> the later:

The scoping doesn't allow this move, and adding the test would potentially change the behavior (I'm expecting that only JavaThreads could be blocked on the serialization page, but I might be wrong).

Fred

> if (t != NULL && t->is_Java_thread()) {
>
> check. ;)
>
> Thanks,
> David
>
>> I've added a comment on the unsafe cast in the os_windows.cpp file,
>> highlighting the fact it was unsafe, and explaining why it
>> is currently harmless.
>>
>>>
>>> 2523 if (t != NULL && t->is_Java_thread()) {
>>> 2524 JavaThread* thread = (JavaThread*) t;
>>>

From serguei.spitsyn at oracle.com Wed Aug 31 00:54:23 2016
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 30 Aug 2016 17:54:23 -0700
Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links
In-Reply-To: <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com>
References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com>
Message-ID: <57C62ABF.1010103@oracle.com>

Hi Dean,

It looks good.
Nice catch!

Thanks,
Serguei

On 8/30/16 14:49, Coleen Phillimore wrote:
>
>
> On 8/30/16 5:40 PM, dean.long at oracle.com wrote:
>> Thanks for the review. Do I need another review before pushing?
>
> Maybe Dan or Serguei could look at it too.
>
> Coleen
>
>>
>> dl
>>
>>
>> On 8/30/16 1:57 PM, Coleen Phillimore wrote:
>>>
>>> This change looks good. Thank you for doing the analysis and
>>> figuring this out.
>>> Coleen >>> >>> On 8/30/16 3:24 PM, dean.long at oracle.com wrote: >>>> http://cr.openjdk.java.net/~dlong/8156137/webrev/ >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8156137 >>>> >>>> The problem: JVMTI RedefineClasses creates scratch classes to hold >>>> the old methods until they can be freed (they are no longer active >>>> in a thread stack). G1ConcurrentMark sees these scratch classes, >>>> but G1MarkSweep does not. >>>> >>>> The details: >>>> >>>> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >>>> iterate over *all* classes and calls clean_weak_instanceklass_links >>>> to clean. >>>> >>>> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >>>> iterate over *live, non-scratch* classes, and calls >>>> clean_weak_instanceklass_links to clean. >>>> >>>> Now the problem scenario is: >>>> >>>> 0: scratch class S has MethodData that references a class U that is >>>> going to be unloaded. >>>> >>>> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >>>> because scratch classes are not added to the class hierarchy tree. >>>> The full GC then frees the metadata for class U. Now the MethodData >>>> for S contains stale metadata. >>>> >>>> 2. When a later G1ConcurrentMark calls >>>> clean_weak_instanceklass_links on S, it will crash on the stale >>>> metadata. >>>> >>>> Solution: have Klass::clean_weak_klass_links() process these >>>> scratch classes that can be found on the "previous versions" list. >>>> >>>> Tested with bigapps/Kitchensink. >>>> >>>> dl >>>> >>> >> > From dean.long at oracle.com Wed Aug 31 01:23:39 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 30 Aug 2016 18:23:39 -0700 Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links In-Reply-To: <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com> References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com> Message-ID: Adding hotspot-dev at openjdk.java.net. 
dl On 8/30/16 2:49 PM, Coleen Phillimore wrote: > > > On 8/30/16 5:40 PM, dean.long at oracle.com wrote: >> Thanks for the review. Do I need another review before pushing? > > Maybe Dan or Serguei could look at it too. > > Coleen > >> >> dl >> >> >> On 8/30/16 1:57 PM, Coleen Phillimore wrote: >>> >>> This change looks good. Thank you for doing the analysis and >>> figuring this out. >>> Coleen >>> >>> On 8/30/16 3:24 PM, dean.long at oracle.com wrote: >>>> http://cr.openjdk.java.net/~dlong/8156137/webrev/ >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8156137 >>>> >>>> The problem: JVMTI RedefineClasses creates scratch classes to hold >>>> the old methods until they can be freed (they are no longer active >>>> in a thread stack). G1ConcurrentMark sees these scratch classes, >>>> but G1MarkSweep does not. >>>> >>>> The details: >>>> >>>> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >>>> iterate over *all* classes and calls clean_weak_instanceklass_links >>>> to clean. >>>> >>>> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >>>> iterate over *live, non-scratch* classes, and calls >>>> clean_weak_instanceklass_links to clean. >>>> >>>> Now the problem scenario is: >>>> >>>> 0: scratch class S has MethodData that references a class U that is >>>> going to be unloaded. >>>> >>>> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >>>> because scratch classes are not added to the class hierarchy tree. >>>> The full GC then frees the metadata for class U. Now the MethodData >>>> for S contains stale metadata. >>>> >>>> 2. When a later G1ConcurrentMark calls >>>> clean_weak_instanceklass_links on S, it will crash on the stale >>>> metadata. >>>> >>>> Solution: have Klass::clean_weak_klass_links() process these >>>> scratch classes that can be found on the "previous versions" list. >>>> >>>> Tested with bigapps/Kitchensink. 
>>>> >>>> dl >>>> >>> >> > From dean.long at oracle.com Wed Aug 31 01:25:34 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 30 Aug 2016 18:25:34 -0700 Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links In-Reply-To: <57C62ABF.1010103@oracle.com> References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> <74ecb32c-00f5-4a2e-ee6b-c9b9076edc2e@oracle.com> <57C62ABF.1010103@oracle.com> Message-ID: <4ab7db7f-8741-4597-8d0b-8a62bf91273a@oracle.com> Thanks Serguei! dl On 8/30/16 5:54 PM, serguei.spitsyn at oracle.com wrote: > Hi Dean, > > It looks good. > Nice catch! > > Thanks, > Serguei > > > On 8/30/16 14:49, Coleen Phillimore wrote: >> >> >> On 8/30/16 5:40 PM, dean.long at oracle.com wrote: >>> Thanks for the review. Do I need another review before pushing? >> >> Maybe Dan or Serguei could look at it too. >> >> Coleen >> >>> >>> dl >>> >>> >>> On 8/30/16 1:57 PM, Coleen Phillimore wrote: >>>> >>>> This change looks good. Thank you for doing the analysis and >>>> figuring this out. >>>> Coleen >>>> >>>> On 8/30/16 3:24 PM, dean.long at oracle.com wrote: >>>>> http://cr.openjdk.java.net/~dlong/8156137/webrev/ >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8156137 >>>>> >>>>> The problem: JVMTI RedefineClasses creates scratch classes to hold >>>>> the old methods until they can be freed (they are no longer active >>>>> in a thread stack). G1ConcurrentMark sees these scratch classes, >>>>> but G1MarkSweep does not. >>>>> >>>>> The details: >>>>> >>>>> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >>>>> iterate over *all* classes and calls >>>>> clean_weak_instanceklass_links to clean. >>>>> >>>>> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >>>>> iterate over *live, non-scratch* classes, and calls >>>>> clean_weak_instanceklass_links to clean. 
>>>>> >>>>> Now the problem scenario is: >>>>> >>>>> 0: scratch class S has MethodData that references a class U that >>>>> is going to be unloaded. >>>>> >>>>> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >>>>> because scratch classes are not added to the class hierarchy tree. >>>>> The full GC then frees the metadata for class U. Now the >>>>> MethodData for S contains stale metadata. >>>>> >>>>> 2. When a later G1ConcurrentMark calls >>>>> clean_weak_instanceklass_links on S, it will crash on the stale >>>>> metadata. >>>>> >>>>> Solution: have Klass::clean_weak_klass_links() process these >>>>> scratch classes that can be found on the "previous versions" list. >>>>> >>>>> Tested with bigapps/Kitchensink. >>>>> >>>>> dl >>>>> >>>> >>> >> > From david.holmes at oracle.com Wed Aug 31 06:18:38 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Aug 2016 16:18:38 +1000 Subject: RFR(XXS): 8165014: Unaligned unsafe access should throw InternalError on Solaris In-Reply-To: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap> References: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap> Message-ID: <89235bd2-0920-5ee1-5d65-0e86d2acf823@oracle.com> Hi Martin, On 30/08/2016 8:42 PM, Doerr, Martin wrote: > Hi, > > as discussed in http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-August/020888.html, > the signal handler on Solaris SPARC should throw a java.lang.InternalError when Unsafe accesses unaligned addresses. > This is currently broken because the signal handler only accepts BUS_OBJERR which is not sufficient. > > Proposed fix: > http://cr.openjdk.java.net/~mdoerr/8165014_sparcSIGBUS/webrev.00/ > > Please review. I will also need a sponsor. I can sponsor this for you. I was unsure whether we should in fact cater for unaligned accesses, but causing InternalError to be thrown instead of crashing is somewhat cleaner. Reviewed. 
Thanks,
David

> Best regards,
> Martin
>
>

From marcus.larsson at oracle.com Wed Aug 31 07:48:32 2016
From: marcus.larsson at oracle.com (Marcus Larsson)
Date: Wed, 31 Aug 2016 09:48:32 +0200
Subject: RFR(XS): 8164939: GTest LogDecorations.iso8601_time_test fails on macOS
Message-ID: <0427bed2-6a5d-3d94-4b67-29d4a92fa418@oracle.com>

Hi,

Please review the following patch fixing a daylight savings issue with
mktime/localtime in the unit tests on Mac OSX. Avoiding the use of
'daylight' and instead letting mktime handle DST seems to solve the
issue, and still works as intended on all the other platforms.

Webrev: http://cr.openjdk.java.net/~mlarsson/8164939/webrev.00/

Issue: https://bugs.openjdk.java.net/browse/JDK-8164939

Testing: Manually verified on a previously failing host. Unit tests
through JPRT.

Thanks,
Marcus

From martin.doerr at sap.com Wed Aug 31 09:06:26 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 31 Aug 2016 09:06:26 +0000
Subject: RFR(XXS): 8165014: Unaligned unsafe access should throw InternalError on Solaris
In-Reply-To: <89235bd2-0920-5ee1-5d65-0e86d2acf823@oracle.com>
References: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap> <89235bd2-0920-5ee1-5d65-0e86d2acf823@oracle.com>
Message-ID: <702a5f409df74419ad9353e6a8a69539@DEWDFE13DE14.global.corp.sap>

Hi David,

thank you very much for the review and for sponsoring.

Best regards,
Martin

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Wednesday, 31 August 2016 08:19
To: Doerr, Martin; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(XXS): 8165014: Unaligned unsafe access should throw InternalError on Solaris

Hi Martin,

On 30/08/2016 8:42 PM, Doerr, Martin wrote:
> Hi,
>
> as discussed in http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-August/020888.html,
> the signal handler on Solaris SPARC should throw a java.lang.InternalError when Unsafe accesses unaligned addresses.
> This is currently broken because the signal handler only accepts BUS_OBJERR which is not sufficient. > > Proposed fix: > http://cr.openjdk.java.net/~mdoerr/8165014_sparcSIGBUS/webrev.00/ > > Please review. I will also need a sponsor. I can sponsor this for you. I was unsure whether we should in fact cater for unaligned accesses, but causing InternalError to be thrown instead of crashing is somewhat cleaner. Reviewed. Thanks, David > Best regards, > Martin > > From coleen.phillimore at oracle.com Wed Aug 31 12:20:29 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 31 Aug 2016 08:20:29 -0400 Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module... In-Reply-To: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> Message-ID: <83a2076f-8356-e86f-de25-9b8223008d7a@oracle.com> Harold, This looks good to me. Thank you for making this change. Coleen On 8/30/16 1:47 PM, harold seigel wrote: > Hi, > > Please review this fix for JDK-8162412. This fix allows user > properties that start with "-Djdk.module." unless they match any of > the seven reserved system properties as follows: > > The JVM will ignore -D, -D.[*], and > -D=[*] where is any one of these seven: > > jdk.module.addmods > jdk.module.limitmods > jdk.module.addexports > jdk.module.addreads > jdk.module.patch > jdk.module.path > jdk.module.upgrade.path > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 > > Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ > > The fix was tested with the JCK Lang and VM tests, the hotpot, and > java/lang, java/util and other JTreg tests, and the NSK non-colocated > quick tests. > > Thanks, Harold From dmitry.dmitriev at oracle.com Wed Aug 31 12:25:10 2016 From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev) Date: Wed, 31 Aug 2016 15:25:10 +0300 Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module... 
In-Reply-To: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
Message-ID:

Hello Harold,

I think that it will be great to have at least one test case which
verifies that VM prints warning when property is specified in the
source which is different from the command line, e.g. when property
specified in _JAVA_OPTIONS environment variable.
runtime/logging/ExceptionsTest.java has test cases for environment
variables, I think you can do a similar thing. Thank you!

Dmitry

On 30.08.2016 20:47, harold seigel wrote:
> Hi,
>
> Please review this fix for JDK-8162412. This fix allows user
> properties that start with "-Djdk.module." unless they match any of
> the seven reserved system properties as follows:
>
> The JVM will ignore -D, -D.[*], and
> -D=[*] where is any one of these seven:
>
> jdk.module.addmods
> jdk.module.limitmods
> jdk.module.addexports
> jdk.module.addreads
> jdk.module.patch
> jdk.module.path
> jdk.module.upgrade.path
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412
>
> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/
>
> The fix was tested with the JCK Lang and VM tests, the hotspot, and
> java/lang, java/util and other JTreg tests, and the NSK non-colocated
> quick tests.
>
> Thanks, Harold

From harold.seigel at oracle.com Wed Aug 31 12:26:13 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Wed, 31 Aug 2016 08:26:13 -0400
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To: <83a2076f-8356-e86f-de25-9b8223008d7a@oracle.com>
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> <83a2076f-8356-e86f-de25-9b8223008d7a@oracle.com>
Message-ID: <9450dbd7-7dea-1c25-02ab-c460fb60bafd@oracle.com>

Thanks Coleen!

Harold

On 8/31/2016 8:20 AM, Coleen Phillimore wrote:
>
> Harold,
> This looks good to me.
> Thank you for making this change.
> Coleen
>
>
> On 8/30/16 1:47 PM, harold seigel wrote:
>> Hi,
>>
>> Please review this fix for JDK-8162412. This fix allows user
>> properties that start with "-Djdk.module." unless they match any of
>> the seven reserved system properties as follows:
>>
>> The JVM will ignore -D, -D.[*], and
>> -D=[*] where is any one of these seven:
>>
>> jdk.module.addmods
>> jdk.module.limitmods
>> jdk.module.addexports
>> jdk.module.addreads
>> jdk.module.patch
>> jdk.module.path
>> jdk.module.upgrade.path
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412
>>
>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/
>>
>> The fix was tested with the JCK Lang and VM tests, the hotspot, and
>> java/lang, java/util and other JTreg tests, and the NSK non-colocated
>> quick tests.
>>
>> Thanks, Harold
>

From Alan.Burlison at oracle.com Wed Aug 31 12:42:39 2016
From: Alan.Burlison at oracle.com (Alan Burlison)
Date: Wed, 31 Aug 2016 13:42:39 +0100
Subject: UnsafeAtomicityTest crashes on SPARC
In-Reply-To:
References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap>
Message-ID:

On 30/08/2016 11:38, David Holmes wrote:

>> We have removed "&& info->si_code == BUS_OBJERR" in our JVM. It should
>> also catch other types like BUS_ADRALN.
>
> I'm not yet convinced that we should be anticipating/supporting
> unaligned access from Unsafe. But I'll look into this more closely later
> this week.

Hadoop makes misaligned accesses using Unsafe all over the place.

https://issues.apache.org/jira/browse/HADOOP-12630
https://issues.apache.org/jira/browse/HADOOP-12720

That's because the relevant Hadoop code is a pile and assumes it
will only ever run on architectures that support misaligned accesses.
As it is using Unsafe to do so there is a case to be made that they are
getting what they deserve, but on the other hand it means that they've
created a Java app that will not run on all the platforms that Java
supports.

I know Unsafe is going away, I don't know if the replacement will
prevent or allow similar misaligned accesses.

It might be useful if there was a run-time check that you could turn on
that checked for misaligned accesses, that way people writing such code
would at least have some way of figuring out if they have shot
themselves in the foot with regards to cross-platform portability.

--
Alan Burlison
--

From harold.seigel at oracle.com Wed Aug 31 13:02:30 2016
From: harold.seigel at oracle.com (harold seigel)
Date: Wed, 31 Aug 2016 09:02:30 -0400
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To:
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
Message-ID:

Hi Dmitry,

I added an additional test case to the end of test
ModuleOptionsWarn.java that tests for a warning if a module-related
property is specified using _JAVA_OPTIONS. Could you review it?

Thanks, Harold

On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote:
> Hello Harold,
>
> I think that it will be great to have at least one test case which
> verifies that VM prints warning when property is specified in the
> source which is different from the command line, e.g. when property
> specified in _JAVA_OPTIONS environment variable.
> runtime/logging/ExceptionsTest.java has test cases for environment
> variables, I think you can do a similar thing. Thank you!
>
> Dmitry
>
> On 30.08.2016 20:47, harold seigel wrote:
>> Hi,
>>
>> Please review this fix for JDK-8162412. This fix allows user
>> properties that start with "-Djdk.module."
>> unless they match any of
>> the seven reserved system properties as follows:
>>
>> The JVM will ignore -D, -D.[*], and
>> -D=[*] where is any one of these seven:
>>
>> jdk.module.addmods
>> jdk.module.limitmods
>> jdk.module.addexports
>> jdk.module.addreads
>> jdk.module.patch
>> jdk.module.path
>> jdk.module.upgrade.path
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412
>>
>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/
>>
>> The fix was tested with the JCK Lang and VM tests, the hotspot, and
>> java/lang, java/util and other JTreg tests, and the NSK non-colocated
>> quick tests.
>>
>> Thanks, Harold
>

From dmitry.dmitriev at oracle.com Wed Aug 31 13:05:57 2016
From: dmitry.dmitriev at oracle.com (Dmitry Dmitriev)
Date: Wed, 31 Aug 2016 16:05:57 +0300
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To:
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
Message-ID: <137dbd57-7916-fc81-577b-a34eb6463a32@oracle.com>

Harold,

I've looked at the following webrev and it looks good:
http://cr.openjdk.java.net/~hseigel/bug_8162412.2/

Thank you,
Dmitry

On 31.08.2016 16:02, harold seigel wrote:
> Hi Dmitry,
>
> I added an additional test case to the end of test
> ModuleOptionsWarn.java that tests for a warning if a module related
> property is specified using _JAVA_OPTIONS. Could you review it?
>
> Thanks, Harold
>
>
> On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote:
>> Hello Harold,
>>
>> I think that it will be great to have at least one test case which
>> verifies that VM prints warning when property is specified in the
>> source which is different from the command line, e.g. when property
>> specified in _JAVA_OPTIONS environment variable.
>> runtime/logging/ExceptionsTest.java has test cases for environment
>> variables, I think you can do a similar thing. Thank you!
>>
>> Dmitry
>>
>> On 30.08.2016 20:47, harold seigel wrote:
>>> Hi,
>>>
>>> Please review this fix for JDK-8162412.
This fix allows user >>> properties that start with "-Djdk.module." unless they match any of >>> the seven reserved system properties as follows: >>> >>> The JVM will ignore -D, -D.[*], and >>> -D=[*] where is any one of these seven: >>> >>> jdk.module.addmods >>> jdk.module.limitmods >>> jdk.module.addexports >>> jdk.module.addreads >>> jdk.module.patch >>> jdk.module.path >>> jdk.module.upgrade.path >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 >>> >>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ >>> >>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>> java/lang, java/util and other JTreg tests, and the NSK >>> non-colocated quick tests. >>> >>> Thanks, Harold >> > From harold.seigel at oracle.com Wed Aug 31 13:06:00 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 31 Aug 2016 09:06:00 -0400 Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module... In-Reply-To: References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> Message-ID: The new webrev is: http://cr.openjdk.java.net/~hseigel/bug_8162412.2/ Harold On 8/31/2016 9:02 AM, harold seigel wrote: > Hi Dmitry, > > I added an additional test case to the end of test > ModuleOptionsWarn.java that tests for a warning if a module related > property is specified using _JAVA_OPTIONS. Could you review it? > > Thanks, Harold > > > On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote: >> Hello Harold, >> >> I think that it will be great to have at least one test case which >> verifies that VM prints warning when property is specified in the >> source which is different from the command line, e.g. when property >> specified in _JAVA_OPTIONS environment variable. >> runtime/logging/ExceptionsTest.java have test cases for environment >> variables, I think you can done similar thing. Thank you! >> >> Dmitry >> >> On 30.08.2016 20:47, harold seigel wrote: >>> Hi, >>> >>> Please review this fix for JDK-8162412. 
This fix allows user >>> properties that start with "-Djdk.module." unless they match any of >>> the seven reserved system properties as follows: >>> >>> The JVM will ignore -D, -D.[*], and >>> -D=[*] where is any one of these seven: >>> >>> jdk.module.addmods >>> jdk.module.limitmods >>> jdk.module.addexports >>> jdk.module.addreads >>> jdk.module.patch >>> jdk.module.path >>> jdk.module.upgrade.path >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 >>> >>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ >>> >>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>> java/lang, java/util and other JTreg tests, and the NSK >>> non-colocated quick tests. >>> >>> Thanks, Harold >> > From harold.seigel at oracle.com Wed Aug 31 13:06:34 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 31 Aug 2016 09:06:34 -0400 Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module... In-Reply-To: <137dbd57-7916-fc81-577b-a34eb6463a32@oracle.com> References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> <137dbd57-7916-fc81-577b-a34eb6463a32@oracle.com> Message-ID: Thanks! Harold On 8/31/2016 9:05 AM, Dmitry Dmitriev wrote: > Harold, > > I've looked into following webrev and it's looks good: > http://cr.openjdk.java.net/~hseigel/bug_8162412.2/ > > > Thank you, > Dmitry > > On 31.08.2016 16:02, harold seigel wrote: >> Hi Dmitry, >> >> I added an additional test case to the end of test >> ModuleOptionsWarn.java that tests for a warning if a module related >> property is specified using _JAVA_OPTIONS. Could you review it? >> >> Thanks, Harold >> >> >> On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote: >>> Hello Harold, >>> >>> I think that it will be great to have at least one test case which >>> verifies that VM prints warning when property is specified in the >>> source which is different from the command line, e.g. when property >>> specified in _JAVA_OPTIONS environment variable. 
>>> runtime/logging/ExceptionsTest.java have test cases for environment >>> variables, I think you can done similar thing. Thank you! >>> >>> Dmitry >>> >>> On 30.08.2016 20:47, harold seigel wrote: >>>> Hi, >>>> >>>> Please review this fix for JDK-8162412. This fix allows user >>>> properties that start with "-Djdk.module." unless they match any of >>>> the seven reserved system properties as follows: >>>> >>>> The JVM will ignore -D, -D.[*], and >>>> -D=[*] where is any one of these seven: >>>> >>>> jdk.module.addmods >>>> jdk.module.limitmods >>>> jdk.module.addexports >>>> jdk.module.addreads >>>> jdk.module.patch >>>> jdk.module.path >>>> jdk.module.upgrade.path >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 >>>> >>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ >>>> >>>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>>> java/lang, java/util and other JTreg tests, and the NSK >>>> non-colocated quick tests. >>>> >>>> Thanks, Harold >>> >> > From aph at redhat.com Wed Aug 31 13:48:48 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 31 Aug 2016 14:48:48 +0100 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> Message-ID: <510bb079-c374-21d8-d879-56c9c51bd877@redhat.com> On 31/08/16 13:42, Alan Burlison wrote: > That's because the relevant Hadoop code is a pile and assumes it > will only ever run on architectures that support misaligned > accesses. As it is using Unsafe to do so there is a case to be made > that they are getting what they deserve, but on the other hand it > means that they've created a Java app that will not run on all the > platforms that Java supports. > > I know Unsafe is going away, I don't know if the replacement will > prevent or allow similar misaligned accesses. 
This is just a lexicographic comparison in Hadoop. We do have
misaligned accesses in the new Unsafe, but it's internal. But I would
have thought the Hadoop code should use Arrays.compare.

Andrew.

From george.triantafillou at oracle.com Wed Aug 31 13:59:06 2016
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Wed, 31 Aug 2016 09:59:06 -0400
Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module...
In-Reply-To:
References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com>
Message-ID: <3be93d6a-e4c2-b0bc-2948-3f0d854dc650@oracle.com>

Hi Harold,

Looks good!

-George

On 8/31/2016 9:06 AM, harold seigel wrote:
> The new webrev is: http://cr.openjdk.java.net/~hseigel/bug_8162412.2/
>
> Harold
>
>
> On 8/31/2016 9:02 AM, harold seigel wrote:
>> Hi Dmitry,
>>
>> I added an additional test case to the end of test
>> ModuleOptionsWarn.java that tests for a warning if a module related
>> property is specified using _JAVA_OPTIONS. Could you review it?
>>
>> Thanks, Harold
>>
>>
>> On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote:
>>> Hello Harold,
>>>
>>> I think that it will be great to have at least one test case which
>>> verifies that VM prints warning when property is specified in the
>>> source which is different from the command line, e.g. when property
>>> specified in _JAVA_OPTIONS environment variable.
>>> runtime/logging/ExceptionsTest.java has test cases for environment
>>> variables, I think you can do a similar thing. Thank you!
>>>
>>> Dmitry
>>>
>>> On 30.08.2016 20:47, harold seigel wrote:
>>>> Hi,
>>>>
>>>> Please review this fix for JDK-8162412. This fix allows user
>>>> properties that start with "-Djdk.module."
unless they match any of >>>> the seven reserved system properties as follows: >>>> >>>> The JVM will ignore -D, -D.[*], and >>>> -D=[*] where is any one of these seven: >>>> >>>> jdk.module.addmods >>>> jdk.module.limitmods >>>> jdk.module.addexports >>>> jdk.module.addreads >>>> jdk.module.patch >>>> jdk.module.path >>>> jdk.module.upgrade.path >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 >>>> >>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ >>>> >>>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>>> java/lang, java/util and other JTreg tests, and the NSK >>>> non-colocated quick tests. >>>> >>>> Thanks, Harold >>> >> > From harold.seigel at oracle.com Wed Aug 31 15:30:47 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 31 Aug 2016 11:30:47 -0400 Subject: RFR(S): 8162412: Ignore any System property specified as -Djdk.module... In-Reply-To: <3be93d6a-e4c2-b0bc-2948-3f0d854dc650@oracle.com> References: <71e5e74f-cb50-0a31-7784-7468aede5a09@oracle.com> <3be93d6a-e4c2-b0bc-2948-3f0d854dc650@oracle.com> Message-ID: <04972729-777c-3e83-9125-57ad95c4c658@oracle.com> Thanks George. Harold On 8/31/2016 9:59 AM, George Triantafillou wrote: > Hi Harold, > > Looks good! > > -George > > On 8/31/2016 9:06 AM, harold seigel wrote: >> The new webrev is: http://cr.openjdk.java.net/~hseigel/bug_8162412.2/ >> >> Harold >> >> >> On 8/31/2016 9:02 AM, harold seigel wrote: >>> Hi Dmitry, >>> >>> I added an additional test case to the end of test >>> ModuleOptionsWarn.java that tests for a warning if a module related >>> property is specified using _JAVA_OPTIONS. Could you review it? >>> >>> Thanks, Harold >>> >>> >>> On 8/31/2016 8:25 AM, Dmitry Dmitriev wrote: >>>> Hello Harold, >>>> >>>> I think that it will be great to have at least one test case which >>>> verifies that VM prints warning when property is specified in the >>>> source which is different from the command line, e.g. 
when property >>>> specified in _JAVA_OPTIONS environment variable. >>>> runtime/logging/ExceptionsTest.java have test cases for environment >>>> variables, I think you can done similar thing. Thank you! >>>> >>>> Dmitry >>>> >>>> On 30.08.2016 20:47, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this fix for JDK-8162412. This fix allows user >>>>> properties that start with "-Djdk.module." unless they match any >>>>> of the seven reserved system properties as follows: >>>>> >>>>> The JVM will ignore -D, -D.[*], and >>>>> -D=[*] where is any one of these seven: >>>>> >>>>> jdk.module.addmods >>>>> jdk.module.limitmods >>>>> jdk.module.addexports >>>>> jdk.module.addreads >>>>> jdk.module.patch >>>>> jdk.module.path >>>>> jdk.module.upgrade.path >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8162412 >>>>> >>>>> Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8162412/ >>>>> >>>>> The fix was tested with the JCK Lang and VM tests, the hotpot, and >>>>> java/lang, java/util and other JTreg tests, and the NSK >>>>> non-colocated quick tests. >>>>> >>>>> Thanks, Harold >>>> >>> >> > From Alan.Burlison at oracle.com Wed Aug 31 16:30:36 2016 From: Alan.Burlison at oracle.com (Alan Burlison) Date: Wed, 31 Aug 2016 17:30:36 +0100 Subject: UnsafeAtomicityTest crashes on SPARC In-Reply-To: <510bb079-c374-21d8-d879-56c9c51bd877@redhat.com> References: <7C9B87B351A4BA4AA9EC95BB418116567226DD5A@DEWDFEMB19C.global.corp.sap> <57C52C9F.40301@oracle.com> <96fc93cf0e6b45e3a83fba2c631fca00@DEWDFE13DE14.global.corp.sap> <510bb079-c374-21d8-d879-56c9c51bd877@redhat.com> Message-ID: On 31/08/2016 14:48, Andrew Haley wrote: > This is just a lexicographic comparison in Hadoop. We do have > misaligned accesses in the new Unsafe, but it's internal. But I would > have thought the Hadoop code should use Arrays.compare. Yes, it probably should but they are focused on squeezing every last drop of performance out wherever they can. 
The misaligned 64-bit word hack for lexicographic comparisons makes
things a bit faster on the Java versions and platforms they concentrate
on, which is why they've done it. For Java9 that's no longer true and
pure-Java is the fastest way, but of course they aren't using that yet -
in fact they have only moved up to Java7 fairly recently and are
currently working on Java8 support:
https://issues.apache.org/jira/browse/HADOOP-11090

--
Alan Burlison
--

From daniel.daugherty at oracle.com Wed Aug 31 17:38:53 2016
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 31 Aug 2016 11:38:53 -0600
Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links
In-Reply-To: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com>
References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com>
Message-ID: <68a6ecf1-acd7-662c-ef87-e7d7d84fc609@oracle.com>

On 8/30/16 1:24 PM, dean.long at oracle.com wrote:
> http://cr.openjdk.java.net/~dlong/8156137/webrev/

src/share/vm/oops/klass.cpp
    No comments.

Thumbs up!

Long and probably frustrating hunt for this one.
Good job on catching it!

Dan

>
> https://bugs.openjdk.java.net/browse/JDK-8156137
>
> The problem: JVMTI RedefineClasses creates scratch classes to hold the
> old methods until they can be freed (they are no longer active in a
> thread stack). G1ConcurrentMark sees these scratch classes, but
> G1MarkSweep does not.
>
> The details:
>
> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to
> iterate over *all* classes and calls clean_weak_instanceklass_links to
> clean.
>
> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to
> iterate over *live, non-scratch* classes, and calls
> clean_weak_instanceklass_links to clean.
>
> Now the problem scenario is:
>
> 0: scratch class S has MethodData that references a class U that is
> going to be unloaded.
> > 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, because > scratch classes are not added to the class hierarchy tree. The full GC > then frees the metadata for class U. Now the MethodData for S contains > stale metadata. > > 2. When a later G1ConcurrentMark calls clean_weak_instanceklass_links > on S, it will crash on the stale metadata. > > Solution: have Klass::clean_weak_klass_links() process these scratch > classes that can be found on the "previous versions" list. > > Tested with bigapps/Kitchensink. > > dl > > From harold.seigel at oracle.com Wed Aug 31 18:21:43 2016 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 31 Aug 2016 14:21:43 -0400 Subject: RFR(S) 8149607: [Verifier] Do not verify pop, pop2, swap, dup* against top Message-ID: <36406e08-7fca-81e3-6f3a-e6df0060ed9e@oracle.com> Hi, Please review this verifier fix to disallow popping a single operand of type 'top' from the stack in case it's the upper half of a long or double operand. See the Java Virtual Machine Spec for information about type 'top' operands. JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8149607 Open webrev: http://cr.openjdk.java.net/~hseigel/bug_8149607/ The fix was tested with the JCK Lang and VM tests, the hotspot, and java/lang, java/util and other JTreg tests, and the NSK colocated and non-colocated quick tests. Thanks, Harold From dean.long at oracle.com Wed Aug 31 19:08:22 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 31 Aug 2016 12:08:22 -0700 Subject: RFR(S) 8156137: SIGSEGV in ReceiverTypeData::clean_weak_klass_links In-Reply-To: <68a6ecf1-acd7-662c-ef87-e7d7d84fc609@oracle.com> References: <0cf5b839-e251-c363-207f-d709e76c4070@oracle.com> <68a6ecf1-acd7-662c-ef87-e7d7d84fc609@oracle.com> Message-ID: <0a30b71b-f1b1-c52f-fb44-996661d2bf8b@oracle.com> Thanks Dan! dl On 8/31/16 10:38 AM, Daniel D.
Daugherty wrote: > On 8/30/16 1:24 PM, dean.long at oracle.com wrote: >> http://cr.openjdk.java.net/~dlong/8156137/webrev/ > > src/share/vm/oops/klass.cpp > No comments. > > Thumbs up! > > Long and probably frustrating hunt for this one. > Good job on catching it! > > Dan > > >> >> https://bugs.openjdk.java.net/browse/JDK-8156137 >> >> The problem: JVMTI RedefineClasses creates scratch classes to hold >> the old methods until they can be freed (they are no longer active in >> a thread stack). G1ConcurrentMark sees these scratch classes, but >> G1MarkSweep does not. >> >> The details: >> >> G1ConcurrentMark uses ClassLoaderDataGraphKlassIteratorAtomic to >> iterate over *all* classes and calls clean_weak_instanceklass_links >> to clean. >> >> G1MarkSweep (and other GCs) use Klass::clean_weak_klass_links() to >> iterate over *live, non-scratch* classes, and calls >> clean_weak_instanceklass_links to clean. >> >> Now the problem scenario is: >> >> 0: scratch class S has MethodData that references a class U that is >> going to be unloaded. >> >> 1: G1MarkSweep skips class S in Klass::clean_weak_klass_links, >> because scratch classes are not added to the class hierarchy tree. >> The full GC then frees the metadata for class U. Now the MethodData >> for S contains stale metadata. >> >> 2. When a later G1ConcurrentMark calls clean_weak_instanceklass_links >> on S, it will crash on the stale metadata. >> >> Solution: have Klass::clean_weak_klass_links() process these scratch >> classes that can be found on the "previous versions" list. >> >> Tested with bigapps/Kitchensink. >> >> dl >> >> > From staffan.larsen at oracle.com Wed Aug 31 19:14:52 2016 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 31 Aug 2016 21:14:52 +0200 Subject: RFR(XS): 8164939: GTest LogDecorations.iso8601_time_test fails on macOS In-Reply-To: <0427bed2-6a5d-3d94-4b67-29d4a92fa418@oracle.com> References: <0427bed2-6a5d-3d94-4b67-29d4a92fa418@oracle.com> Message-ID: Looks good! 
Thanks, /Staffan > On 31 aug. 2016, at 09:48, Marcus Larsson wrote: > > Hi, > > Please review the following patch fixing a daylight savings issue with mktime/localtime in the unit tests on Mac OSX. Avoiding the use of 'daylight' and instead letting mktime handle DST seems to solve the issue, and still works as intended on all the other platforms. > > Webrev: > http://cr.openjdk.java.net/~mlarsson/8164939/webrev.00/ > > Issue: > https://bugs.openjdk.java.net/browse/JDK-8164939 > > Testing: > Manually verified on a previously failing host. Unit tests through JPRT. > > Thanks, > Marcus From coleen.phillimore at oracle.com Wed Aug 31 23:36:36 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 31 Aug 2016 19:36:36 -0400 Subject: RFR(XXS): 8165014: Unaligned unsafe access should throw InternalError on Solaris In-Reply-To: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap> References: <7d8057d9e08d4448aeaab6d35bf3768b@DEWDFE13DE14.global.corp.sap> Message-ID: This seems fine. It's the only platform that also tests for BUS_OBJERR, not commented out. Thanks, Coleen On 8/30/16 6:42 AM, Doerr, Martin wrote: > Hi, > > as discussed in http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-August/020888.html, > the signal handler on Solaris SPARC should throw a java.lang.InternalError when Unsafe accesses unaligned addresses. > This is currently broken because the signal handler only accepts BUS_OBJERR which is not sufficient. > > Proposed fix: > http://cr.openjdk.java.net/~mdoerr/8165014_sparcSIGBUS/webrev.00/ > > Please review. I will also need a sponsor. > > Best regards, > Martin > >
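For readers following the 8165014 thread above, the kind of check being discussed can be sketched roughly as follows. This is only an illustration and not the HotSpot signal handler or the code in the webrev: the helper name is hypothetical, and it assumes the misaligned Unsafe access is reported as SIGBUS with si_code BUS_ADRALN (the POSIX "invalid address alignment" code), in addition to the BUS_OBJERR case that is already accepted.

```cpp
#include <signal.h>

// Hypothetical helper for illustration only; it is not taken from HotSpot.
// It decides whether a signal/si_code pair is a candidate for being
// reported as a faulting Unsafe access.
// BUS_OBJERR: object-specific hardware error (the case already accepted).
// BUS_ADRALN: invalid address alignment (assumed to be the si_code that a
//             misaligned access produces on SPARC).
static bool is_candidate_unsafe_fault(int sig, int si_code) {
  if (sig != SIGBUS) {
    return false;  // only bus errors are of interest here
  }
  return si_code == BUS_OBJERR || si_code == BUS_ADRALN;
}
```

A real handler would additionally have to confirm that the faulting thread was executing an Unsafe access before raising java.lang.InternalError; that part is left out of this sketch.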