From ivan.gerasimov at oracle.com Wed Oct 1 09:07:40 2014 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Wed, 01 Oct 2014 13:07:40 +0400 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] Message-ID: <542BC45C.8080408@oracle.com> Hello! The tests that continue to fail with wrong exit codes suggest that the fix for JDK-8057744 wasn't sufficient. Here's another proposal, which expands the synchronized portion of the code. It is proposed to make the exiting process wait for the threads that have already started exiting. This should help to make sure that no thread is executing any potentially racy code concurrently with the exiting process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ Comments and suggestions are welcome! Sincerely yours, Ivan From attila.szegedi at oracle.com Thu Oct 2 14:47:49 2014 From: attila.szegedi at oracle.com (Attila Szegedi) Date: Thu, 2 Oct 2014 16:47:49 +0200 Subject: Fwd: Nashorn and JVMTI References: <2695ADB3-5D95-485E-9D7F-0090256079AE@oracle.com> Message-ID: <21F36CFC-174E-4C77-A6CE-D1555B0EDB86@oracle.com> Folks, I'm forwarding a message from nashorn-dev (as well as my initial reply); there seems to be an issue with trying to profile Nashorn using JVMTI through the NetBeans profiler. If anyone has any insight into this, it'd be appreciated. Thanks, Attila. Begin forwarded message: > From: Attila Szegedi > Subject: Re: Nashorn and JVMTI > Date: October 2, 2014 at 4:39:07 PM GMT+2 > To: "David P. Caldwell" > Cc: nashorn-dev > > No, we don't do anything to conceal Nashorn's internals. Granted, most of it lives in jdk.internal and jdk.nashorn.internal, which are designated as restricted packages, but that shouldn't stop a debugger from looking into them. We often use jvisualvm ourselves to investigate Nashorn performance. 
> > I often tell people that one of the benefits of running anything (including JavaScript) on the JVM is monitoring and management, so I'd definitely be against obscured visibility into the runtime. > > I'll try to make NetBeans and JVMTI people aware of this. > > Attila. > > On Sep 25, 2014, at 11:13 PM, David P. Caldwell wrote: > >> Team, >> >> When I attempt to connect the NetBeans profiler (which I understand to >> be essentially the same as jvisualvm) to a Nashorn embedding, I get an >> error (JVMTI error 62) for essentially every class that relates to >> scripting, including the dynalink stuff and Nashorn itself, as well as >> generated classes named NashornJavaAdapter. >> >> If I persist through all of this (or filter them out of being >> profiled, or turn on -Xverify:none), I end up with profiling data that >> doesn't involve the JavaScript code at all; it basically treats the >> call to eval() as atomic. >> >> Do you guys do this stuff? My customers are constantly objecting to >> the "fact" that running Java on the JVM is going to be a terrible >> idea, performance-wise -- especially compared to Node, which they >> believe is lightning fast -- and I am having difficulty generating >> data on this point. >> >> More generally, of course, profiling is a normal and necessary >> development activity. I wrote a Java agent for Rhino that mangled the >> classes using Javassist to wrap all JavaScript function invocations in >> instrumented methods, but I'm not clear on (a) whether that's >> necessary, or (b) how it would work given the Nashorn implementation >> is probably using constructs I don't yet understand. But if that's the >> route, let me know and give me a pointer or two and I'll be on my way. >> >> -- David P. Caldwell >> http://www.davidpcaldwell.com/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tomas.hurka at googlemail.com Thu Oct 2 16:01:42 2014 From: tomas.hurka at googlemail.com (Tomas Hurka) Date: Thu, 2 Oct 2014 18:01:42 +0200 Subject: Nashorn and JVMTI In-Reply-To: <21F36CFC-174E-4C77-A6CE-D1555B0EDB86@oracle.com> References: <2695ADB3-5D95-485E-9D7F-0090256079AE@oracle.com> <21F36CFC-174E-4C77-A6CE-D1555B0EDB86@oracle.com> Message-ID: <3ADB9B41-32BA-4B37-A411-34CA58A9DBB7@googlemail.com> Hi Attila, JVMTI error 62 is caused by bug in JDK. See -- Tomas Hurka NetBeans Profiler http://profiler.netbeans.org VisualVM http://visualvm.java.net Software Developer Oracle, Praha Czech Republic On 2 Oct 2014, at 16:47, Attila Szegedi wrote: > Folks, > > I'm forwarding a message from nashorn-dev (as well as my initial reply); there seems to be an issue with trying to profile Nashorn using JVMTI through NetBeans profiler. If anyone has any insight into this, it'd be appreciated. > > Thanks, > Attila. > > > Begin forwarded message: > >> From: Attila Szegedi >> Subject: Re: Nashorn and JVMTI >> Date: October 2, 2014 at 4:39:07 PM GMT+2 >> To: "David P. Caldwell" >> Cc: nashorn-dev >> >> No, we don't do anything to conceal Nashorn's internals. Granted, most of it lives in jdk.internal and jdk.nashorn.internal that are designated as restricted packages, but that shouldn't stop a debugger from looking into them. We often use jvisualvm ourselves to investigate Nashorn performance. >> >> I often tell people that one of the benefits of running anything (including JavaScript) on the JVM is monitoring and management, so I'd definitely be against obscured visibility into the runtime. >> >> I'll try to make NetBeans and JVMTI people aware of this. >> >> Attila. >> >> On Sep 25, 2014, at 11:13 PM, David P. 
Caldwell wrote: >> >>> Team, >>> >>> When I attempt to connect the NetBeans profiler (which I understand to >>> be essentially the same as jvisualvm) to a Nashorn embedding, I get an >>> error (JVMTI error 62) for essentially every class that relates to >>> scripting, including the dynalink stuff and Nashorn itself, as well as >>> generated classes named NashornJavaAdapter. >>> >>> If I persist through all of this (or filter them out of being >>> profiled, or turn on -Xverify:none), I end up with profiling data that >>> doesn't involve the JavaScript code at all; it basically treats the >>> call to eval() as atomic. >>> >>> Do you guys do this stuff? My customers are constantly objecting to >>> the "fact" that running Java on the JVM is going to be a terrible >>> idea, performance-wise -- especially compared to Node, which they >>> believe is lightning fast -- and I am having difficulty generating >>> data on this point. >>> >>> More generally, of course, profiling is a normal and necessary >>> development activity. I wrote a Java agent for Rhino that mangled the >>> classes using Javassist to wrap all JavaScript function invocations in >>> instrumented methods, but I'm not clear on (a) whether that's >>> necessary, or (b) how it would work given the Nashorn implementation >>> is probably using constructs I don't yet understand. But if that's the >>> route, let me know and give me a pointer or two and I'll be on my way. >>> >>> -- David P. Caldwell >>> http://www.davidpcaldwell.com/ >> > From michel.trudeau at oracle.com Thu Oct 2 16:51:25 2014 From: michel.trudeau at oracle.com (Michel Trudeau) Date: Thu, 02 Oct 2014 09:51:25 -0700 Subject: Nashorn and JVMTI In-Reply-To: <3ADB9B41-32BA-4B37-A411-34CA58A9DBB7@googlemail.com> References: <2695ADB3-5D95-485E-9D7F-0090256079AE@oracle.com> <21F36CFC-174E-4C77-A6CE-D1555B0EDB86@oracle.com> <3ADB9B41-32BA-4B37-A411-34CA58A9DBB7@googlemail.com> Message-ID: <542D828D.1080407@oracle.com> An HTML attachment was scrubbed... 
URL: From attila.szegedi at oracle.com Thu Oct 2 17:06:41 2014 From: attila.szegedi at oracle.com (Attila Szegedi) Date: Thu, 2 Oct 2014 19:06:41 +0200 Subject: Nashorn and JVMTI In-Reply-To: <542D828D.1080407@oracle.com> References: <2695ADB3-5D95-485E-9D7F-0090256079AE@oracle.com> <21F36CFC-174E-4C77-A6CE-D1555B0EDB86@oracle.com> <3ADB9B41-32BA-4B37-A411-34CA58A9DBB7@googlemail.com> <542D828D.1080407@oracle.com> Message-ID: <173C9B07-4421-4EE0-9093-11CE0F3CAEA9@oracle.com> Thanks for the quick answer, folks. I have answered on nashorn-dev appropriately (suggested they try 8u40 EA as the fix should be in there too). On Oct 2, 2014, at 6:51 PM, Michel Trudeau wrote: > Right. It's fixed in many releases. > > https://bugs.openjdk.java.net/browse/JDK-8050485 > > 8u25, which ships on October 14, will have this fixed. > > Thanks, > Michel > > > > > Tomas Hurka wrote: >> >> Hi Attila, >> JVMTI error 62 is caused by bug in JDK. See >> >> -- >> Tomas Hurka >> NetBeans Profiler http://profiler.netbeans.org >> VisualVM http://visualvm.java.net >> Software Developer >> Oracle, Praha Czech Republic >> >> >> On 2 Oct 2014, at 16:47, Attila Szegedi wrote: >> >>> Folks, >>> >>> I'm forwarding a message from nashorn-dev (as well as my initial reply); there seems to be an issue with trying to profile Nashorn using JVMTI through NetBeans profiler. If anyone has any insight into this, it'd be appreciated. >>> >>> Thanks, >>> Attila. >>> >>> >>> Begin forwarded message: >>> >>>> From: Attila Szegedi >>>> Subject: Re: Nashorn and JVMTI >>>> Date: October 2, 2014 at 4:39:07 PM GMT+2 >>>> To: "David P. Caldwell" >>>> Cc: nashorn-dev >>>> >>>> No, we don't do anything to conceal Nashorn's internals. Granted, most of it lives in jdk.internal and jdk.nashorn.internal that are designated as restricted packages, but that shouldn't stop a debugger from looking into them. We often use jvisualvm ourselves to investigate Nashorn performance. 
>>>> >>>> I often tell people that one of the benefits of running anything (including JavaScript) on the JVM is monitoring and management, so I'd definitely be against obscured visibility into the runtime. >>>> >>>> I'll try to make NetBeans and JVMTI people aware of this. >>>> >>>> Attila. >>>> >>>> On Sep 25, 2014, at 11:13 PM, David P. Caldwell wrote: >>>> >>>>> Team, >>>>> >>>>> When I attempt to connect the NetBeans profiler (which I understand to >>>>> be essentially the same as jvisualvm) to a Nashorn embedding, I get an >>>>> error (JVMTI error 62) for essentially every class that relates to >>>>> scripting, including the dynalink stuff and Nashorn itself, as well as >>>>> generated classes named NashornJavaAdapter. >>>>> >>>>> If I persist through all of this (or filter them out of being >>>>> profiled, or turn on -Xverify:none), I end up with profiling data that >>>>> doesn't involve the JavaScript code at all; it basically treats the >>>>> call to eval() as atomic. >>>>> >>>>> Do you guys do this stuff? My customers are constantly objecting to >>>>> the "fact" that running Java on the JVM is going to be a terrible >>>>> idea, performance-wise -- especially compared to Node, which they >>>>> believe is lightning fast -- and I am having difficulty generating >>>>> data on this point. >>>>> >>>>> More generally, of course, profiling is a normal and necessary >>>>> development activity. I wrote a Java agent for Rhino that mangled the >>>>> classes using Javassist to wrap all JavaScript function invocations in >>>>> instrumented methods, but I'm not clear on (a) whether that's >>>>> necessary, or (b) how it would work given the Nashorn implementation >>>>> is probably using constructs I don't yet understand. But if that's the >>>>> route, let me know and give me a pointer or two and I'll be on my way. >>>>> >>>>> -- David P. Caldwell >>>>> http://www.davidpcaldwell.com/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
From jaroslav.bachorik at oracle.com Fri Oct 3 15:02:13 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 03 Oct 2014 17:02:13 +0200 Subject: RFR: 8002307 javax.management.modelmbean.ModelMBeanInfoSupport may expose internal representation by storing an externally mutable object Message-ID: <542EBA75.1000401@oracle.com> This is a second take on fixing JDK-8002307. The previous discussions are accessible as * http://mail.openjdk.java.net/pipermail/jmx-dev/2013-May/000225.html * http://mail.openjdk.java.net/pipermail/jmx-dev/2013-July/000280.html * http://mail.openjdk.java.net/pipermail/jmx-dev/2013-September/000346.html The fix ensures immutability by cloning the provided arrays in the constructor and then cloning them again in the getters. The constructors are fixed in javax/management/MBeanInfo.java, and the arrays used in the getters are cloned using existing functionality in the same class. javax/management/modelmbean/ModelMBeanInfoSupport.java is modified to take advantage of extending javax/management/MBeanInfo.java and no longer stores duplicate information (attributes/operations/constructors/notifications arrays). The deserialization routine is adjusted to reflect this and also to enforce data consistency and backward compatibility. Regtests and JCK tests were run successfully. http://cr.openjdk.java.net/~jbachorik/8002307/webrev.05 Thanks, -JB- From dmitry.samersoff at oracle.com Mon Oct 6 10:18:45 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 14:18:45 +0400 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM Message-ID: <54326C85.5040009@oracle.com> Hi Everybody, Please review the changes: http://cr.openjdk.java.net/~dsamersoff/JDK-8059037/webrev.01/ The obsolete shell-based test is removed, and the Java-based tests are fixed so that jtreg can run them. 
-Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From staffan.larsen at oracle.com Mon Oct 6 11:09:09 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 6 Oct 2014 13:09:09 +0200 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <54326C85.5040009@oracle.com> References: <54326C85.5040009@oracle.com> Message-ID: <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> Changes look good. How much testing have you done with the newly enabled tests? I'm worried that they could be unstable since they have never been run. Thanks, /Staffan On 6 okt 2014, at 12:18, Dmitry Samersoff wrote: > Hi Everybody, > > Please, review the changes: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8059037/webrev.01/ > > Obsoleted shell-based test is removed, Java based tests is fixed to be > run by jtreg. > > -Dmitry > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Mon Oct 6 11:51:14 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 15:51:14 +0400 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> Message-ID: <54328232.3070008@oracle.com> On 2014-10-06 15:09, Staffan Larsen wrote: > Changes looks good. How much testing have you done with the newly > enabled tests? I'm worried that they could be unstable since they > have never been run. Only a smoke test: Linux jtreg for the promoted jdk9 and for the current workspace. Yes, I also suspect that these tests aren't quite stable, but I don't see a better way than to enable them and address failures if they come. 
-Dmitry > > Thanks, /Staffan > > > On 6 okt 2014, at 12:18, Dmitry Samersoff > wrote: > >> Hi Everybody, >> >> Please, review the changes: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8059037/webrev.01/ >> >> Obsoleted shell-based test is removed, Java based tests is fixed to >> be run by jtreg. >> >> -Dmitry >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >> Russia * I would love to change the world, but they won't give me >> the sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From mikael.auno at oracle.com Mon Oct 6 12:00:36 2014 From: mikael.auno at oracle.com (Mikael Auno) Date: Mon, 06 Oct 2014 14:00:36 +0200 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <54328232.3070008@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> Message-ID: <54328464.8080805@oracle.com> On 2014-10-06 13:51, Dmitry Samersoff wrote: > On 2014-10-06 15:09, Staffan Larsen wrote: >> Changes looks good. How much testing have you done with the newly >> enabled tests? I'm worried that they could be unstable since they >> have never been run. > > Only smoke test: Linux jtreg for promoted jdk9 and for the current > workspace. > > > Yes, I also suspect that these tests aren't quite stable but I don't see > better way than enable it and address failures if they comes. I could help you start a distributed ad hoc run if you'd like. 
Mikael From staffan.larsen at oracle.com Mon Oct 6 12:07:10 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Mon, 6 Oct 2014 14:07:10 +0200 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <54328232.3070008@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> Message-ID: <80257EA3-234A-4695-8FEE-7E39927864E1@oracle.com> On 6 okt 2014, at 13:51, Dmitry Samersoff wrote: > On 2014-10-06 15:09, Staffan Larsen wrote: >> Changes looks good. How much testing have you done with the newly >> enabled tests? I'm worried that they could be unstable since they >> have never been run. > > Only smoke test: Linux jtreg for promoted jdk9 and for the current > workspace. > > > Yes, I also suspect that these tests aren't quite stable but I don't see > better way than enable it and address failures if they comes. You can at least do a JPRT run on all platforms. Perhaps multiple runs to stress them more? Thanks, /Staffan > > > -Dmitry > > > >> >> Thanks, /Staffan >> >> >> On 6 okt 2014, at 12:18, Dmitry Samersoff >> wrote: >> >>> Hi Everybody, >>> >>> Please, review the changes: >>> >>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059037/webrev.01/ >>> >>> Obsoleted shell-based test is removed, Java based tests is fixed to >>> be run by jtreg. >>> >>> -Dmitry >>> >>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >>> Russia * I would love to change the world, but they won't give me >>> the sources. >> > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. 
From dmitry.samersoff at oracle.com Mon Oct 6 12:26:04 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 16:26:04 +0400 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <54328464.8080805@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> <54328464.8080805@oracle.com> Message-ID: <54328A5C.5070303@oracle.com> Mikael, > I could help you start a distributed adhoc run if you'd like. Much appreciated! -Dmitry On 2014-10-06 16:00, Mikael Auno wrote: > On 2014-10-06 13:51, Dmitry Samersoff wrote: >> On 2014-10-06 15:09, Staffan Larsen wrote: >>> Changes looks good. How much testing have you done with the newly >>> enabled tests? I'm worried that they could be unstable since they >>> have never been run. >> >> Only smoke test: Linux jtreg for promoted jdk9 and for the current >> workspace. >> >> >> Yes, I also suspect that these tests aren't quite stable but I don't see >> better way than enable it and address failures if they comes. > > I could help you start a distributed adhoc run if you'd like. > > Mikael > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From dmitry.samersoff at oracle.com Mon Oct 6 12:26:36 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 16:26:36 +0400 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <80257EA3-234A-4695-8FEE-7E39927864E1@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> <80257EA3-234A-4695-8FEE-7E39927864E1@oracle.com> Message-ID: <54328A7C.6060109@oracle.com> Staffan, I'll submit a JPRT job. 
-Dmitry On 2014-10-06 16:07, Staffan Larsen wrote: > > On 6 okt 2014, at 13:51, Dmitry Samersoff > wrote: > >> On 2014-10-06 15:09, Staffan Larsen wrote: >>> Changes looks good. How much testing have you done with the newly >>> enabled tests? I'm worried that they could be unstable since they >>> have never been run. >> >> Only smoke test: Linux jtreg for promoted jdk9 and for the current >> workspace. >> >> >> Yes, I also suspect that these tests aren't quite stable but I don't see >> better way than enable it and address failures if they comes. > > You can at least do a JPRT run on all platforms. Perhaps multiple runs > to stress them more? > > Thanks, > /Staffan > >> >> >> -Dmitry >> >> >> >>> >>> Thanks, /Staffan >>> >>> >>> On 6 okt 2014, at 12:18, Dmitry Samersoff >>> > wrote: >>> >>>> Hi Everybody, >>>> >>>> Please, review the changes: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059037/webrev.01/ >>>> >>>> Obsoleted shell-based test is removed, Java based tests is fixed to >>>> be run by jtreg. >>>> >>>> -Dmitry >>>> >>>> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, >>>> Russia * I would love to change the world, but they won't give me >>>> the sources. >>> >> >> >> -- >> Dmitry Samersoff >> Oracle Java development team, Saint Petersburg, Russia >> * I would love to change the world, but they won't give me the sources. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From mikael.auno at oracle.com Mon Oct 6 14:42:24 2014 From: mikael.auno at oracle.com (Mikael Auno) Date: Mon, 06 Oct 2014 16:42:24 +0200 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <54328A5C.5070303@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> <54328464.8080805@oracle.com> <54328A5C.5070303@oracle.com> Message-ID: <5432AA50.5060401@oracle.com> Sorry for the delay, here are the results: http://vmsqe-app.russia.sun.com/surl/b0 All tests passed on all platforms. That run was not concurrent with any other tests, though, so there might still be some issues there, but it's better than nothing. Mikael On 2014-10-06 14:26, Dmitry Samersoff wrote: > Mikael, > >> I could help you start a distributed adhoc run if you'd like. > > Much appreciated! > > -Dmitry > > On 2014-10-06 16:00, Mikael Auno wrote: >> On 2014-10-06 13:51, Dmitry Samersoff wrote: >>> On 2014-10-06 15:09, Staffan Larsen wrote: >>>> Changes looks good. How much testing have you done with the newly >>>> enabled tests? I'm worried that they could be unstable since they >>>> have never been run. >>> >>> Only smoke test: Linux jtreg for promoted jdk9 and for the current >>> workspace. >>> >>> >>> Yes, I also suspect that these tests aren't quite stable but I don't see >>> better way than enable it and address failures if they comes. >> >> I could help you start a distributed adhoc run if you'd like. 
>> >> Mikael >> > > From dmitry.samersoff at oracle.com Mon Oct 6 15:12:52 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 19:12:52 +0400 Subject: RFR(S): JDK-8059037: JdpTest.sh hangs when trying to kill the test VM In-Reply-To: <5432AA50.5060401@oracle.com> References: <54326C85.5040009@oracle.com> <3494694A-8265-4C48-9AFB-FF0C8300EE5D@oracle.com> <54328232.3070008@oracle.com> <54328464.8080805@oracle.com> <54328A5C.5070303@oracle.com> <5432AA50.5060401@oracle.com> Message-ID: <5432B174.7050804@oracle.com> Mikael, Thank you! I'll proceed with integration. -Dmitry On 2014-10-06 18:42, Mikael Auno wrote: > Sorry for the delay, here are the results: > > http://vmsqe-app.russia.sun.com/surl/b0 > > All tests passed on all platforms. That run is not concurrent with any > other tests though, so there might still be some issues there, but it's > better than nothing. > > Mikael > > On 2014-10-06 14:26, Dmitry Samersoff wrote: >> Mikael, >> >>> I could help you start a distributed adhoc run if you'd like. >> >> Much appreciated! >> >> -Dmitry >> >> On 2014-10-06 16:00, Mikael Auno wrote: >>> On 2014-10-06 13:51, Dmitry Samersoff wrote: >>>> On 2014-10-06 15:09, Staffan Larsen wrote: >>>>> Changes looks good. How much testing have you done with the newly >>>>> enabled tests? I?m worried that they could be unstable since they >>>>> have never been run. >>>> >>>> Only smoke test: Linux jtreg for promoted jdk9 and for the current >>>> workspace. >>>> >>>> >>>> Yes, I also suspect that these tests aren't quite stable but I don't see >>>> better way than enable it and address failures if they comes. >>> >>> I could help you start a distributed adhoc run if you'd like. >>> >>> Mikael >>> >> >> > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From dmitry.samersoff at oracle.com Mon Oct 6 19:08:03 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 06 Oct 2014 23:08:03 +0400 Subject: RFR(M): JDK-8041639: Don't link the java_crw_demo shared library from product tools Message-ID: <5432E893.9080500@oracle.com> Hi Everybody, Please review: http://cr.openjdk.java.net/~dsamersoff/JDK-8041639/webrev.01/ The fix moves code from java_crw_demo to libhprof. I also remove the mtrace and minst demos, which use the java_crw_demo library. -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From kellyohair at gmail.com Mon Oct 6 19:36:27 2014 From: kellyohair at gmail.com (Kelly O'Hair) Date: Mon, 6 Oct 2014 12:36:27 -0700 Subject: RFR(M): JDK-8041639: Don't link the java_crw_demo shared library from product tools In-Reply-To: <5432E893.9080500@oracle.com> References: <5432E893.9080500@oracle.com> Message-ID: <9B37FD5E-4A28-4E5F-9669-C20A1AB06B38@gmail.com> Keep in mind, I'm not on this team anymore, and you guys can do whatever you want, so just consider this background data and comments from the peanut gallery. ;) I created this library early in JVM TI development, primarily for hprof, but also to let old users of JVMPI replace some of the JVMPI functionality that JVM TI did not directly address. It is a simple and very limited native-code bytecode instrumentation library, but with an API independent of hprof so the transition from JVMPI to JVM TI was easier. I honestly do not know what third-party tools out there rely on the actual library, but I suspect quite a few have either copied this source code or depend on it in some way. The Java agent tools probably use ASM or BCEL and will not be impacted, and I'm not sure what native JVM TI agent tools might still be out there using this. 
But the bigger issue is how many products out there rely on these third-party JVM TI agents that, under the covers, rely on java_crw_demo. The old transitive dependency issue. :( So how many Java developers will be impacted is very hard to tell. So do what you need to, but don't be surprised if you create a few ripples in the Java pool by removing this library. -kto On Oct 6, 2014, at 12:08 PM, Dmitry Samersoff wrote: > Hi Everybody, > > Please review: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8041639/webrev.01/ > > The fix moves code from java_crw_demo to libhprof. Also I remove mtrace > and minst demos that uses java_crw_demo library. > > -Dmitry > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources. From volker.simonis at gmail.com Tue Oct 7 08:58:54 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 7 Oct 2014 10:58:54 +0200 Subject: PowerPC: core file option not available with serviceability tools In-Reply-To: <5429E046.8030003@us.ibm.com> References: <53B5A92A.401@us.ibm.com> <53BD477D.2090107@us.ibm.com> <5429E046.8030003@us.ibm.com> Message-ID: Hi Maynard, I'm now back from JavaOne and can look at this issue. Could you please share your current implementation so I can reproduce your problem more easily? By the way, you can find the ppc frame layout description of all the different frame types (native, interpreted, compiled) in /hotspot/src/cpu/ppc/vm/frame_ppc.hpp. The different frame::sender() implementations (in frame_x86.cpp, frame_ppc.cpp, ...) contain all the gory details about how to walk a frame. That's what you have to implement in Java in order to get a full stack trace from the serviceability tools. On x86, the frame pointer (i.e. the ebp register) points to the last frame pointer, while (frame pointer - 1) points to the return pc. 
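As a rough illustration of what "walking a frame" means, here is a minimal Java sketch over simulated memory. It is not HotSpot or SA code: the class name, the method, the 8-byte word size and the slot offsets are all assumptions made for this example. The real per-platform offsets live in the frame_*.hpp files mentioned above and are not reproduced here; the demo values (link at fp, return pc one word below it) only mirror the wording of the paragraph above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names exist in HotSpot or the SA agent.
class FrameWalkSketch {
    static final long WORD = 8; // assumed 64-bit word size

    /**
     * Follows saved frame-pointer links through simulated memory and collects
     * the return pc found at each frame. Offsets are in words relative to fp.
     */
    static List<Long> walk(Map<Long, Long> memory, long fp,
                           int linkOffset, int returnPcOffset) {
        List<Long> pcs = new ArrayList<>();
        while (fp != 0) {
            Long pc = memory.get(fp + returnPcOffset * WORD);
            Long link = memory.get(fp + linkOffset * WORD);
            if (pc == null || link == null) {
                break; // unreadable slot: stop rather than guess
            }
            pcs.add(pc);
            if (link != 0 && link <= fp) {
                break; // the stack grows down, so an older frame must sit at a higher address
            }
            fp = link;
        }
        return pcs;
    }

    public static void main(String[] args) {
        Map<Long, Long> memory = new HashMap<>();
        // Placeholder layout: link at fp, return pc at fp - 1 word.
        memory.put(0x1000L, 0x2000L);        // frame 1 link -> frame 2
        memory.put(0x1000L - WORD, 0xAAAL);  // frame 1 return pc
        memory.put(0x2000L, 0L);             // frame 2 link -> end of chain
        memory.put(0x2000L - WORD, 0xBBBL);  // frame 2 return pc
        System.out.println(walk(memory, 0x1000L, 0, -1));
    }
}
```

The progress check on the link is one way to keep such a walker from looping forever when the offsets are wrong for the platform; it is a design choice of this sketch, not a diagnosis of any particular hang.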
Regards, Volker On Tue, Sep 30, 2014 at 12:42 AM, Maynard Johnson wrote: > On 07/09/2014 12:38 PM, Volker Simonis wrote: >> On Wed, Jul 9, 2014 at 3:45 PM, Maynard Johnson wrote: >>> On 07/04/2014 10:59 AM, Volker Simonis wrote: >>>> Hi Maynard, >>>> >>>> we (i.e. SAP) do not currently support the SA agent on Linux/PPC64 and >>>> AIX (we have other proprietary serviceability tools). Because of that >>>> (and because SA isn't specified by the SE specification) porting the >>>> SA agent was no top priority until now. But there are no technical >>>> reasons why it should not work (it's just a lack of resources). Of >>>> course contributions are always highly welcome:) >>>> >>>> That said, the SA agent library and jar file actually get built. If >>>> you do a complete build you'll find them under: >>>> >>>> hotspot/linux_ppc64_compiler2/generated/sa-jdi.jar >>>> hotspot/linux_ppc64_compiler2/{product,fastdebug,debug}/libsaproc.so >>>> >>>> in the build directory. They are just not copied into the jdk >>>> workspace and the created images because they don't work out of the >>>> box. >>>> >>>> The following two patches for the jdk9 top-level and hotspot >>>> repositories respectively should fix the build such that the agent >>>> files will be correctly copied into the images. >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/sa_toplevel >>>> http://cr.openjdk.java.net/~simonis/webrevs/sa_hotspot/ >>>> >>>> They will get you to the point where for example 'jstack' will run up >>>> to the following point: >>> Ok, great. This should be enough to get me started. I should have time to begin on this later this week or early next week. >> >> Hi Maynard, >> >> great to welcome you in the ppc64 porting team:) >> >>> I may come knocking at your "door" for some occasional help, but I'll try to keep that to a minimum. > Hi, Volker. Knock, knock. 
:-) > I was preoccupied for a while this summer rolling out the latest release of oprofile (for which I'm the maintainer), but am now coming back to this task. I've implemented what I believe are all of the necessary ppc64-specific Java files to enable the jstack and jmap tools to work on core files. I've also updated hotspot/agent/src/os/linux/LinuxDebuggerLocal.c to implement the accumulation of register data on ppc64 via ptrace. But now I've run into a problem I need help with. > > When I run jstack on my POWER7 system, it gets stuck in a loop in sun.jvm.hotspot.tools.StackTrace::run. There's an inner for-loop there where cur.getLastJavaVFrameDbg() is called ('cur' is a JavaThread). For the first JavaThread, we do return from getLastJavaVFrameDbg(), just as we do when running jstack on my Intel laptop. But for the second JavaThread, we never return from getLastJavaVFrameDbg() on ppc64. I believe the root of the problem is in my new sun.jvm.hotspot.runtime.ppc64.PPC64Frame class. The getLastJavaVFrameDbg method calls getCurrentFrameGuess, which is implemented in the new sun.jvm.hotspot.runtime.linux_ppc64.LinuxPPC64JavaThreadPDAccess class. In both ppc64 and x86, this first level xxxCurrentFrameGuess object is instantiated with a 'pc' value of null, so getCurrentFrameGuess then new's up a xxxFrame object, passing in the SP and FP, but no PC. The implementation of the PPC64Frame(Address,Address) constructor is currently identical to the X86Frame constructor, but is almost certainly incorrect. In this constructor, the 'pc' is set as follows: > this.pc = raw_sp.getAddressAt(-1 * VM.getVM().getAddressSize()); > > This works fine on X86, but not on ppc64. But I'm not understanding how this even works on X86. From what I understand, the data below the stack pointer on X86 is the "red zone". How is that being used as a pc? But more importantly, do you know how I can ascertain what the 'pc' value should be for ppc64? 
> > Thanks in advance for any assistance you can give. > > -Maynard > >> >> Please feel free to ask any questions. The OpenJDK project and >> especially the HotSpot part are known to take some getting used to. >> >>> I was wondering if a bug report should be opened in JBS, just to record that the issue is being worked. Thoughts? >> >> I have opened "8049715: PPC64: First steps to enable SA on >> Linux/PPC64" (https://bugs.openjdk.java.net/browse/JDK-8049715) for >> the patch which I sent you with the last mail. I've already sent out >> webrevs for that change and hopefully it will be fixed within the next >> few days. >> >> For the actual port of the ppc64-specific stuff I opened bug "8049716 >> PPC64: Implement SA on Linux/PPC64" >> (https://bugs.openjdk.java.net/browse/JDK-8049716). I can also help >> with hosting the webrevs, once you have a running version. >> >> Regards, >> Volker >> >>> >>> -Maynard >>>> >>>>> images/j2sdk-image/bin/jstack ./jdk/bin/java core.13547 >>>> Attaching to core core.13547 from executable ./jdk/bin/java, please wait... >>>> WARNING: Hotspot VM version >>>> 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00 does not match with >>>> SA version 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00. You may >>>> see unexpected results. >>>> Debugger attached successfully. >>>> Server compiler detected. 
>>>> JVM version is 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00 >>>> Deadlock Detection: >>>> >>>> Exception in thread "main" java.lang.reflect.InvocationTargetException >>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>> at java.lang.reflect.Method.invoke(Method.java:484) >>>> at sun.tools.jstack.JStack.runJStackTool(JStack.java:140) >>>> at sun.tools.jstack.JStack.main(JStack.java:106) >>>> Caused by: java.lang.ExceptionInInitializerError >>>> at sun.jvm.hotspot.runtime.VM.getThreads(VM.java:610) >>>> at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:54) >>>> at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39) >>>> at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:62) >>>> at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45) >>>> at sun.jvm.hotspot.tools.JStack.run(JStack.java:66) >>>> at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260) >>>> at sun.jvm.hotspot.tools.Tool.start(Tool.java:223) >>>> at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118) >>>> at sun.jvm.hotspot.tools.JStack.main(JStack.java:92) >>>> ... 6 more >>>> Caused by: java.lang.RuntimeException: OS/CPU combination linux/ppc64 >>>> not yet supported >>>> at sun.jvm.hotspot.runtime.Threads.initialize(Threads.java:97) >>>> at sun.jvm.hotspot.runtime.Threads.access$000(Threads.java:42) >>>> at sun.jvm.hotspot.runtime.Threads$1.update(Threads.java:52) >>>> at sun.jvm.hotspot.runtime.VM.registerVMInitializedObserver(VM.java:394) >>>> at sun.jvm.hotspot.runtime.Threads.(Threads.java:50) >>>> ... 16 more >>>> >>>> And that's the point where I was saying that "contributions are always >>>> highly welcome:)" >>>> >>>> Now all the Linux/PPC64 specific class under >>>> hotspot/agent/src/share/classes/ would have to be implemented (e.g. 
>>>> sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess). Are you >>>> interested in contributing to this project? >>>> >>>> Regards, >>>> Volker >>>> >>>> PS: I cc-ed serviceability-dev because I remember that they started a >>>> poll a while ago about who is using the SA tools. I'm therefore not >>>> quite sure what's the current status and what are the future plan for >>>> these libraries. >>>> >>>> >>>> On Thu, Jul 3, 2014 at 9:04 PM, Maynard Johnson wrote: >>>>> Hi, all, >>>>> On my Intel laptop, I note that certain jdk9 serviceability tools -- jstack, jmap, jsadebugd -- have an option to pass a core file instead of a process ID; for example: >>>>> >>>>> $ jstack -h >>>>> Usage: >>>>> jstack [-l] >>>>> (to connect to running process) >>>>> jstack -F [-m] [-l] >>>>> (to connect to a hung process) >>>>> jstack [-m] [-l] >>>>> (to connect to a core file) >>>>> jstack [-m] [-l] [server_id@] >>>>> (to connect to a remote debug server) >>>>> >>>>> But on my PowerLinux box, the core file option is missing from the usage output. I see that jdk9-dev/jdk/src/share/classes/sun/tools/jstack/JStack.java requires the existence of sun.jvm.hotspot.tools.JStack for the core file option to be made available. On my Intel system, the sun.jvm.hotspot.tools.JStack class is packaged in sa-jdi.jar in /jvm/openjdk-1.9.0-internal/lib/. But the sa-jdi.jar is missing on PowerPC. Is there a technical reason for this or is it an oversight? >>>>> >>>>> The jsadebugd tool does not run at all on PowerLinux; it gets the following error: >>>>> >>>>> Error: Could not find or load main class sun.jvm.hotspot.jdi.SADebugServer >>>>> >>>>> On my Intel system, the SADebugServer class is packaged in the sa-jdi.jar mentioned above. >>>>> >>>>> I've spent the past day or so looking at makefiles until I'm cross-eyed, but haven't yet found where the issue might be. Any tips would be appreciated. >>>>> >>>>> Thanks. 
>>>>> -Maynard >>>>> >>>> >>> >> > From maynardj at us.ibm.com Tue Oct 7 13:35:58 2014 From: maynardj at us.ibm.com (Maynard Johnson) Date: Tue, 07 Oct 2014 08:35:58 -0500 Subject: PowerPC: core file option not available with serviceability tools In-Reply-To: References: <53B5A92A.401@us.ibm.com> <53BD477D.2090107@us.ibm.com> <5429E046.8030003@us.ibm.com> Message-ID: <5433EC3E.1010800@us.ibm.com> On 10/07/2014 03:58 AM, Volker Simonis wrote: > Hi Maynard, > > I'm now back from JavaOne and can look at this issue. Could you please > share your current implementation so I can reproduce your problem more > easily. See attachment. The two patches in the attached tar file apply to a jdk9-dev snapshot from July. I haven't even tried forward-porting to current upstream code, so I don't know how well they would apply. > > By the way, you can find the ppc frame layout description of all the > different frame types (native, interpreted, compiled) in > /hotspot/src/cpu/ppc/vm/frame_ppc.hpp. The different frame::sender() > implementations (in frame_x86.cpp, frame_ppc.cpp, ..) contain all the Yes, I've been studying those files, but I freely admit I don't have a good grasp of the code yet. > gory details about how to walk a frame. That's what you have to > implement in Java in order to get a full stack trace from the > serviceability tools. On x86 the frame pointer (i.e. the ebp register > points to the last frame pointer while (frame pointer - 1) points to > the return pc. Ummm . . . Stack addresses grow downward, and I was under the impression 'return pc' was the word on the stack directly before the ebp, which would mean return pc is at 'frame pointer + 1'. Or am I off base here? Nevertheless, my question below concerns the "pc", not the "return pc". 
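To make the two address expressions in this exchange concrete, here is a toy model of a downward-growing x86-style stack. It is only a sketch under stated assumptions: sim_stack, sim_call and the two helper functions are hypothetical names, none of this is HotSpot or SA code, and it assumes the conventional "push ebp; mov ebp, esp" prologue. Under that assumption the return pc sits one word above the frame pointer, and the very same word is one word below the SP the caller had just before the call, which would be one way for a read at "sp - 1 word" to yield a pc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Toy stack: words[i] lives at a higher address than words[i - 1],
 * so pushing decrements the index, as on x86. */
typedef struct {
    uintptr_t words[STACK_WORDS];
    size_t sp; /* index of the current top of stack */
    size_t fp; /* index the frame pointer refers to */
} sim_stack;

static void sim_init(sim_stack *s) {
    for (size_t i = 0; i < STACK_WORDS; ++i) {
        s->words[i] = 0;
    }
    s->sp = STACK_WORDS; /* empty stack: SP just past the highest word */
    s->fp = STACK_WORDS;
}

static void sim_push(sim_stack *s, uintptr_t value) {
    assert(s->sp > 0);
    s->words[--s->sp] = value;
}

/* Models a call instruction followed by the assumed prologue.
 * Returns the caller's SP as it was immediately before the call. */
static size_t sim_call(sim_stack *s, uintptr_t return_pc) {
    size_t caller_sp = s->sp;
    sim_push(s, return_pc);         /* call: push the return address */
    sim_push(s, (uintptr_t)s->fp);  /* push ebp: save caller's frame pointer */
    s->fp = s->sp;                  /* mov ebp, esp */
    return caller_sp;
}

/* Return pc read relative to the callee's frame pointer: fp + 1 word. */
static uintptr_t return_pc_via_fp(const sim_stack *s) {
    return s->words[s->fp + 1];
}

/* Return pc read relative to the caller's pre-call SP: sp - 1 word. */
static uintptr_t return_pc_via_sp(const sim_stack *s, size_t caller_sp) {
    return s->words[caller_sp - 1];
}
```

Both helpers read the same slot in this model. Whether the SP handed to the SA frame constructor really is the pre-call SP is not something this sketch can settle; that has to be checked against frame_x86.cpp and frame_ppc.hpp.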
Perhaps I'm misunderstanding "pc" in this context; but even so, the x86 code still seems wrong: this.pc = raw_sp.getAddressAt(-1 * VM.getVM().getAddressSize()); If, in fact, 'this.pc' is supposed to represent 'return pc' in this context, then I would think the code should be: this.pc = raw_fp.getAddressAt(VM.getVM().getAddressSize()); I hope you can help set me on the right track. As you can see, I'm lost in the weeds right now. :-) -Maynard > > Regards, > Volker > > > On Tue, Sep 30, 2014 at 12:42 AM, Maynard Johnson wrote: >> On 07/09/2014 12:38 PM, Volker Simonis wrote: >>> On Wed, Jul 9, 2014 at 3:45 PM, Maynard Johnson wrote: >>>> On 07/04/2014 10:59 AM, Volker Simonis wrote: >>>>> Hi Maynard, >>>>> >>>>> we (i.e. SAP) do not currently support the SA agent on Linux/PPC64 and >>>>> AIX (we have other proprietary servicibility tools). Because of that >>>>> (and because SA isn't specified by the SE specification) porting the >>>>> SA agent was no top priority until now. But there are no technical >>>>> reasons why it should not work (it's just a lack of resources). Of >>>>> course contributions are always highly welcome:) >>>>> >>>>> That said, the SA agent library and jar file actually gets build. If >>>>> you do a complete build you'll find them under: >>>>> >>>>> hotspot/linux_ppc64_compiler2/generated/sa-jdi.jar >>>>> hotspot/linux_ppc64_compiler2/{product,fastdebug,debug}/libsaproc.so >>>>> >>>>> in the build directory. They are just not copied into the jdk >>>>> workspace and the created images because they don't work out of the >>>>> box. >>>>> >>>>> The following two patches for the jdk9 top-level and hotspot >>>>> repositories respectively should fix the build such that the agent >>>>> files will be correctly copied into the images. 
>>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/sa_toplevel >>>>> http://cr.openjdk.java.net/~simonis/webrevs/sa_hotspot/ >>>>> >>>>> They will get you to the point where for example 'jstack' will run up >>>>> to the following point: >>>> Ok, great. This should be enough to get me started. I should have time to begin on this later this week or early next week. >>> >>> Hi Maynard, >>> >>> great to welcome you in the ppc64 porting team:) >>> >>>> I may come knocking at your "door" for some occasional help, but I'll try to keep that to a minimum. >> Hi, Volker. Knock, knock. :-) >> I was preoccupied for a while this summer rolling out the latest release of oprofile (for which I'm the maintainer), but am now coming back to this task. I've implemented what I believe are all of the necessary ppc64-specific Java files to enable the jstack and jmap tools to work on core files. I've also updated hotspot/agent/src/os/linux/LinuxDebuggerLocal.c to implement the accumulation of register data on ppc64 vi ptrace. But now I've run into a problem I need help with. >> >> When I run jstack on my POWER7 system, it gets stuck in a loop in sun.jvm.hotspot.tools.StackTrace::run. There's an inner for-loop there where cur.getLastJavaVFrameDbg() is called ('cur' is a JavaThread). For the first JavaThread, we do return from getLastJavaVFrameDbg(), just as we do when running jstack on my Intel laptop. But for the second JavaThread, we never return from getLastJavaVFrameDbg() on ppc64. I believe the root of the problem is in my new sun.jvm.hotspot.runtime.ppc64.PPC64Frame class. The getLastJavaVFrameDbg method calls getCurrentFrameGuess, which is implemented in the new sun.jvm.hotspot.runtime.linux_ppc64.LinuxPPC64JavaThreadPDAccess class. In both ppc64 and x86, this first level xxxCurrentFrameGuess object is instantiated with a 'pc' value of null, so getCurrentFrameGuess then new's up a xxxFrame object, passing in the SP and FP, but no PC. 
The implementation of the PPC64Frame(Address,Address) constructor is currently identical to the X86Frame constructor, but is almost certainly incorrect. In this constructor, the 'pc' is set as follows: >> this.pc = raw_sp.getAddressAt(-1 * VM.getVM().getAddressSize()); >> >> This works fine on X86, but not on ppc64. But I'm not understanding how this even works on X86. From what I understand, the data below the stack pointer on X86 is the "red zone". How is that being used as a pc? But more importantly, do you know how I can ascertain what the 'pc' value should be for ppc64? >> >> Thanks in advance for any assistance you can give. >> >> -Maynard >> >>> >>> Please feel free to ask any questions. The OpenJDK project and >>> especially the HotSpot part are known to take some getting used to. >>> >>>> I was wondering if a bug report should be opened in JBS, just to record that the issue is being worked. Thoughts? >>> >>> I have opened "8049715: PPC64: First steps to enable SA on >>> Linux/PPC64" (https://bugs.openjdk.java.net/browse/JDK-8049715) for >>> the patch which I sent you with the last mail. I've already sent out >>> webrevs for that change and hopefully it will be fixed within the next >>> few days. >>> >>> For the actual port of the ppc64-specific stuff I opened bug "8049716 >>> PPC64: Implement SA on Linux/PPC64" >>> (https://bugs.openjdk.java.net/browse/JDK-8049716). I can also help >>> with hosting the webrevs, once you have a running version. >>> >>> Regards, >>> Volker >>> >>>> >>>> -Maynard >>>>> >>>>>> images/j2sdk-image/bin/jstack ./jdk/bin/java core.13547 >>>>> Attaching to core core.13547 from executable ./jdk/bin/java, please wait... >>>>> WARNING: Hotspot VM version >>>>> 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00 does not match with >>>>> SA version 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00. You may >>>>> see unexpected results. >>>>> Debugger attached successfully. >>>>> Server compiler detected.
>>>>> JVM version is 1.9.0-internal-debug-d046063_2014_07_04_11_46-b00 >>>>> Deadlock Detection: >>>>> >>>>> Exception in thread "main" java.lang.reflect.InvocationTargetException >>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>> at java.lang.reflect.Method.invoke(Method.java:484) >>>>> at sun.tools.jstack.JStack.runJStackTool(JStack.java:140) >>>>> at sun.tools.jstack.JStack.main(JStack.java:106) >>>>> Caused by: java.lang.ExceptionInInitializerError >>>>> at sun.jvm.hotspot.runtime.VM.getThreads(VM.java:610) >>>>> at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:54) >>>>> at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39) >>>>> at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:62) >>>>> at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45) >>>>> at sun.jvm.hotspot.tools.JStack.run(JStack.java:66) >>>>> at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260) >>>>> at sun.jvm.hotspot.tools.Tool.start(Tool.java:223) >>>>> at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118) >>>>> at sun.jvm.hotspot.tools.JStack.main(JStack.java:92) >>>>> ... 6 more >>>>> Caused by: java.lang.RuntimeException: OS/CPU combination linux/ppc64 >>>>> not yet supported >>>>> at sun.jvm.hotspot.runtime.Threads.initialize(Threads.java:97) >>>>> at sun.jvm.hotspot.runtime.Threads.access$000(Threads.java:42) >>>>> at sun.jvm.hotspot.runtime.Threads$1.update(Threads.java:52) >>>>> at sun.jvm.hotspot.runtime.VM.registerVMInitializedObserver(VM.java:394) >>>>> at sun.jvm.hotspot.runtime.Threads.(Threads.java:50) >>>>> ... 
16 more >>>>> >>>>> And that's the point where I was saying that "contributions are always >>>>> highly welcome:)" >>>>> >>>>> Now all the Linux/PPC64 specific class under >>>>> hotspot/agent/src/share/classes/ would have to be implemented (e.g. >>>>> sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess). Are you >>>>> interested in contributing to this project? >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> PS: I cc-ed serviceability-dev because I remember that they started a >>>>> poll a while ago about who is using the SA tools. I'm therefore not >>>>> quite sure what's the current status and what are the future plan for >>>>> these libraries. >>>>> >>>>> >>>>> On Thu, Jul 3, 2014 at 9:04 PM, Maynard Johnson wrote: >>>>>> Hi, all, >>>>>> On my Intel laptop, I note that certain jdk9 serviceability tools -- jstack, jmap, jsadebugd -- have an option to pass a core file instead of a process ID; for example: >>>>>> >>>>>> $ jstack -h >>>>>> Usage: >>>>>> jstack [-l] >>>>>> (to connect to running process) >>>>>> jstack -F [-m] [-l] >>>>>> (to connect to a hung process) >>>>>> jstack [-m] [-l] >>>>>> (to connect to a core file) >>>>>> jstack [-m] [-l] [server_id@] >>>>>> (to connect to a remote debug server) >>>>>> >>>>>> But on my PowerLinux box, the core file option is missing from the usage output. I see that jdk9-dev/jdk/src/share/classes/sun/tools/jstack/JStack.java requires the existence of sun.jvm.hotspot.tools.JStack for the core file option to be made available. On my Intel system, the sun.jvm.hotspot.tools.JStack class is packaged in sa-jdi.jar in /jvm/openjdk-1.9.0-internal/lib/. But the sa-jdi.jar is missing on PowerPC. Is there a technical reason for this or is it an oversight? 
>>>>>> >>>>>> The jsadebugd tool does not run at all on PowerLinux; it gets the following error: >>>>>> >>>>>> Error: Could not find or load main class sun.jvm.hotspot.jdi.SADebugServer >>>>>> >>>>>> On my Intel system, the SADebugServer class is packaged in the sa-jdi.jar mentioned above. >>>>>> >>>>>> I've spent the past day or so looking at makefiles until I'm cross-eyed, but haven't yet found where the issue might be. Any tips would be appreciated. >>>>>> >>>>>> Thanks. >>>>>> -Maynard >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: jdk9-ppc64-serviceability-patches.tar.gz Type: application/x-gzip Size: 14389 bytes Desc: not available URL: From thomas.stuefe at gmail.com Tue Oct 7 14:26:29 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 7 Oct 2014 16:26:29 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 Message-ID: Hi all, We saw crashes when connecting to a target VM using com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled with /RTC1. The error turned out to be in VirtualMachineImpl.c: the function "jvm_attach_thread_func" - the one which is injected into the target VM and used as thread entry point for CreateRemoteThread() - must be compiled with runtime checks disabled in order to keep the code-to-inject position independent. Using /rtc1 will cause the Microsoft compiler to generate relative calls to a check function ("_RTC_CheckEsp") which will not work if code is planted in target process at a different address. This change adds a pragma to locally disable the runtime checks and re-enable them below the function. http://cr.openjdk.java.net/~simonis/webrevs/8059868/ Kind regards, Thomas Stuefe -------------- next part -------------- An HTML attachment was scrubbed... 
From thomas.stuefe at gmail.com Tue Oct 7 14:27:44 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 7 Oct 2014 16:27:44 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: References: Message-ID: (the corresponding bug report is https://bugs.openjdk.java.net/browse/JDK-8059868 ) On Tue, Oct 7, 2014 at 4:26 PM, Thomas Stüfe wrote: > Hi all, > > We saw crashes when connecting to a target VM using > com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled > with /RTC1. > > The error turned out to be in VirtualMachineImpl.c: the function > "jvm_attach_thread_func" - the one which is injected into the target VM and > used as thread entry point for CreateRemoteThread() - must be compiled with > runtime checks disabled in order to keep the code-to-inject position > independent. > > Using /rtc1 will cause the Microsoft compiler to generate relative calls > to a check function ("_RTC_CheckEsp") which will not work if code is > planted in target process at a different address. > > This change adds a pragma to locally disable the runtime checks and > re-enable them below the function. > > http://cr.openjdk.java.net/~simonis/webrevs/8059868/ > > Kind regards, > > Thomas Stuefe > > From volker.simonis at gmail.com Tue Oct 7 16:35:53 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 7 Oct 2014 18:35:53 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: References: Message-ID: Hi, the change looks good to me. I can also sponsor this change. Nevertheless I'd like to get one more opinion from the serviceability group.
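For readers who have not opened the webrev, the general shape of such a change might look like the sketch below. It is an illustration only, not the actual patch: stub_args and injected_stub are hypothetical stand-ins for the real argument block and for jvm_attach_thread_func, and the _MSC_VER guard is an addition so that the fragment also builds with other compilers. The runtime_checks pragma with "off" and "restore" is the documented Microsoft mechanism for switching the /RTC checks off around a region of code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument block; the real layout in VirtualMachineImpl.c
 * is not shown in this thread. */
typedef struct {
    int a;
    int b;
} stub_args;

/* Switch the /RTC runtime checks off for the function that would be
 * copied into another process, so that the compiler emits no calls to
 * helper routines such as _RTC_CheckEsp inside it. */
#ifdef _MSC_VER
#pragma runtime_checks("", off)
#endif

static int injected_stub(const stub_args *args) {
    /* Only touches its argument: no globals, no calls into the CRT. */
    return args->a + args->b;
}

/* Put the checks back to whatever the command line asked for. */
#ifdef _MSC_VER
#pragma runtime_checks("", restore)
#endif
```

The empty string selects all runtime checks, and "restore" returns to the setting given on the command line rather than forcing the checks on, so builds without /RTC1 are unaffected.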
Thanks, Volker On Tue, Oct 7, 2014 at 4:27 PM, Thomas St?fe wrote: > (the corresponding bug report is > https://bugs.openjdk.java.net/browse/JDK-8059868 ) > > > On Tue, Oct 7, 2014 at 4:26 PM, Thomas St?fe > wrote: >> >> Hi all, >> >> We saw crashes when connecting to a target VM using >> com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled >> with /RTC1. >> >> The error turned out to be in VirtualMachineImpl.c: the function >> "jvm_attach_thread_func" - the one which is injected into the target VM and >> used as thread entry point for CreateRemoteThread() - must be compiled with >> runtime checks disabled in order to keep the code-to-inject position >> independent. >> >> Using /rtc1 will cause the Microsoft compiler to generate relative calls >> to a check function ("_RTC_CheckEsp") which will not work if code is planted >> in target process at a different address. >> >> This change adds a pragma to locally disable the runtime checks and >> re-enable them below the function. >> >> http://cr.openjdk.java.net/~simonis/webrevs/8059868/ >> >> Kind regards, >> >> Thomas Stuefe >> > From Alan.Bateman at oracle.com Wed Oct 8 04:17:56 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 07 Oct 2014 21:17:56 -0700 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: References: Message-ID: <5434BAF4.2080108@oracle.com> On 07/10/2014 07:26, Thomas St?fe wrote: > Hi all, > > We saw crashes when connecting to a target VM using > com.sun.tools.attach.WindowsVirtualMachine when injecting VM was > compiled with /RTC1. > > The error turned out to be in VirtualMachineImpl.c: the function > "jvm_attach_thread_func" - the one which is injected into the target > VM and used as thread entry point for CreateRemoteThread() - must be > compiled with runtime checks disabled in order to keep the > code-to-inject position independent. 
> > Using /rtc1 will cause the Microsoft compiler to generate relative > calls to a check function ("_RTC_CheckEsp") which will not work if > code is planted in target process at a different address. > > This change adds a pragma to locally disable the runtime checks and > re-enable them below the function. > > http://cr.openjdk.java.net/~simonis/webrevs/8059868/ > > > This make sense to me. A very tiny comment is that we should have use consistent spacing in the #pragma values (check_stack and runtime_checks should be consistent, I don't think it matters which way). -Alan From staffan.larsen at oracle.com Wed Oct 8 07:33:25 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 8 Oct 2014 09:33:25 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: References: Message-ID: <45214B2F-2518-4CFC-B202-DE156751C1D6@oracle.com> Looks good! Thanks, /Staffan On 7 okt 2014, at 16:26, Thomas St?fe wrote: > Hi all, > > We saw crashes when connecting to a target VM using com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled with /RTC1. > > The error turned out to be in VirtualMachineImpl.c: the function "jvm_attach_thread_func" - the one which is injected into the target VM and used as thread entry point for CreateRemoteThread() - must be compiled with runtime checks disabled in order to keep the code-to-inject position independent. > > Using /rtc1 will cause the Microsoft compiler to generate relative calls to a check function ("_RTC_CheckEsp") which will not work if code is planted in target process at a different address. > > This change adds a pragma to locally disable the runtime checks and re-enable them below the function. > > http://cr.openjdk.java.net/~simonis/webrevs/8059868/ > > Kind regards, > > Thomas Stuefe > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From volker.simonis at gmail.com Wed Oct 8 07:51:41 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 8 Oct 2014 09:51:41 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: <45214B2F-2518-4CFC-B202-DE156751C1D6@oracle.com> References: <45214B2F-2518-4CFC-B202-DE156751C1D6@oracle.com> Message-ID: Thanks Alan, Staffan, I'll push the change with the spacing adjusted as suggested by Alan. Regards, Volker On Wed, Oct 8, 2014 at 9:33 AM, Staffan Larsen wrote: > Looks good! > > Thanks, > /Staffan > > On 7 okt 2014, at 16:26, Thomas St?fe wrote: > > Hi all, > > We saw crashes when connecting to a target VM using > com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled > with /RTC1. > > The error turned out to be in VirtualMachineImpl.c: the function > "jvm_attach_thread_func" - the one which is injected into the target VM and > used as thread entry point for CreateRemoteThread() - must be compiled with > runtime checks disabled in order to keep the code-to-inject position > independent. > > Using /rtc1 will cause the Microsoft compiler to generate relative calls to > a check function ("_RTC_CheckEsp") which will not work if code is planted in > target process at a different address. > > This change adds a pragma to locally disable the runtime checks and > re-enable them below the function. > > http://cr.openjdk.java.net/~simonis/webrevs/8059868/ > > Kind regards, > > Thomas Stuefe > > From thomas.stuefe at gmail.com Wed Oct 8 09:00:37 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 8 Oct 2014 11:00:37 +0200 Subject: RFR(XS): 8059868: JVM crashes on attach on Windows when compiled with /RTC1 In-Reply-To: References: <45214B2F-2518-4CFC-B202-DE156751C1D6@oracle.com> Message-ID: Thank you all! On Wed, Oct 8, 2014 at 9:51 AM, Volker Simonis wrote: > Thanks Alan, Staffan, > > I'll push the change with the spacing adjusted as suggested by Alan. 
> > Regards, > Volker > > > On Wed, Oct 8, 2014 at 9:33 AM, Staffan Larsen > wrote: > > Looks good! > > > > Thanks, > > /Staffan > > > > On 7 okt 2014, at 16:26, Thomas St?fe wrote: > > > > Hi all, > > > > We saw crashes when connecting to a target VM using > > com.sun.tools.attach.WindowsVirtualMachine when injecting VM was compiled > > with /RTC1. > > > > The error turned out to be in VirtualMachineImpl.c: the function > > "jvm_attach_thread_func" - the one which is injected into the target VM > and > > used as thread entry point for CreateRemoteThread() - must be compiled > with > > runtime checks disabled in order to keep the code-to-inject position > > independent. > > > > Using /rtc1 will cause the Microsoft compiler to generate relative calls > to > > a check function ("_RTC_CheckEsp") which will not work if code is > planted in > > target process at a different address. > > > > This change adds a pragma to locally disable the runtime checks and > > re-enable them below the function. > > > > http://cr.openjdk.java.net/~simonis/webrevs/8059868/ > > > > Kind regards, > > > > Thomas Stuefe > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Oct 9 00:47:21 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 08 Oct 2014 17:47:21 -0700 Subject: RFR (XS) 8059904: libjvm_db.c warnings in solaris/sparc build with SS Message-ID: <5435DB19.7080005@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-8059904 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8059904-jvmdb-warn.1/ Summary: Several warnings started to appear at compilation of the libjvm_db.c after the switch from Sun C++ 5.10 to 5.12. The fix is to cast the result of calloc() to the correct type. 
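A minimal sketch of the pattern the summary describes, assuming the arrays hold uint64_t values as the sizeof(uint64_t) argument in the patch suggests. heap_table and its functions are hypothetical names used for illustration; they are not the jvm_agent_t structure from libjvm_db.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the agent structure: each field is an array
 * with one uint64_t entry per code heap. */
typedef struct {
    int32_t   number_of_heaps;
    uint64_t *heap_low;
    uint64_t *heap_high;
} heap_table;

/* Allocates zero-initialised arrays. The cast names the element type of
 * the destination pointer, which is what keeps a C++-style compiler
 * quiet about assigning the void* returned by calloc(). */
static int heap_table_alloc(heap_table *t, int32_t n) {
    assert(n > 0);
    t->number_of_heaps = n;
    t->heap_low  = (uint64_t *)calloc((size_t)n, sizeof(uint64_t));
    t->heap_high = (uint64_t *)calloc((size_t)n, sizeof(uint64_t));
    if (t->heap_low == NULL || t->heap_high == NULL) {
        free(t->heap_low);
        free(t->heap_high);
        t->heap_low = NULL;
        t->heap_high = NULL;
        return -1;
    }
    return 0;
}

static void heap_table_free(heap_table *t) {
    free(t->heap_low);
    free(t->heap_high);
    t->heap_low = NULL;
    t->heap_high = NULL;
}
```

The point of the cast is that it matches the declared type of the field being assigned; a cast to an unrelated pointer type would silence nothing and only move the mismatch.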
Testing: Running the adhoc pstack tests on Solaris sparcv9 and amd64 Thanks, Serguei From david.holmes at oracle.com Thu Oct 9 04:18:41 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 09 Oct 2014 14:18:41 +1000 Subject: RFR (XS) 8059904: libjvm_db.c warnings in solaris/sparc build with SS In-Reply-To: <5435DB19.7080005@oracle.com> References: <5435DB19.7080005@oracle.com> Message-ID: <54360CA1.1050608@oracle.com> On 9/10/2014 10:47 AM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-8059904 Shouldn't the same fix be applied to the bsd version? Otherwise looks okay. Thanks, David > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8059904-jvmdb-warn.1/ > > > > Summary: > > Several warnings started to appear at compilation of the libjvm_db.c > after the switch from Sun C++ 5.10 to 5.12. > The fix is to cast the result of calloc() to the correct type. > > > Testing: > Running the adhoc pstack tests on Solaris sparcv9 and amd64 > > > Thanks, > Serguei From mandy.chung at oracle.com Thu Oct 9 06:22:33 2014 From: mandy.chung at oracle.com (Mandy Chung) Date: Wed, 08 Oct 2014 23:22:33 -0700 Subject: RFR(M): JDK-8041639: Don't link the java_crw_demo shared library from product tools In-Reply-To: <5432E893.9080500@oracle.com> References: <5432E893.9080500@oracle.com> Message-ID: <543629A9.1060407@oracle.com> Hi Dmitry, On 10/6/2014 12:08 PM, Dmitry Samersoff wrote: > Hi Everybody, > > Please review: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8041639/webrev.01/ > > The fix moves code from java_crw_demo to libhprof. Also I remove mtrace > and minst demos that uses java_crw_demo library. > Do you really want to remove mtrace and minst demos? Perhaps just make a copy of crw.c and crw.h (instead of moving) although not ideal. I think you could simply copy the main part of README_CRW.txt (from line 44) to crw.h and no need to move that README to hprof source. 
Mandy From serguei.spitsyn at oracle.com Thu Oct 9 08:30:24 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 09 Oct 2014 01:30:24 -0700 Subject: RFR (XS) 8059904: libjvm_db.c warnings in solaris/sparc build with SS In-Reply-To: <54360CA1.1050608@oracle.com> References: <5435DB19.7080005@oracle.com> <54360CA1.1050608@oracle.com> Message-ID: <543647A0.5000308@oracle.com> David, Thank you for review! I've applied the same fix to the bsd version as it exists (see the patch below). However, I do not think the libjvm_db.so is currently used by the pstack utility on bsd or Mac OSX. The pstack utility must be updated to get use of it. One more observation is that the libjvm_db.c has a minor difference on bsd: % diff bsd/dtrace/libjvm_db.c solaris/dtrace/libjvm_db.c 29c29 < // not available on macosx #include --- > #include 493c493 < err = find_symbol(J, "__1cNMethodG__vtbl_", &J->Method_vtbl); --- > err = find_symbol(J, "__1cGMethodG__vtbl_", &J->Method_vtbl); This difference came with the fix of 6964458. It looks like the mangling rules are different on bsd vs on solaris. 
Thanks, Serguei % hg diff diff -r 795fc0cef7c9 src/os/bsd/dtrace/libjvm_db.c --- a/src/os/bsd/dtrace/libjvm_db.c Fri Oct 03 13:56:18 2014 -0700 +++ b/src/os/bsd/dtrace/libjvm_db.c Thu Oct 09 00:55:34 2014 -0700 @@ -347,10 +347,10 @@ &J->Number_of_heaps, sizeof(J->Number_of_heaps)); /* Allocate memory for heap configurations */ - J->Heap_low = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_high = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_segmap_low = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_segmap_high = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_low = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_high = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_segmap_low = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_segmap_high = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); /* Read code heap configurations */ for (i = 0; i < J->Number_of_heaps; ++i) { diff -r 795fc0cef7c9 src/os/solaris/dtrace/libjvm_db.c --- a/src/os/solaris/dtrace/libjvm_db.c Fri Oct 03 13:56:18 2014 -0700 +++ b/src/os/solaris/dtrace/libjvm_db.c Thu Oct 09 00:55:34 2014 -0700 @@ -347,10 +347,10 @@ &J->Number_of_heaps, sizeof(J->Number_of_heaps)); /* Allocate memory for heap configurations */ - J->Heap_low = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_high = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_segmap_low = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); - J->Heap_segmap_high = (jvm_agent_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_low = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_high = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_segmap_low = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); + J->Heap_segmap_high = (uint64_t*)calloc(J->Number_of_heaps, sizeof(uint64_t)); /* Read code heap 
configurations */ for (i = 0; i < J->Number_of_heaps; ++i) { On 10/8/14 9:18 PM, David Holmes wrote: > On 9/10/2014 10:47 AM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-8059904 > > Shouldn't the same fix be applied to the bsd version? > > Otherwise looks okay. > > Thanks, > David > >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8059904-jvmdb-warn.1/ >> >> >> >> >> Summary: >> >> Several warnings started to appear at compilation of the libjvm_db.c >> after the switch from Sun C++ 5.10 to 5.12. >> The fix is to cast the result of calloc() to the correct type. >> >> >> Testing: >> Running the adhoc pstack tests on Solaris sparcv9 and amd64 >> >> >> Thanks, >> Serguei From thomas.stuefe at gmail.com Thu Oct 9 09:17:24 2014 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 9 Oct 2014 11:17:24 +0200 Subject: Questions about com.sun.tools.attach implementation on windows Message-ID: Hi all, I have a question about the implementation of com.sun.tools.attach on Windows. when investigating a crashes which were related to VirtualMachineImpl.c (see https://bugs.openjdk.java.net/browse/JDK-8059868), the implementation seemed to me very unusual and fragile: We want to call monitoring functions in the target VM. To do that, we inject a code stub (compiled by the c++ compiler used to build the injector VM) into the target VM. Then we use CreateRemoteThread() to run this code. As a parameter to this function, we hand over addresses to Win32 APIS taken from the injector-VM. The target VM will call those functions, implicitly assuming the locations to be the same in both processes. Problems which may occur (at least as far as I understand it): - the code injected from the injector VM into the target VM must be position independent - something which may easily break with new security features Microsoft adds to their compiler (e.g. 
the /RTC1 crash reported above). - kernel32.dll must be always loaded at the same address for the Win32 API calls to work. I am not sure this is always the case now, with address space layout randomization. - I also could imagine (though I did not see that) virus scanners not being happy and reporting this activity as suspicious. Also, if the code fails - most likely crashes - effects are a bit harsh: - the target VM, which is "innocent" and may be stable and years old, will crash if the injector VM was compiled wrong, e.g. in a way which makes the injected code position dependent. This may happen by a new compiler switch or compiler update. - There is no way to get decent error handling, at least for 32bit, because we rely on stack based SEH handling and in the remote thread there is no SEH handler yet set up. The effect is sudden death of the target VM with no hs-err file. - The crash is difficult to figure out: first one will analyze the target VM, which is innocent, because it crashes. But we have no matching debug information for the injected stub, so debugging is difficult. I would love to know why we do it this way. I am sure there is a valid reason for it. Maybe backward compatibility? Thanks! and Kind Regards, Thomas From staffan.larsen at oracle.com Thu Oct 9 10:57:51 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Thu, 9 Oct 2014 12:57:51 +0200 Subject: Questions about com.sun.tools.attach implementation on windows In-Reply-To: References: Message-ID: I completely agree that this code is extremely fragile and quite "hacky" - there is no use defending it (except that it works most of the time). If we can come up with a better solution that allows us to attach to a running process, I am all for it. Perhaps shared memory and a shared mutex could be used? Or the Windows mailslots API?
(The mechanism on the non-windows platforms is also full of races and prone to leak resources, so changing those is also welcome.) /Staffan On 9 okt 2014, at 11:17, Thomas St?fe wrote: > > Hi all, > > I have a question about the implementation of com.sun.tools.attach on Windows. > > when investigating a crashes which were related to VirtualMachineImpl.c (see https://bugs.openjdk.java.net/browse/JDK-8059868), the implementation seemed to me very unusual and fragile: > > We want to call monitoring functions in the target VM. > > To do that, we inject a code stub (compiled by the c++ compiler used to build the injector VM) into the target VM. Then we use CreateRemoteThread() to run this code. > > As a parameter to this function, we hand over addresses to Win32 APIS taken from the injector-VM. The target VM will call those functions, implicitly assuming the locations to be the same in both processes. > > Problems which may occur (at least as far as I understand it): > > - the code injected from the injector VM into the target VM must be position independent - something which may easily break with new security features Microsoft adds to their compiler (e.g. the /RTC1 crash reported above). > > - kernel32.dll must be always loaded at the same address for the Win32 API calls to work. I am not sure this is always the case now, with address layout randomization. > > - I also could imagine (though I did not see that) virus scanners not being happy and reporting this activity as suspicious. > > Also, if the coding fails - most likely crashes - effects are a bit harsh: > > - the target VM, which is "innocent" and my be stable and years old, will crash if the injector VM was compiled wrong, e.g. in a way which makes the injected code position dependend. This may happen by a new compiler switch or compiler update. 
> > - There is no way to get decent error handling, at least for 32bit, because we rely on stack based SEH handling and in the remote thread there is no SEH handler yet set up. The effect is sudden death > of the target VM with no hs-err file. > > - The crash is difficult to figure out: first one will analyze the target VM, which is innocent, because it crashes. But we have no matching debug information for the injected stub, so debugging is difficult. > > I would love to know why we do it this way. I am sure there is a valid reason for it. Maybe Backward compatibility? > > Thanks! and Kind Regards, > > Thomas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Alan.Bateman at oracle.com Thu Oct 9 12:19:26 2014 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 09 Oct 2014 05:19:26 -0700 Subject: Questions about com.sun.tools.attach implementation on windows In-Reply-To: References: Message-ID: <54367D4E.3030002@oracle.com> On 09/10/2014 02:17, Thomas St?fe wrote: > > : > > I would love to know why we do it this way. I am sure there is a valid > reason for it. Maybe Backward compatibility? > This was a very typical way for debugging utilities to work at the time. It was never intended of course to be used to attach to non-HotSpot VMs and also pre-dates a lot of the security protections and other features that were subsequently introduced. One thing to know is there wasn't a strict requirement to be able to attach to JDKs running previous releases so if a new underlying mechanism is introduced then it shouldn't be a major compatibility issue. Also as the attach API is pluggable then it would be possible to have both old and new providers for a time for cases where it was really necessary to attach to older releases. 
-Alan From thomas.stuefe at gmail.com Fri Oct 10 10:54:58 2014 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 10 Oct 2014 12:54:58 +0200 Subject: Questions about com.sun.tools.attach implementation on windows In-Reply-To: <54367D4E.3030002@oracle.com> References: <54367D4E.3030002@oracle.com> Message-ID: Thanks for answering and clarifying the history! After reading up on it, it seems that kernel32.dll has a fixed base address and therefore is always loaded to the same base, system wide. If that is true, at least handing over addresses to GetProcAddress() etc. from injector to target VM should always work. On Thu, Oct 9, 2014 at 2:19 PM, Alan Bateman wrote: > On 09/10/2014 02:17, Thomas Stüfe wrote: > >> >> : >> >> I would love to know why we do it this way. I am sure there is a valid >> reason for it. Maybe Backward compatibility? >> >> This was a very typical way for debugging utilities to work at the time. > It was never intended of course to be used to attach to non-HotSpot VMs and > also pre-dates a lot of the security protections and other features that > were subsequently introduced. One thing to know is there wasn't a strict > requirement to be able to attach to JDKs running previous releases so if a > new underlying mechanism is introduced then it shouldn't be a major > compatibility issue. Also as the attach API is pluggable then it would be > possible to have both old and new providers for a time for cases where it > was really necessary to attach to older releases. > > -Alan > From staffan.larsen at oracle.com Fri Oct 10 10:59:22 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 10 Oct 2014 12:59:22 +0200 Subject: Questions about com.sun.tools.attach implementation on windows In-Reply-To: References: <54367D4E.3030002@oracle.com> Message-ID: Another suggestion for an alternative implementation is to use Windows named pipes.
/Staffan On 10 okt 2014, at 12:54, Thomas St?fe wrote: > Thanks for answering and clarifying the history! > > It seems after reading up on it, that kernel32.dll has a fixed base address and therefore is always loaded to the same base, system wide. If that is true, at least handing over addresses to GetProcAddress() etc from injector to target VM should always work. > > > > > > > On Thu, Oct 9, 2014 at 2:19 PM, Alan Bateman wrote: > On 09/10/2014 02:17, Thomas St?fe wrote: > > : > > I would love to know why we do it this way. I am sure there is a valid reason for it. Maybe Backward compatibility? > > This was a very typical way for debugging utilities to work at the time. It was never intended of course to be used to attach to non-HotSpot VMs and also pre-dates a lot of the security protections and other features that were subsequently introduced. One thing to know is there wasn't a strict requirement to be able to attach to JDKs running previous releases so if a new underlying mechanism is introduced then it shouldn't be a major compatibility issue. Also as the attach API is pluggable then it would be possible to have both old and new providers for a time for cases where it was really necessary to attach to older releases. > > -Alan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaroslav.bachorik at oracle.com Fri Oct 10 12:45:30 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 10 Oct 2014 14:45:30 +0200 Subject: RFR 8060120: Improve diagnostic output of StartManagementAgent test Message-ID: <5437D4EA.7030909@oracle.com> Please, review this simple patch to add some more diagnostic information for com/sun/tools/attach/StartManagementAgent.java Issue : https://bugs.openjdk.java.net/browse/JDK-8060120 Webrev: http://cr.openjdk.java.net/~jbachorik/8060120/webrev.00 This change is adding more details about what the test is actually performing. 
This is needed in order to get more insight into intermittent failures. Thanks, -JB- From serguei.spitsyn at oracle.com Mon Oct 13 21:29:05 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 13 Oct 2014 14:29:05 -0700 Subject: RFR (XS) 8060245: update bsd version of jhelper.d to be in sync with the fix of 8009204 on solaris Message-ID: <543C4421.2070604@oracle.com> Please, review the fix for: https://bugs.openjdk.java.net/browse/JDK-8060245 Open webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/hotspot/8060245-dtrace.1/ Summary: The fix of 8009204 for jhelper.d was applied to the Solaris version only but the bsd version must match it too. Testing: N/A: The jhelper.d is not used on bsd yet Thanks, Serguei From jaroslav.bachorik at oracle.com Tue Oct 14 09:35:34 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 14 Oct 2014 11:35:34 +0200 Subject: [ping] Re: RFR 8060120: Improve diagnostic output of StartManagementAgent test In-Reply-To: <5437D4EA.7030909@oracle.com> References: <5437D4EA.7030909@oracle.com> Message-ID: <543CEE66.80305@oracle.com> Pretty please .. this is a really simple change - just a few println() statements to gather more info about the intermittent failure. Thanks, -JB- On 10/10/2014 02:45 PM, Jaroslav Bachorik wrote: > Please, review this simple patch to add some more diagnostic information > for com/sun/tools/attach/StartManagementAgent.java > > Issue : https://bugs.openjdk.java.net/browse/JDK-8060120 > Webrev: http://cr.openjdk.java.net/~jbachorik/8060120/webrev.00 > > This change is adding more details about what the test is actually > performing. This is needed in order to get more insight into > intermittent failures. 
> > Thanks, > > -JB- From daniel.fuchs at oracle.com Tue Oct 14 09:39:36 2014 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 14 Oct 2014 11:39:36 +0200 Subject: [ping] Re: RFR 8060120: Improve diagnostic output of StartManagementAgent test In-Reply-To: <543CEE66.80305@oracle.com> References: <5437D4EA.7030909@oracle.com> <543CEE66.80305@oracle.com> Message-ID: <543CEF58.2090900@oracle.com> On 14/10/14 11:35, Jaroslav Bachorik wrote: > Pretty please .. this is a really simple change - just a few println() > statements to gather more info about the intermittent failure. > > Thanks, Looks good to me Jaroslav. I see no harm in it. best regards, -- daniel > > -JB- > > On 10/10/2014 02:45 PM, Jaroslav Bachorik wrote: >> Please, review this simple patch to add some more diagnostic information >> for com/sun/tools/attach/StartManagementAgent.java >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8060120 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8060120/webrev.00 >> >> This change is adding more details about what the test is actually >> performing. This is needed in order to get more insight into >> intermittent failures. >> >> Thanks, >> >> -JB- > From jaroslav.bachorik at oracle.com Tue Oct 14 10:46:55 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 14 Oct 2014 12:46:55 +0200 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process Message-ID: <543CFF1F.3040302@oracle.com> Please, review the following test change Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 The method jdk.testlibrary.ProcessTools.getOutput(process) waits for the given process to finish (process.waitFor()) before grabbing its outputs. However, the code does not handle the process.waitFor() being interrupted correctly - it just goes ahead and tries to obtain the exit code which will fail and leave the tested process running. 
The correct way is to forcibly destroy the process when process.waitFor() is interrupted or throws ExecutionException to make sure the process has actually exited before checking its exit code. Thanks, -JB- From staffan.larsen at oracle.com Tue Oct 14 14:13:17 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 14 Oct 2014 07:13:17 -0700 Subject: [ping] RFR 8060120: Improve diagnostic output of StartManagementAgent test In-Reply-To: <543CEE66.80305@oracle.com> References: <5437D4EA.7030909@oracle.com> <543CEE66.80305@oracle.com> Message-ID: Should you re-throw the Throwable at line 102? /Staffan On 14 okt 2014, at 02:35, Jaroslav Bachorik wrote: > Pretty please .. this is a really simple change - just a few println() statements to gather more info about the intermittent failure. > > Thanks, > > -JB- > > On 10/10/2014 02:45 PM, Jaroslav Bachorik wrote: >> Please, review this simple patch to add some more diagnostic information >> for com/sun/tools/attach/StartManagementAgent.java >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8060120 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8060120/webrev.00 >> >> This change is adding more details about what the test is actually >> performing. This is needed in order to get more insight into >> intermittent failures. >> >> Thanks, >> >> -JB- > From staffan.larsen at oracle.com Tue Oct 14 14:14:51 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 14 Oct 2014 07:14:51 -0700 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543CFF1F.3040302@oracle.com> References: <543CFF1F.3040302@oracle.com> Message-ID: <863325FE-AC64-44F8-93CD-D65B7AFE4840@oracle.com> Looks good! 
Thanks, /Staffan On 14 okt 2014, at 03:46, Jaroslav Bachorik wrote: > Please, review the following test change > > Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 > Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 > > The method jdk.testlibrary.ProcessTools.getOutput(process) waits for the given process to finish (process.waitFor()) before grabbing its outputs. However, the code does not handle the process.waitFor() being interrupted correctly - it just goes ahead and tries to obtain the exit code which will fail and leave the tested process running. > > The correct way is to forcibly destroy the process when process.waitFor() is interrupted or throws ExecutionException to make sure the process has actually exited before checking its exit code. > > Thanks, > > -JB- From dmitry.samersoff at oracle.com Tue Oct 14 17:46:34 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Tue, 14 Oct 2014 21:46:34 +0400 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending Message-ID: <543D617A.7030402@oracle.com> Please review a small fix: http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ Added couple of missed exception checks. -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From yumin.qi at oracle.com Tue Oct 14 18:40:24 2014 From: yumin.qi at oracle.com (Yumin Qi) Date: Tue, 14 Oct 2014 11:40:24 -0700 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <543C8ADE.5000309@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> Message-ID: <543D6E18.9060205@oracle.com> David, Thanks for the comment. See embedded. On 10/13/2014 7:30 PM, David Holmes wrote: > Hi Yumin, > > jdk9-dev is not the best place for code review requests. > serviceability-dev would be better for this test. 
> > On 14/10/2014 8:58 AM, Yumin Qi wrote: >> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >> >> the bug marked as confidential so post the webrev internally. > > Not any more :) > Thanks. I changed to non security related bug. Usually when test failed, a confidential bug is filed. I would like to create bug open if the test is in open part. >> Problem: The test case tries to load a class from the same jar via agent >> in the middle of loading another class from the jar via same class >> loader in same thread. The call happens in transform which is a rare >> case --- in middle of loading class, loading another class. The result >> is a CircularityError. When first class is in loading, in vm we put >> JarLoader$2 on place holder table, then we start the defineClass, which >> calls transform, begins loading the second class so go along the same >> routine for loading JarLoader$2 first, found it already in placeholder >> table. A CircularityError is thrown. >> Fix: The test case should not call loading class with same class loader >> in same thread from same jar in 'transform' method. I modify it loading >> with system class loader and we expect see ClassNotFoundException. >> Detail see bug comments. > > It is not clear to me that the test is incorrect. It is also unclear > why such an old test is now failing - we must have changed something. > And it's unclear whether what the test does with your change is > actually testing what the test wanted to test. 
> > It seems to me that the actual problem in the test is the reference to > the "main" thread ie: > > if (!tName.equals("main")) > > The test knows not to do the loading in the main thread, but has > overlooked the fact that the main thread, upon the end of main() > becomes the DestroyJavaVM thread - and it is that thread which > encounters the ClassCircularityError: > > Starting test with 1000 iterations > Thread 'DestroyJavaVM' has called transform() > > So perhaps the right fix is to expand the above to: > > if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) > > ? I admit I'm having trouble seeing the full picture in this test. > It is not DestroyJavaVM thread cause CircularityError. It is TestThread cause CircularityError. In TestThread (DestroyJavaVM may cause same I think, but not seen in debug): forName("TestClass2", true, classLoader); <---- the loader is customer loader which is obtained from agent code. -->...... transform(...) -->defineClass(...) -->...... call into vm, we need to load JarLoader$2 since JarLoader$1 used ->resolve_instance_class_or_null // here we create PlaceTableEntry for JarLoader$2, put into place holder table -->...... --->forName("TestClass3", true, classLoader); -->... transform(...) -->defineClass(...) -->...... call into vm again. Now JarLoader$2 is not loaded, but it is in placeholder table, so throw_circularity_error set and throw. ....... With custom loader, agent's transform will be called, then it loads TestClass3, repeat the same steps as loading TestClass2. The problem is JarLoader$2 has not been loaded yet but in place holder table (this is for checking CircularityError), then begins loading TestClass3, this is a recursive and embedded case. The non-failed case also saw CircularityError thrown, but somehow the test case did not fail. Design like this will cause call transform in transform which is the reason CircularityError thrown. 
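For reference, the thread-name condition David proposed (quoted above) can be sketched in Java as below. This is only a minimal illustration and not the actual test source: the class and method names (TransformGuardSketch, mayLoadInTransform) are hypothetical, and only the two thread names "main" and "DestroyJavaVM" come from the discussion. Whether such a guard addresses the nested transform() call described in the trace is exactly what is being debated here.

```java
// Hypothetical helper, not taken from ParallelTransformerLoader sources.
// Only the thread names "main" and "DestroyJavaVM" come from the discussion.
final class TransformGuardSketch {
    private TransformGuardSketch() {}

    /**
     * Returns true when a thread with the given name would be allowed
     * to trigger class loading from inside transform().
     */
    static boolean mayLoadInTransform(String threadName) {
        return !threadName.equals("main") && !threadName.equals("DestroyJavaVM");
    }
}
```

A transformer would presumably call this as mayLoadInTransform(Thread.currentThread().getName()) before attempting Class.forName().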
I have no idea about the original desin of the test case, but think it should do this. > > Looking at your change, don't leave commented out lines in the code: > 115 // ClassLoader loader = > ParallelTransformerLoaderAgent.getClassLoader(); > 118 //Class.forName("TestClass" + > index, true, loader); > Will remove Thanks Yumin > Thanks, > David > >> Thanks >> Yumin * From yumin.qi at oracle.com Tue Oct 14 18:52:01 2014 From: yumin.qi at oracle.com (Yumin Qi) Date: Tue, 14 Oct 2014 11:52:01 -0700 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <543D6E18.9060205@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> Message-ID: <543D70D1.9080608@oracle.com> I have to make a correction: DestroyJavaVM thread will not load TestClass2 so no CircularityError happened with it. When it is created and run, the loading of TestClass2 already finished. On 10/14/2014 11:40 AM, Yumin Qi wrote: > David, Thanks for the comment. See embedded. > > > On 10/13/2014 7:30 PM, David Holmes wrote: >> Hi Yumin, >> >> jdk9-dev is not the best place for code review requests. >> serviceability-dev would be better for this test. >> >> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>> >>> the bug marked as confidential so post the webrev internally. >> >> Not any more :) >> > Thanks. I changed to non security related bug. Usually when test > failed, a confidential bug is filed. I would like to create bug open > if the test is in open part. >>> Problem: The test case tries to load a class from the same jar via >>> agent >>> in the middle of loading another class from the jar via same class >>> loader in same thread. The call happens in transform which is a rare >>> case --- in middle of loading class, loading another class. 
The result >>> is a CircularityError. When first class is in loading, in vm we put >>> JarLoader$2 on place holder table, then we start the defineClass, which >>> calls transform, begins loading the second class so go along the same >>> routine for loading JarLoader$2 first, found it already in placeholder >>> table. A CircularityError is thrown. >>> Fix: The test case should not call loading class with same class loader >>> in same thread from same jar in 'transform' method. I modify it loading >>> with system class loader and we expect see ClassNotFoundException. >>> Detail see bug comments. >> >> It is not clear to me that the test is incorrect. It is also unclear >> why such an old test is now failing - we must have changed something. >> And it's unclear whether what the test does with your change is >> actually testing what the test wanted to test. >> >> It seems to me that the actual problem in the test is the reference >> to the "main" thread ie: >> >> if (!tName.equals("main")) >> >> The test knows not to do the loading in the main thread, but has >> overlooked the fact that the main thread, upon the end of main() >> becomes the DestroyJavaVM thread - and it is that thread which >> encounters the ClassCircularityError: >> >> Starting test with 1000 iterations >> Thread 'DestroyJavaVM' has called transform() >> >> So perhaps the right fix is to expand the above to: >> >> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >> >> ? I admit I'm having trouble seeing the full picture in this test. >> > It is not DestroyJavaVM thread cause CircularityError. It is > TestThread cause CircularityError. > In TestThread (DestroyJavaVM may cause same I think, but not seen in > debug): > > forName("TestClass2", true, classLoader); <---- the loader is > customer loader which is obtained from agent code. > -->...... transform(...) > -->defineClass(...) > -->...... 
call into vm, we need to load JarLoader$2 > since JarLoader$1 used > ->resolve_instance_class_or_null > // here we create PlaceTableEntry for > JarLoader$2, put into place holder table > -->...... > --->forName("TestClass3", true, > classLoader); > -->... transform(...) > -->defineClass(...) > -->...... call into vm > again. Now JarLoader$2 is not loaded, but it is in placeholder table, > so throw_circularity_error set and throw. > ....... > With custom loader, agent's transform will be called, then it > loads TestClass3, repeat the same steps as loading TestClass2. The > problem is JarLoader$2 has not been loaded yet but in place holder > table (this is for checking CircularityError), then begins loading > TestClass3, this is a recursive and embedded case. The non-failed > case also saw CircularityError thrown, but somehow the test case did > not fail. Design like this will cause call transform in transform > which is the reason CircularityError thrown. > > I have no idea about the original desin of the test case, but think > it should do this. 
> >> >> Looking at your change, don't leave commented out lines in the code: >> 115 // ClassLoader loader = >> ParallelTransformerLoaderAgent.getClassLoader(); >> 118 //Class.forName("TestClass" + >> index, true, loader); >> > Will remove > > Thanks > Yumin >> Thanks, >> David >> >>> Thanks >>> Yumin * > From david.holmes at oracle.com Wed Oct 15 00:10:41 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 10:10:41 +1000 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543CFF1F.3040302@oracle.com> References: <543CFF1F.3040302@oracle.com> Message-ID: <543DBB81.60409@oracle.com> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: > Please, review the following test change > > Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 > Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 > > The method jdk.testlibrary.ProcessTools.getOutput(process) waits for the > given process to finish (process.waitFor()) before grabbing its outputs. > However, the code does not handle the process.waitFor() being > interrupted correctly - it just goes ahead and tries to obtain the exit > code which will fail and leave the tested process running. > > The correct way is to forcibly destroy the process when > process.waitFor() is interrupted or throws ExecutionException to make > sure the process has actually exited before checking its exit code. Why is this correct? What gives the thread calling getOutput the right to terminate the target process just because that thread was interrupted while waiting? If the interrupting thread intended the interrupt to mean "forcibly terminate the process and interrupt all threads waiting on it" then that thread should be doing the termination _not_ the one that was interrupted! 
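The behaviour under review can be sketched in Java roughly as follows. This is a minimal sketch of the "destroy on interrupt" idea from the RFR description, not the code in the webrev: the class and method names (WaitForSketch, waitForOrDestroy) are hypothetical, and wrapping the interruption in an IllegalStateException is an assumption made here to keep the example short.

```java
// Hypothetical sketch; it does not reproduce jdk.testlibrary.ProcessTools.
final class WaitForSketch {
    private WaitForSketch() {}

    /**
     * Waits for the process and returns its exit code. If the waiting
     * thread is interrupted, the process is forcibly destroyed, the
     * interrupt status is restored, and an unchecked exception is thrown.
     */
    static int waitForOrDestroy(Process p) {
        try {
            return p.waitFor();
        } catch (InterruptedException e) {
            // The behaviour questioned above: the waiter kills the target.
            p.destroyForcibly();
            // Restore the flag so callers can still observe the interrupt.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting; process destroyed", e);
        }
    }
}
```

A caller that shares the concern raised above would instead propagate the interrupt without calling destroyForcibly(), leaving termination to whichever thread requested it.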
David > Thanks, > > -JB- From david.holmes at oracle.com Wed Oct 15 00:21:08 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 10:21:08 +1000 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543D617A.7030402@oracle.com> References: <543D617A.7030402@oracle.com> Message-ID: <543DBDF4.5050608@oracle.com> Hi Dmitry, On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: > Please review a small fix: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ > > Added couple of missed exception checks. Added checks look fine. Am wondering about: 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, result); 105 return resultList; If there is an exception pending due to the call what will resultList be set to? Hopefully NULL but the JNI spec says nothing. Thanks, David From david.holmes at oracle.com Wed Oct 15 00:28:15 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 10:28:15 +1000 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <543D6E18.9060205@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> Message-ID: <543DBF9F.5050507@oracle.com> Hi Yumin, On 15/10/2014 4:40 AM, Yumin Qi wrote: > David, Thanks for the comment. See embedded. > > > On 10/13/2014 7:30 PM, David Holmes wrote: >> Hi Yumin, >> >> jdk9-dev is not the best place for code review requests. >> serviceability-dev would be better for this test. >> >> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>> >>> the bug marked as confidential so post the webrev internally. >> >> Not any more :) >> > Thanks. I changed to non security related bug. Usually when test failed, > a confidential bug is filed. 
I would like to create bug open if the test > is in open part. >>> Problem: The test case tries to load a class from the same jar via agent >>> in the middle of loading another class from the jar via same class >>> loader in same thread. The call happens in transform which is a rare >>> case --- in middle of loading class, loading another class. The result >>> is a CircularityError. When first class is in loading, in vm we put >>> JarLoader$2 on place holder table, then we start the defineClass, which >>> calls transform, begins loading the second class so go along the same >>> routine for loading JarLoader$2 first, found it already in placeholder >>> table. A CircularityError is thrown. >>> Fix: The test case should not call loading class with same class loader >>> in same thread from same jar in 'transform' method. I modify it loading >>> with system class loader and we expect see ClassNotFoundException. >>> Detail see bug comments. >> >> It is not clear to me that the test is incorrect. It is also unclear >> why such an old test is now failing - we must have changed something. >> And it's unclear whether what the test does with your change is >> actually testing what the test wanted to test. >> >> It seems to me that the actual problem in the test is the reference to >> the "main" thread ie: >> >> if (!tName.equals("main")) >> >> The test knows not to do the loading in the main thread, but has >> overlooked the fact that the main thread, upon the end of main() >> becomes the DestroyJavaVM thread - and it is that thread which >> encounters the ClassCircularityError: >> >> Starting test with 1000 iterations >> Thread 'DestroyJavaVM' has called transform() >> >> So perhaps the right fix is to expand the above to: >> >> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >> >> ? I admit I'm having trouble seeing the full picture in this test. >> > It is not DestroyJavaVM thread cause CircularityError. It is TestThread > cause CircularityError. 
Not according to the bug report:

Starting test with 1000 iterationsThread 'DestroyJavaVM' has called transform()
Thread 'DestroyJavaVM' has called transform()
result=1
----------System.err:(14/920)----------
Exception in thread "main" java.lang.ClassCircularityError: sun/misc/URLClassPath$JarLoader$2

This shows that "main" got the CCE. Which in itself is confusing given we also report "Thread 'DestroyJavaVM' has called transform()" and they are in fact the same thread!

David
-----

> In TestThread (DestroyJavaVM may cause same I think, but not seen in
> debug):
>
> forName("TestClass2", true, classLoader); <---- the loader is
> customer loader which is obtained from agent code.
> -->...... transform(...)
> -->defineClass(...)
> -->...... call into vm, we need to load JarLoader$2
> since JarLoader$1 used
> ->resolve_instance_class_or_null
> // here we create PlaceTableEntry for
> JarLoader$2, put into place holder table
> -->......
> --->forName("TestClass3", true,
> classLoader);
> -->... transform(...)
> -->defineClass(...)
> -->...... call into vm
> again. Now JarLoader$2 is not loaded, but it is in placeholder table, so
> throw_circularity_error set and throw.
> .......
> With custom loader, agent's transform will be called, then it
> loads TestClass3, repeat the same steps as loading TestClass2. The
> problem is JarLoader$2 has not been loaded yet but in place holder table
> (this is for checking CircularityError), then begins loading TestClass3,
> this is a recursive and embedded case. The non-failed case also saw
> CircularityError thrown, but somehow the test case did not fail. Design
> like this will cause call transform in transform which is the reason
> CircularityError thrown.
>
> I have no idea about the original design of the test case, but think
> it should do this.
> >> >> Looking at your change, don't leave commented out lines in the code: >> 115 // ClassLoader loader = >> ParallelTransformerLoaderAgent.getClassLoader(); >> 118 //Class.forName("TestClass" + >> index, true, loader); >> > Will remove > > Thanks > Yumin >> Thanks, >> David >> >>> Thanks >>> Yumin * > From serguei.spitsyn at oracle.com Wed Oct 15 04:51:55 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 14 Oct 2014 21:51:55 -0700 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543D617A.7030402@oracle.com> References: <543D617A.7030402@oracle.com> Message-ID: <543DFD6B.2010300@oracle.com> Hi Dmitry, Looks good. Thanks, Serguei On 10/14/14 10:46 AM, Dmitry Samersoff wrote: > Please review a small fix: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ > > Added couple of missed exception checks. > From dmitry.samersoff at oracle.com Wed Oct 15 05:45:11 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 15 Oct 2014 09:45:11 +0400 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543DBDF4.5050608@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> Message-ID: <543E09E7.5030601@oracle.com> David, Agree, It's better to set resultList to NULL explicitly in case of exception. I'll change it. Thank you for the pointer. -Dmitry On 2014-10-15 04:21, David Holmes wrote: > Hi Dmitry, > > On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: >> Please review a small fix: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >> >> Added couple of missed exception checks. > > Added checks look fine. > > Am wondering about: > > 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, > result); > 105 return resultList; > > If there is an exception pending due to the call what will resultList be > set to? 
Hopefully NULL but the JNI spec says nothing. > > Thanks, > David > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the source code. From jaroslav.bachorik at oracle.com Wed Oct 15 07:50:29 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 15 Oct 2014 09:50:29 +0200 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543DBB81.60409@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> Message-ID: <543E2745.3000107@oracle.com> On 10/15/2014 02:10 AM, David Holmes wrote: > On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >> Please, review the following test change >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >> >> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for the >> given process to finish (process.waitFor()) before grabbing its outputs. >> However, the code does not handle the process.waitFor() being >> interrupted correctly - it just goes ahead and tries to obtain the exit >> code which will fail and leave the tested process running. >> >> The correct way is to forcibly destroy the process when >> process.waitFor() is interrupted or throws ExecutionException to make >> sure the process has actually exited before checking its exit code. > > Why is this correct? What gives the thread calling getOutput the right > to terminate the target process just because that thread was interrupted > while waiting? If the interrupting thread intended the interrupt to mean > "forcibly terminate the process and interrupt all threads waiting on it" > then that thread should be doing the termination _not_ the one that was > interrupted! Process.waitFor() gets interrupted by a thread unknown to the actual test case - probably the JTreg timeout thread. 
The interrupting thread doesn't know that it is supposed to destroy a process. Once JTreg can take care of cleaning up process tree upon exit this code wouldn't be needed. I was contemplating adding the check for "null" returned from ProcessTools.getOutput() and destroying the process inside the caller code - but this would have the same results as doing it in ProcessTools.getOutput() with the drawback of duplicating the same check everywhere ProcessTools.getOutput() would be used. A silent postcondition of ProcessTools.getOuptut() is that the target process has finished - and it holds for all the code paths except the InterruptedException handler. -JB- > > David > >> Thanks, >> >> -JB- From dmitry.samersoff at oracle.com Wed Oct 15 07:58:14 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 15 Oct 2014 11:58:14 +0400 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543DBDF4.5050608@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> Message-ID: <543E2916.3030409@oracle.com> David, After close look I don't think that Arrays.asList() could throw any exception here. We are checking for possible null pointer at ll. 74 So I would prefer to leave the code as is. -Dmitry On 2014-10-15 04:21, David Holmes wrote: > Hi Dmitry, > > On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: >> Please review a small fix: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >> >> Added couple of missed exception checks. > > Added checks look fine. > > Am wondering about: > > 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, > result); > 105 return resultList; > > If there is an exception pending due to the call what will resultList be > set to? Hopefully NULL but the JNI spec says nothing. 
> > Thanks, > David > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.holmes at oracle.com Wed Oct 15 08:11:36 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 18:11:36 +1000 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543E2745.3000107@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> <543E2745.3000107@oracle.com> Message-ID: <543E2C38.3030601@oracle.com> On 15/10/2014 5:50 PM, Jaroslav Bachorik wrote: > On 10/15/2014 02:10 AM, David Holmes wrote: >> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >>> Please, review the following test change >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >>> >>> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for the >>> given process to finish (process.waitFor()) before grabbing its outputs. >>> However, the code does not handle the process.waitFor() being >>> interrupted correctly - it just goes ahead and tries to obtain the exit >>> code which will fail and leave the tested process running. >>> >>> The correct way is to forcibly destroy the process when >>> process.waitFor() is interrupted or throws ExecutionException to make >>> sure the process has actually exited before checking its exit code. >> >> Why is this correct? What gives the thread calling getOutput the right >> to terminate the target process just because that thread was interrupted >> while waiting? If the interrupting thread intended the interrupt to mean >> "forcibly terminate the process and interrupt all threads waiting on it" >> then that thread should be doing the termination _not_ the one that was >> interrupted! 
>
> Process.waitFor() gets interrupted by a thread unknown to the actual
> test case - probably the JTreg timeout thread. The interrupting thread
> doesn't know that it is supposed to destroy a process. Once JTreg can
> take care of cleaning up process tree upon exit this code wouldn't be
> needed.
>
> I was contemplating adding the check for "null" returned from
> ProcessTools.getOutput() and destroying the process inside the caller
> code - but this would have the same results as doing it in
> ProcessTools.getOutput() with the drawback of duplicating the same check
> everywhere ProcessTools.getOutput() would be used.
>
> A silent postcondition of ProcessTools.getOutput() is that the target
> process has finished - and it holds for all the code paths except the
> InterruptedException handler.

That doesn't mean it is up to getOutput to forcibly terminate the process. Multi-process cancellation is tricky, and yes eventually jtreg will handle it. But this seems the wrong place to handle it now. Part of the flaw here is that getOutput should itself throw InterruptedException so that the caller is forced to deal with this - instead it just re-asserts the interrupt state. The caller has to be aware that the thread can be interrupted and do something appropriate - which may mean punting to its caller. This is akin to a thread catching InterruptedException and calling System.exit - it simply is not its job to make that kind of decision at that level!
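
A minimal sketch of the caller-side policy argued for above; it is not the actual jdk.testlibrary code. The helper lets InterruptedException propagate, and the code that started the process decides whether to destroy it. The names InterruptPolicySketch, waitForExit, runAndCleanUp and StubProcess are hypothetical, and StubProcess only stands in for a child process that never exits on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.CountDownLatch;

public class InterruptPolicySketch {

    // Hypothetical helper: waits for the process and lets interruption
    // propagate instead of swallowing it or killing the process itself.
    public static int waitForExit(Process process) throws InterruptedException {
        return process.waitFor();
    }

    // Caller-side policy: the code that started the process decides what an
    // interrupt means. Here it gives up on the test and destroys the child.
    public static boolean runAndCleanUp(Process process) {
        try {
            waitForExit(process);
            return true;
        } catch (InterruptedException e) {
            process.destroy();
            return false;
        }
    }

    // Illustrative stand-in for a child process that never exits unless destroyed.
    public static final class StubProcess extends Process {
        private final CountDownLatch exited = new CountDownLatch(1);

        @Override public OutputStream getOutputStream() { return new ByteArrayOutputStream(); }
        @Override public InputStream getInputStream() { return new ByteArrayInputStream(new byte[0]); }
        @Override public InputStream getErrorStream() { return new ByteArrayInputStream(new byte[0]); }

        @Override public int waitFor() throws InterruptedException {
            exited.await(); // throws InterruptedException if the thread is interrupted
            return exitValue();
        }

        @Override public int exitValue() {
            if (exited.getCount() > 0) {
                throw new IllegalThreadStateException("process has not exited");
            }
            return 1; // arbitrary non-zero exit code for a destroyed stub
        }

        @Override public void destroy() { exited.countDown(); }
    }

    public static void main(String[] args) {
        Process child = new StubProcess();
        Thread.currentThread().interrupt(); // simulate e.g. a harness timeout
        boolean completed = runAndCleanUp(child);
        System.out.println("completed=" + completed + ", alive=" + child.isAlive());
    }
}
```

Whether the caller also restores the interrupt flag or rethrows to its own caller is a separate choice; the sketch only shows where the destroy decision would live under this approach.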
David

> -JB-
>
>
>
>>
>> David
>>
>>> Thanks,
>>>
>>> -JB-
>

From david.holmes at oracle.com Wed Oct 15 08:17:47 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Oct 2014 18:17:47 +1000
Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending
In-Reply-To: <543E2916.3030409@oracle.com>
References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> <543E2916.3030409@oracle.com>
Message-ID: <543E2DAB.8030502@oracle.com>

On 15/10/2014 5:58 PM, Dmitry Samersoff wrote:
> David,
>
> After close look I don't think that Arrays.asList() could throw any
> exception here. We are checking for possible null pointer at ll. 74

I would think OOME is always possible. But it was just a concern - if the caller of getDiagnosticCommandArgumentInfoArray checks for exceptions before touching the return value, or the return value is actually NULL in that case, then it is okay.

Cheers,
David

> So I would prefer to leave the code as is.
>
> -Dmitry
>
> On 2014-10-15 04:21, David Holmes wrote:
>> Hi Dmitry,
>>
>> On 15/10/2014 3:46 AM, Dmitry Samersoff wrote:
>>> Please review a small fix:
>>>
>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/
>>>
>>> Added couple of missed exception checks.
>>
>> Added checks look fine.
>>
>> Am wondering about:
>>
>> 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid,
>> result);
>> 105 return resultList;
>>
>> If there is an exception pending due to the call what will resultList be
>> set to? Hopefully NULL but the JNI spec says nothing.
>> >> Thanks, >> David >> > > From dmitry.samersoff at oracle.com Wed Oct 15 09:02:28 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 15 Oct 2014 13:02:28 +0400 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543E2DAB.8030502@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> <543E2916.3030409@oracle.com> <543E2DAB.8030502@oracle.com> Message-ID: <543E3824.3080406@oracle.com> David, Added extra check to be on safe side. (in place - press shift-reload) http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ -Dmitry On 2014-10-15 12:17, David Holmes wrote: > On 15/10/2014 5:58 PM, Dmitry Samersoff wrote: >> David, >> >> After close look I don't think that Arrays.asList() could throw any >> exception here. We are checking for possible null pointer at ll. 74 > > I would think OOME is always possible. But it was just a concern - if > the caller of getDiagnosticCommandArgumentInfoArray checks for > exceptions before touching the return value, or the return value is > actually NULL in that case, then it is okay. > > Cheers, > David > >> So I would prefer to leave the code as is. >> >> -Dmitry >> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >> On 2014-10-15 04:21, David Holmes wrote: >>> Hi Dmitry, >>> >>> On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: >>>> Please review a small fix: >>>> >>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >>>> >>>> Added couple of missed exception checks. >>> >>> Added checks look fine. >>> >>> Am wondering about: >>> >>> 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, >>> result); >>> 105 return resultList; >>> >>> If there is an exception pending due to the call what will resultList be >>> set to? Hopefully NULL but the JNI spec says nothing. 
>>> >>> Thanks, >>> David >>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From david.holmes at oracle.com Wed Oct 15 10:03:14 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 20:03:14 +1000 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543E3824.3080406@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> <543E2916.3030409@oracle.com> <543E2DAB.8030502@oracle.com> <543E3824.3080406@oracle.com> Message-ID: <543E4662.2050603@oracle.com> Thanks Dmitry! Looks good. David On 15/10/2014 7:02 PM, Dmitry Samersoff wrote: > David, > > Added extra check to be on safe side. > > (in place - press shift-reload) > > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ > > -Dmitry > > On 2014-10-15 12:17, David Holmes wrote: >> On 15/10/2014 5:58 PM, Dmitry Samersoff wrote: >>> David, >>> >>> After close look I don't think that Arrays.asList() could throw any >>> exception here. We are checking for possible null pointer at ll. 74 >> >> I would think OOME is always possible. But it was just a concern - if >> the caller of getDiagnosticCommandArgumentInfoArray checks for >> exceptions before touching the return value, or the return value is >> actually NULL in that case, then it is okay. >> >> Cheers, >> David >> >>> So I would prefer to leave the code as is. >>> >>> -Dmitry >>> > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >>> On 2014-10-15 04:21, David Holmes wrote: >>>> Hi Dmitry, >>>> >>>> On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: >>>>> Please review a small fix: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >>>>> >>>>> Added couple of missed exception checks. >>>> >>>> Added checks look fine. 
>>>> >>>> Am wondering about: >>>> >>>> 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, >>>> result); >>>> 105 return resultList; >>>> >>>> If there is an exception pending due to the call what will resultList be >>>> set to? Hopefully NULL but the JNI spec says nothing. >>>> >>>> Thanks, >>>> David >>>> >>> >>> > > From dmitry.samersoff at oracle.com Wed Oct 15 10:08:23 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 15 Oct 2014 14:08:23 +0400 Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending Message-ID: <543E4797.9050302@oracle.com> Please review the fix: http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/ Added missed exception checks. -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From frederic.parain at oracle.com Wed Oct 15 10:16:45 2014 From: frederic.parain at oracle.com (Frederic Parain) Date: Wed, 15 Oct 2014 12:16:45 +0200 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543E3824.3080406@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> <543E2916.3030409@oracle.com> <543E2DAB.8030502@oracle.com> <543E3824.3080406@oracle.com> Message-ID: <543E498D.8090105@oracle.com> Looks good to me, thank you for fixing this. Fred On 15/10/2014 11:02, Dmitry Samersoff wrote: > David, > > Added extra check to be on safe side. > > (in place - press shift-reload) > > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ > > -Dmitry > > On 2014-10-15 12:17, David Holmes wrote: >> On 15/10/2014 5:58 PM, Dmitry Samersoff wrote: >>> David, >>> >>> After close look I don't think that Arrays.asList() could throw any >>> exception here. We are checking for possible null pointer at ll. 74 >> >> I would think OOME is always possible. 
But it was just a concern - if >> the caller of getDiagnosticCommandArgumentInfoArray checks for >> exceptions before touching the return value, or the return value is >> actually NULL in that case, then it is okay. >> >> Cheers, >> David >> >>> So I would prefer to leave the code as is. >>> >>> -Dmitry >>> > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >>> On 2014-10-15 04:21, David Holmes wrote: >>>> Hi Dmitry, >>>> >>>> On 15/10/2014 3:46 AM, Dmitry Samersoff wrote: >>>>> Please review a small fix: >>>>> >>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ >>>>> >>>>> Added couple of missed exception checks. >>>> >>>> Added checks look fine. >>>> >>>> Am wondering about: >>>> >>>> 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid, >>>> result); >>>> 105 return resultList; >>>> >>>> If there is an exception pending due to the call what will resultList be >>>> set to? Hopefully NULL but the JNI spec says nothing. >>>> >>>> Thanks, >>>> David >>>> >>> >>> > > -- Frederic Parain - Oracle Grenoble Engineering Center - France Phone: +33 4 76 18 81 17 Email: Frederic.Parain at oracle.com From david.holmes at oracle.com Wed Oct 15 10:27:18 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Oct 2014 20:27:18 +1000 Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending In-Reply-To: <543E4797.9050302@oracle.com> References: <543E4797.9050302@oracle.com> Message-ID: <543E4C06.2080601@oracle.com> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote: > Please review the fix: > > http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/ > > Added missed exception checks. src/jdk.jdwp.agent/share/native/libjdwp/outStream.c What is potentially posting the exception? 
David

From dmitry.samersoff at oracle.com Wed Oct 15 10:39:03 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Wed, 15 Oct 2014 14:39:03 +0400
Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending
In-Reply-To: <543E4C06.2080601@oracle.com>
References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com>
Message-ID: <543E4EC7.2060204@oracle.com>

On 2014-10-15 14:27, David Holmes wrote:
> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote:
>> Please review the fix:
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/
>>
>> Added missed exception checks.
>
> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c
>
> What is potentially posting the exception?

JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from commonRef_refToID()

-Dmitry

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From david.holmes at oracle.com Wed Oct 15 12:21:55 2014
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Oct 2014 22:21:55 +1000
Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending
In-Reply-To: <543E4EC7.2060204@oracle.com>
References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com> <543E4EC7.2060204@oracle.com>
Message-ID: <543E66E3.6040400@oracle.com>

On 15/10/2014 8:39 PM, Dmitry Samersoff wrote:
> On 2014-10-15 14:27, David Holmes wrote:
>> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote:
>>> Please review the fix:
>>>
>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/
>>>
>>> Added missed exception checks.
>>
>> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c
>>
>> What is potentially posting the exception?
> > JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from > commonRef_refToID() You mean this call: error = JVMTI_FUNC_PTR(gdata->jvmti,GetTag)(gdata->jvmti, ref, &tag); in findNodeByRef which is called by commonRef_refToID? JVM TI doesn't post exceptions. "JVM TI functions never throw exceptions; error conditions are communicated via the function return value. Any existing exception state is preserved across a call to a JVM TI function." http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html David > -Dmitry > > From jaroslav.bachorik at oracle.com Wed Oct 15 13:55:16 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 15 Oct 2014 15:55:16 +0200 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543E2C38.3030601@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> <543E2745.3000107@oracle.com> <543E2C38.3030601@oracle.com> Message-ID: <543E7CC4.4000504@oracle.com> On 10/15/2014 10:11 AM, David Holmes wrote: > On 15/10/2014 5:50 PM, Jaroslav Bachorik wrote: >> On 10/15/2014 02:10 AM, David Holmes wrote: >>> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >>>> Please, review the following test change >>>> >>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >>>> >>>> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for >>>> the >>>> given process to finish (process.waitFor()) before grabbing its >>>> outputs. >>>> However, the code does not handle the process.waitFor() being >>>> interrupted correctly - it just goes ahead and tries to obtain the exit >>>> code which will fail and leave the tested process running. >>>> >>>> The correct way is to forcibly destroy the process when >>>> process.waitFor() is interrupted or throws ExecutionException to make >>>> sure the process has actually exited before checking its exit code. 
>>> >>> Why is this correct? What gives the thread calling getOutput the right >>> to terminate the target process just because that thread was interrupted >>> while waiting? If the interrupting thread intended the interrupt to mean >>> "forcibly terminate the process and interrupt all threads waiting on it" >>> then that thread should be doing the termination _not_ the one that was >>> interrupted! >> >> Process.waitFor() gets interrupted by a thread unknown to the actual >> test case - probably the JTreg timeout thread. The interrupting thread >> doesn't know that it is supposed to destroy a process. Once JTreg can >> take care of cleaning up process tree upon exit this code wouldn't be >> needed. >> >> I was contemplating adding the check for "null" returned from >> ProcessTools.getOutput() and destroying the process inside the caller >> code - but this would have the same results as doing it in >> ProcessTools.getOutput() with the drawback of duplicating the same check >> everywhere ProcessTools.getOutput() would be used. >> >> A silent postcondition of ProcessTools.getOuptut() is that the target >> process has finished - and it holds for all the code paths except the >> InterruptedException handler. > > That doesn't mean it is up to getOutput to forcibly terminate the > process. Multi-process cancellation is tricky, and yes eventually jtreg > will handle it. But this seems the wrong place to handle it now. Part of > the flaw here is that getOutput should itself throw InterruptedException > so that the caller is forced to deal with this - instead it just > re-asserts the interrupt state. The caller has to be aware that the > thread can be interrupted and do something appropriate - which may mean > punting to its caller. This is akin to a thread catching > InterruptedException and calling System.exit - it simply is not its job > to make that kind of decision at that level! There is no other decision to make. Not as it is written today. 
You can call ProcessTools.getOutput() and check whether the result is null and then end the test process. There is no other sensible action. The Process.waitFor() was interrupted you have no data to perform the checks against so the test will fail and as such it should stop any external processes it has started. Yes, I can go through all the tests using ProcessTools.getOutput() and add `if (output == null) process.destroyForcible();` - would this make it a better solution than putting this logic inside ProcessTools.getOutput()? -JB- > > David > >> -JB- >> >> >> >>> >>> David >>> >>>> Thanks, >>>> >>>> -JB- >> From serguei.spitsyn at oracle.com Wed Oct 15 14:07:52 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 15 Oct 2014 07:07:52 -0700 Subject: RFR(S): JDK-8029465 warnings from b118 for jdk.src.share.native.sun.management: JNI exception pending In-Reply-To: <543E3824.3080406@oracle.com> References: <543D617A.7030402@oracle.com> <543DBDF4.5050608@oracle.com> <543E2916.3030409@oracle.com> <543E2DAB.8030502@oracle.com> <543E3824.3080406@oracle.com> Message-ID: <543E7FB8.3090502@oracle.com> Good. Thanks, Serguei On 10/15/14 2:02 AM, Dmitry Samersoff wrote: > David, > > Added extra check to be on safe side. > > (in place - press shift-reload) > > http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/ > > -Dmitry > > On 2014-10-15 12:17, David Holmes wrote: >> On 15/10/2014 5:58 PM, Dmitry Samersoff wrote: >>> David, >>> >>> After close look I don't think that Arrays.asList() could throw any >>> exception here. We are checking for possible null pointer at ll. 74 >> I would think OOME is always possible. But it was just a concern - if >> the caller of getDiagnosticCommandArgumentInfoArray checks for >> exceptions before touching the return value, or the return value is >> actually NULL in that case, then it is okay. >> >> Cheers, >> David >> >>> So I would prefer to leave the code as is. 
>>>
>>> -Dmitry
>>>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/
>>> On 2014-10-15 04:21, David Holmes wrote:
>>>> Hi Dmitry,
>>>>
>>>> On 15/10/2014 3:46 AM, Dmitry Samersoff wrote:
>>>>> Please review a small fix:
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8029465/webrev.01/
>>>>>
>>>>> Added couple of missed exception checks.
>>>> Added checks look fine.
>>>>
>>>> Am wondering about:
>>>>
>>>> 104 resultList = (*env)->CallStaticObjectMethod(env, arraysCls, mid,
>>>> result);
>>>> 105 return resultList;
>>>>
>>>> If there is an exception pending due to the call what will resultList be
>>>> set to? Hopefully NULL but the JNI spec says nothing.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>
>

From dmitry.samersoff at oracle.com Wed Oct 15 14:33:52 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Wed, 15 Oct 2014 18:33:52 +0400
Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending
In-Reply-To: <543E66E3.6040400@oracle.com>
References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com> <543E4EC7.2060204@oracle.com> <543E66E3.6040400@oracle.com>
Message-ID: <543E85D0.8080701@oracle.com>

David,

Sorry, copied wrong function!

I mean this call

weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref);

that can post OutOfMemoryError

commonRef_refToID() ->
createNode(JNIEnv *env, jobject ref) ->
weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref);

-Dmitry

On 2014-10-15 16:21, David Holmes wrote:
> On 15/10/2014 8:39 PM, Dmitry Samersoff wrote:
>> On 2014-10-15 14:27, David Holmes wrote:
>>> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote:
>>>> Please review the fix:
>>>>
>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/
>>>>
>>>> Added missed exception checks.
>>>
>>> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c
>>>
>>> What is potentially posting the exception?
>>
>> JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from
>> commonRef_refToID()
>
> You mean this call:
>
> error = JVMTI_FUNC_PTR(gdata->jvmti,GetTag)(gdata->jvmti, ref, &tag);
>
> in findNodeByRef which is called by commonRef_refToID? JVM TI doesn't
> post exceptions.
>
> "JVM TI functions never throw exceptions; error conditions are
> communicated via the function return value. Any existing exception state
> is preserved across a call to a JVM TI function."
>
> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html
>
> David
>
>> -Dmitry
>>
>>

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From yumin.qi at oracle.com Wed Oct 15 16:58:16 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 15 Oct 2014 09:58:16 -0700
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError
In-Reply-To: <543DBF9F.5050507@oracle.com>
References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com>
Message-ID: <543EA7A8.4030809@oracle.com>

David,

I will take another detailed trace to see where the exception begins in the main thread; it should not be thrown in the main thread. I only saw it thrown in TestThread, not main, not DestroyJavaVM. If that happens, maybe something is wrong in the VM. In the output of all 'failed' cases (many failures do not cause exception output, not caught), the main thread got the exception. That is not right.

Thanks
Yumin

On 10/14/2014 5:28 PM, David Holmes wrote:
> Hi Yumin,
>
> On 15/10/2014 4:40 AM, Yumin Qi wrote:
>> David, Thanks for the comment. See embedded.
>>
>>
>> On 10/13/2014 7:30 PM, David Holmes wrote:
>>> Hi Yumin,
>>>
>>> jdk9-dev is not the best place for code review requests.
>>> serviceability-dev would be better for this test.
>>> >>> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>>> >>>> the bug marked as confidential so post the webrev internally. >>> >>> Not any more :) >>> >> Thanks. I changed to non security related bug. Usually when test failed, >> a confidential bug is filed. I would like to create bug open if the test >> is in open part. >>>> Problem: The test case tries to load a class from the same jar via >>>> agent >>>> in the middle of loading another class from the jar via same class >>>> loader in same thread. The call happens in transform which is a rare >>>> case --- in middle of loading class, loading another class. The result >>>> is a CircularityError. When first class is in loading, in vm we put >>>> JarLoader$2 on place holder table, then we start the defineClass, >>>> which >>>> calls transform, begins loading the second class so go along the same >>>> routine for loading JarLoader$2 first, found it already in placeholder >>>> table. A CircularityError is thrown. >>>> Fix: The test case should not call loading class with same class >>>> loader >>>> in same thread from same jar in 'transform' method. I modify it >>>> loading >>>> with system class loader and we expect see ClassNotFoundException. >>>> Detail see bug comments. >>> >>> It is not clear to me that the test is incorrect. It is also unclear >>> why such an old test is now failing - we must have changed something. >>> And it's unclear whether what the test does with your change is >>> actually testing what the test wanted to test. 
>>> >>> It seems to me that the actual problem in the test is the reference to >>> the "main" thread ie: >>> >>> if (!tName.equals("main")) >>> >>> The test knows not to do the loading in the main thread, but has >>> overlooked the fact that the main thread, upon the end of main() >>> becomes the DestroyJavaVM thread - and it is that thread which >>> encounters the ClassCircularityError: >>> >>> Starting test with 1000 iterations >>> Thread 'DestroyJavaVM' has called transform() >>> >>> So perhaps the right fix is to expand the above to: >>> >>> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >>> >>> ? I admit I'm having trouble seeing the full picture in this test. >>> >> It is not DestroyJavaVM thread cause CircularityError. It is TestThread >> cause CircularityError. > > Not according to the bug report: > > Starting test with 1000 iterationsThread 'DestroyJavaVM' has called > transform() > Thread 'DestroyJavaVM' has called transform() > result=1 > ----------System.err:(14/920)---------- > Exception in thread "main" java.lang.ClassCircularityError: > sun/misc/URLClassPath$JarLoader$2 > > This shows that "main" got the CCE. Which in itself is confusing given > we also report "Thread 'DestroyJavaVM' has called transform()" and > they are in fact the same thread! > > David > ----- > > >> In TestThread (DestroyJavaVM may cause same I think, but not seen in >> debug): >> >> forName("TestClass2", true, classLoader); <---- the loader is >> customer loader which is obtained from agent code. >> -->...... transform(...) >> -->defineClass(...) >> -->...... call into vm, we need to load JarLoader$2 >> since JarLoader$1 used >> ->resolve_instance_class_or_null >> // here we create PlaceTableEntry for >> JarLoader$2, put into place holder table >> -->...... >> --->forName("TestClass3", true, >> classLoader); >> -->... transform(...) >> -->defineClass(...) >> -->...... call into vm >> again. 
Now JarLoader$2 is not loaded, but it is in placeholder table, so >> throw_circularity_error set and throw. >> ....... >> With custom loader, agent's transform will be called, then it >> loads TestClass3, repeat the same steps as loading TestClass2. The >> problem is JarLoader$2 has not been loaded yet but in place holder table >> (this is for checking CircularityError), then begins loading TestClass3, >> this is a recursive and embedded case. The non-failed case also saw >> CircularityError thrown, but somehow the test case did not fail. Design >> like this will cause call transform in transform which is the reason >> CircularityError thrown. >> >> I have no idea about the original desin of the test case, but think >> it should do this. >> >>> >>> Looking at your change, don't leave commented out lines in the code: >>> 115 // ClassLoader loader = >>> ParallelTransformerLoaderAgent.getClassLoader(); >>> 118 //Class.forName("TestClass" + >>> index, true, loader); >>> >> Will remove >> >> Thanks >> Yumin >>> Thanks, >>> David >>> >>>> Thanks >>>> Yumin * >> From david.holmes at oracle.com Thu Oct 16 00:14:06 2014 From: david.holmes at oracle.com (David Holmes) Date: Thu, 16 Oct 2014 10:14:06 +1000 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543E7CC4.4000504@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> <543E2745.3000107@oracle.com> <543E2C38.3030601@oracle.com> <543E7CC4.4000504@oracle.com> Message-ID: <543F0DCE.8090107@oracle.com> On 15/10/2014 11:55 PM, Jaroslav Bachorik wrote: > On 10/15/2014 10:11 AM, David Holmes wrote: >> On 15/10/2014 5:50 PM, Jaroslav Bachorik wrote: >>> On 10/15/2014 02:10 AM, David Holmes wrote: >>>> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >>>>> Please, review the following test change >>>>> >>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >>>>> Webrev: 
http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >>>>> >>>>> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for >>>>> the >>>>> given process to finish (process.waitFor()) before grabbing its >>>>> outputs. >>>>> However, the code does not handle the process.waitFor() being >>>>> interrupted correctly - it just goes ahead and tries to obtain the >>>>> exit >>>>> code which will fail and leave the tested process running. >>>>> >>>>> The correct way is to forcibly destroy the process when >>>>> process.waitFor() is interrupted or throws ExecutionException to make >>>>> sure the process has actually exited before checking its exit code. >>>> >>>> Why is this correct? What gives the thread calling getOutput the right >>>> to terminate the target process just because that thread was >>>> interrupted >>>> while waiting? If the interrupting thread intended the interrupt to >>>> mean >>>> "forcibly terminate the process and interrupt all threads waiting on >>>> it" >>>> then that thread should be doing the termination _not_ the one that was >>>> interrupted! >>> >>> Process.waitFor() gets interrupted by a thread unknown to the actual >>> test case - probably the JTreg timeout thread. The interrupting thread >>> doesn't know that it is supposed to destroy a process. Once JTreg can >>> take care of cleaning up process tree upon exit this code wouldn't be >>> needed. >>> >>> I was contemplating adding the check for "null" returned from >>> ProcessTools.getOutput() and destroying the process inside the caller >>> code - but this would have the same results as doing it in >>> ProcessTools.getOutput() with the drawback of duplicating the same check >>> everywhere ProcessTools.getOutput() would be used. >>> >>> A silent postcondition of ProcessTools.getOuptut() is that the target >>> process has finished - and it holds for all the code paths except the >>> InterruptedException handler. 
>>
>> That doesn't mean it is up to getOutput to forcibly terminate the
>> process. Multi-process cancellation is tricky, and yes eventually jtreg
>> will handle it. But this seems the wrong place to handle it now. Part of
>> the flaw here is that getOutput should itself throw InterruptedException
>> so that the caller is forced to deal with this - instead it just
>> re-asserts the interrupt state. The caller has to be aware that the
>> thread can be interrupted and do something appropriate - which may mean
>> punting to its caller. This is akin to a thread catching
>> InterruptedException and calling System.exit - it simply is not its job
>> to make that kind of decision at that level!
>
> There is no other decision to make. Not as it is written today. You can
> call ProcessTools.getOutput() and check whether the result is null and
> then end the test process. There is no other sensible action. The
> Process.waitFor() was interrupted you have no data to perform the checks
> against so the test will fail and as such it should stop any external
> processes it has started.
>
> Yes, I can go through all the tests using ProcessTools.getOutput() and
> add `if (output == null) process.destroyForcibly();` - would this make
> it a better solution than putting this logic inside
> ProcessTools.getOutput()?

It would be the correct solution. Hacking it into getOutput() is just a
convenience. The problem is that none of these tests have given enough
thought to the cancellation issue and general process management.

Sorry.

David

> -JB-
>
>>
>> David
>>
>>> -JB-
>>>
>>>
>>>
>>>>
>>>> David
>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>
>

From david.holmes at oracle.com Thu Oct 16 00:24:44 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Oct 2014 10:24:44 +1000
Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending
In-Reply-To: <543E85D0.8080701@oracle.com>
References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com> <543E4EC7.2060204@oracle.com> <543E66E3.6040400@oracle.com> <543E85D0.8080701@oracle.com>
Message-ID: <543F104C.3010208@oracle.com>

On 16/10/2014 12:33 AM, Dmitry Samersoff wrote:
> David,
>
> Sorry, copied wrong function!
>
> I mean this call
>
> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref);
>
> that can post OutOfMemoryError

Okay, so shouldn't that be where the exception is cleared:

    /* Create weak reference to make sure we have a reference */
    weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref);
    if (weakRef == NULL) {
+       // < clear exception here >
        jvmtiDeallocate(node);
        return NULL;
    }

Thanks,
David
-----

> commonRef_refToID() ->
>
> createNode(JNIEnv *env, jobject ref) ->
>
> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref);
>
> -Dmitry
>
> On 2014-10-15 16:21, David Holmes wrote:
>> On 15/10/2014 8:39 PM, Dmitry Samersoff wrote:
>>> On 2014-10-15 14:27, David Holmes wrote:
>>>> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote:
>>>>> Please review the fix:
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/
>>>>>
>>>>> Added missed exception checks.
>>>>
>>>> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c
>>>>
>>>> What is potentially posting the exception?
>>>
>>> JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from
>>> commonRef_refToID()
>>
>> You mean this call:
>>
>> error = JVMTI_FUNC_PTR(gdata->jvmti,GetTag)(gdata->jvmti, ref, &tag);
>
> x
>>
>> in findNodeByRef which is called by commonRef_refToID? JVM TI doesn't
>> post exceptions.
>> >> "JVM TI functions never throw exceptions; error conditions are >> communicated via the function return value. Any existing exception state >> is preserved across a call to a JVM TI function." >> >> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html >> >> David >> >>> -Dmitry >>> >>> > > From dmitry.samersoff at oracle.com Thu Oct 16 10:08:45 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 16 Oct 2014 14:08:45 +0400 Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending In-Reply-To: <543F104C.3010208@oracle.com> References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com> <543E4EC7.2060204@oracle.com> <543E66E3.6040400@oracle.com> <543E85D0.8080701@oracle.com> <543F104C.3010208@oracle.com> Message-ID: <543F992D.6010000@oracle.com> David, Changed. Thank you for review! please, see: http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.02/ -Dmitry On 2014-10-16 04:24, David Holmes wrote: > On 16/10/2014 12:33 AM, Dmitry Samersoff wrote: >> David, >> >> Sorry, copied wrong function! 
>> >> I mean this call >> >> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); >> >> that can post OutOfMemoryError > > Okay, so shouldn't that be where the exception is cleared: > > /* Create weak reference to make sure we have a reference */ > weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); > if (weakRef == NULL) { > + // < clear exception here > > jvmtiDeallocate(node); > return NULL; > } > > Thanks, > David > ----- > >> commonRef_refToID() -> >> >> createNode(JNIEnv *env, jobject ref) -> >> >> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); >> >> -Dmitry >> >> On 2014-10-15 16:21, David Holmes wrote: >>> On 15/10/2014 8:39 PM, Dmitry Samersoff wrote: >>>> On 2014-10-15 14:27, David Holmes wrote: >>>>> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote: >>>>>> Please review the fix: >>>>>> >>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/ >>>>>> >>>>>> Added missed exception checks. >>>>> >>>>> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c >>>>> >>>>> What is potentially posting the exception? >>>> >>>> JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from >>>> commonRef_refToID() >>> >>> You mean this call: >>> >>> error = JVMTI_FUNC_PTR(gdata->jvmti,GetTag)(gdata->jvmti, ref, >>> &tag); >> >> x >>> >>> in findNodeByRef which is called by commonRef_refToID? JVM TI doesn't >>> post exceptions. >>> >>> "JVM TI functions never throw exceptions; error conditions are >>> communicated via the function return value. Any existing exception state >>> is preserved across a call to a JVM TI function." >>> >>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html >>> >>> David >>> >>>> -Dmitry >>>> >>>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
From david.holmes at oracle.com Thu Oct 16 12:07:11 2014
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Oct 2014 22:07:11 +1000
Subject: RFR(S): JDK-8030708: warnings from b119 for jdk/src/share/back: JNI exception pending
In-Reply-To: <543F992D.6010000@oracle.com>
References: <543E4797.9050302@oracle.com> <543E4C06.2080601@oracle.com> <543E4EC7.2060204@oracle.com> <543E66E3.6040400@oracle.com> <543E85D0.8080701@oracle.com> <543F104C.3010208@oracle.com> <543F992D.6010000@oracle.com>
Message-ID: <543FB4EF.5050706@oracle.com>

Hi Dmitry,

On 16/10/2014 8:08 PM, Dmitry Samersoff wrote:
> David,
>
> Changed. Thank you for review!
>
> please, see:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.02/

  102     if (weakRef == NULL || (*env)->ExceptionCheck(env)) {

Isn't the only time it will return NULL when an exception occurs?
Conversely if an exception occurs then it must return NULL - so the
exception check seems redundant.

But this also suggests you need similar logic at:

  182     weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, node->ref);

  456     lref = JNI_FUNC_PTR(env,NewLocalRef)(env, node->ref);

Or more generally any JNI call from JVMTI should be wrapped in a way
that checks for exceptions and clears them.

David

> -Dmitry
>
> On 2014-10-16 04:24, David Holmes wrote:
>> On 16/10/2014 12:33 AM, Dmitry Samersoff wrote:
>>> David,
>>>
>>> Sorry, copied wrong function!
>>> >>> I mean this call >>> >>> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); >>> >>> that can post OutOfMemoryError >> >> Okay, so shouldn't that be where the exception is cleared: >> >> /* Create weak reference to make sure we have a reference */ >> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); >> if (weakRef == NULL) { >> + // < clear exception here > >> jvmtiDeallocate(node); >> return NULL; >> } >> >> Thanks, >> David >> ----- >> >>> commonRef_refToID() -> >>> >>> createNode(JNIEnv *env, jobject ref) -> >>> >>> weakRef = JNI_FUNC_PTR(env,NewWeakGlobalRef)(env, ref); >>> >>> -Dmitry >>> >>> On 2014-10-15 16:21, David Holmes wrote: >>>> On 15/10/2014 8:39 PM, Dmitry Samersoff wrote: >>>>> On 2014-10-15 14:27, David Holmes wrote: >>>>>> On 15/10/2014 8:08 PM, Dmitry Samersoff wrote: >>>>>>> Please review the fix: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8030708/webrev.01/ >>>>>>> >>>>>>> Added missed exception checks. >>>>>> >>>>>> src/jdk.jdwp.agent/share/native/libjdwp/outStream.c >>>>>> >>>>>> What is potentially posting the exception? >>>>> >>>>> JvmtiEnv::GetTag(jobject object, jlong* tag_ptr) called from >>>>> commonRef_refToID() >>>> >>>> You mean this call: >>>> >>>> error = JVMTI_FUNC_PTR(gdata->jvmti,GetTag)(gdata->jvmti, ref, >>>> &tag); >>> >>> x >>>> >>>> in findNodeByRef which is called by commonRef_refToID? JVM TI doesn't >>>> post exceptions. >>>> >>>> "JVM TI functions never throw exceptions; error conditions are >>>> communicated via the function return value. Any existing exception state >>>> is preserved across a call to a JVM TI function." 
>>>> >>>> http://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html >>>> >>>> David >>>> >>>>> -Dmitry >>>>> >>>>> >>> >>> > > From jaroslav.bachorik at oracle.com Fri Oct 17 09:55:15 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 17 Oct 2014 11:55:15 +0200 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <543F0DCE.8090107@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> <543E2745.3000107@oracle.com> <543E2C38.3030601@oracle.com> <543E7CC4.4000504@oracle.com> <543F0DCE.8090107@oracle.com> Message-ID: <5440E783.1000005@oracle.com> On 10/16/2014 02:14 AM, David Holmes wrote: > On 15/10/2014 11:55 PM, Jaroslav Bachorik wrote: >> On 10/15/2014 10:11 AM, David Holmes wrote: >>> On 15/10/2014 5:50 PM, Jaroslav Bachorik wrote: >>>> On 10/15/2014 02:10 AM, David Holmes wrote: >>>>> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >>>>>> Please, review the following test change >>>>>> >>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >>>>>> >>>>>> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for >>>>>> the >>>>>> given process to finish (process.waitFor()) before grabbing its >>>>>> outputs. >>>>>> However, the code does not handle the process.waitFor() being >>>>>> interrupted correctly - it just goes ahead and tries to obtain the >>>>>> exit >>>>>> code which will fail and leave the tested process running. >>>>>> >>>>>> The correct way is to forcibly destroy the process when >>>>>> process.waitFor() is interrupted or throws ExecutionException to make >>>>>> sure the process has actually exited before checking its exit code. >>>>> >>>>> Why is this correct? What gives the thread calling getOutput the right >>>>> to terminate the target process just because that thread was >>>>> interrupted >>>>> while waiting? 
If the interrupting thread intended the interrupt to >>>>> mean >>>>> "forcibly terminate the process and interrupt all threads waiting on >>>>> it" >>>>> then that thread should be doing the termination _not_ the one that >>>>> was >>>>> interrupted! >>>> >>>> Process.waitFor() gets interrupted by a thread unknown to the actual >>>> test case - probably the JTreg timeout thread. The interrupting thread >>>> doesn't know that it is supposed to destroy a process. Once JTreg can >>>> take care of cleaning up process tree upon exit this code wouldn't be >>>> needed. >>>> >>>> I was contemplating adding the check for "null" returned from >>>> ProcessTools.getOutput() and destroying the process inside the caller >>>> code - but this would have the same results as doing it in >>>> ProcessTools.getOutput() with the drawback of duplicating the same >>>> check >>>> everywhere ProcessTools.getOutput() would be used. >>>> >>>> A silent postcondition of ProcessTools.getOuptut() is that the target >>>> process has finished - and it holds for all the code paths except the >>>> InterruptedException handler. >>> >>> That doesn't mean it is up to getOutput to forcibly terminate the >>> process. Multi-process cancellation is tricky, and yes eventually jtreg >>> will handle it. But this seems the wrong place to handle it now. Part of >>> the flaw here is that getOutput should itself throw InterruptedException >>> so that the caller is forced to deal with this - instead it just >>> re-asserts the interrupt state. The caller has to be aware that the >>> thread can be interrupted and do something appropriate - which may mean >>> punting to its caller. This is akin to a thread catching >>> InterruptedException and calling System.exit - it simply is not its job >>> to make that kind of decision at that level! >> >> There is no other decision to make. Not as it is written today. You can >> call ProcessTools.getOutput() and check whether the result is null and >> then end the test process. 
There is no other sensible action. The
>> Process.waitFor() was interrupted you have no data to perform the checks
>> against so the test will fail and as such it should stop any external
>> processes it has started.
>>
>> Yes, I can go through all the tests using ProcessTools.getOutput() and
>> add `if (output == null) process.destroyForcibly();` - would this make
>> it a better solution than putting this logic inside
>> ProcessTools.getOutput()?
>
> It would be the correct solution. Hacking it into getOutput() is just a
> convenience. Problem is that none of these tests have given enough
> thought to the cancellation issue and general process management.

Agreed. My concern was that the test code base would be littered with
`if (output == null) process.destroyForcibly();` checks, because there is
no other way to react when process.waitFor() is interrupted - at least
not in the JTreg context.

Therefore I moved the logic for properly ending the external process
into the ProcessTools.executeProcess() method and restricted access to
the constructors of OutputBuffer and OutputAnalyzer, so that they can be
created only via ProcessTools.executeProcess(). Also, to prevent the
started process's stdout/stderr from overflowing, I moved the background
stream pumpers to OutputBuffer so that they are started ASAP, without
waiting for the process to exit (waiting defeats the purpose of consuming
the attached stdout/stderr streams in the background anyway).

With these changes the API user doesn't need to worry about the external
process cleanup anymore. The semantics of ProcessTools.executeProcess()
guarantee that there will be no orphan process hanging about once this
method returns.

This change is significantly bigger than the previous attempt because it
spans a lot of tests using the OutputAnalyzer but, hopefully, it
addresses David's concerns.

http://cr.openjdk.java.net/~jbachorik/8056143/webrev.01

-JB-

>
> Sorry.
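The guarantee described above for ProcessTools.executeProcess() (stream pumpers started before waiting, and no orphan process once the method returns) could look roughly like the sketch below. This is not the jdk.testlibrary code from the webrev: the class OrphanFreeExec, its Result and Pump helpers, and the await() method are hypothetical names made up for illustration, and Java 8 or later is assumed for Process.destroyForcibly(). On interrupt the sketch destroys the child and also propagates InterruptedException, so the caller is still forced to deal with the interruption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;

/** Illustrative sketch only; all names here are hypothetical. */
final class OrphanFreeExec {

    /** Exit code and captured output of a finished process. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Drains a stream on a daemon thread so the child never blocks on a full pipe. */
    static final class Pump extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

        Pump(InputStream in) {
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] chunk = new byte[4096];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
            } catch (IOException ignored) {
                // The stream was closed, e.g. because the process was destroyed.
            }
        }

        String text() {
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Starts the process and waits for it; never leaves it running on return. */
    static Result execute(ProcessBuilder pb) throws IOException, InterruptedException {
        return await(pb.start());
    }

    /** Waits for an already started process, destroying it if the wait is interrupted. */
    static Result await(Process p) throws InterruptedException {
        // Start pumping right away, before waiting for the exit.
        Pump out = new Pump(p.getInputStream());
        Pump err = new Pump(p.getErrorStream());
        out.start();
        err.start();
        try {
            int exitCode = p.waitFor();
            out.join();
            err.join();
            return new Result(exitCode, out.text(), err.text());
        } catch (InterruptedException e) {
            // Interrupted (e.g. by a harness timeout): kill the child and
            // wait until it is really gone before reporting the interrupt.
            p.destroyForcibly();
            boolean interruptedAgain = false;
            while (true) {
                try {
                    p.waitFor();
                    break;
                } catch (InterruptedException again) {
                    interruptedAgain = true;
                }
            }
            if (interruptedAgain) {
                Thread.currentThread().interrupt();
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Runs "java -version" from the current runtime as a harmless child process.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Result r = execute(new ProcessBuilder(java, "-version"));
        System.out.println("exit code: " + r.exitCode);
    }
}
```

Because the cleanup lives in one place, a caller only has to decide how to report the InterruptedException; it never needs its own `destroyForcibly()` call.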
> > David > >> -JB- >> >>> >>> David >>> >>>> -JB- >>>> >>>> >>>> >>>>> >>>>> David >>>>> >>>>>> Thanks, >>>>>> >>>>>> -JB- >>>> >> From jaroslav.bachorik at oracle.com Fri Oct 17 10:26:08 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 17 Oct 2014 12:26:08 +0200 Subject: RFR 8061312: Even more debug output needed Message-ID: <5440EEC0.1040409@oracle.com> Please, review the following addition to the test debug output Issue : https://bugs.openjdk.java.net/browse/JDK-8061312 Test Issue : https://bugs.openjdk.java.net/browse/JDK-8059949 Webrev : http://cr.openjdk.java.net/~jbachorik/8061312/webrev.00 The test 'com/sun/tools/attach/StartManagementAgent.java' is failing intermittently with timeouts. This change is to add more debug output to enable locating where the test gets stuck. Thanks, -JB- From staffan.larsen at oracle.com Fri Oct 17 16:02:04 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Fri, 17 Oct 2014 09:02:04 -0700 Subject: RFR 8061312: Even more debug output needed In-Reply-To: <5440EEC0.1040409@oracle.com> References: <5440EEC0.1040409@oracle.com> Message-ID: <6A2179DA-85DB-47AE-9226-E2C60E187B61@oracle.com> Looks good! Thanks, /Staffan On 17 okt 2014, at 03:26, Jaroslav Bachorik wrote: > Please, review the following addition to the test debug output > > Issue : https://bugs.openjdk.java.net/browse/JDK-8061312 > Test Issue : https://bugs.openjdk.java.net/browse/JDK-8059949 > Webrev : http://cr.openjdk.java.net/~jbachorik/8061312/webrev.00 > > The test 'com/sun/tools/attach/StartManagementAgent.java' is failing intermittently with timeouts. This change is to add more debug output to enable locating where the test gets stuck. 
>
> Thanks,
>
> -JB-

From olivier.lagneau at oracle.com Fri Oct 17 17:05:07 2014
From: olivier.lagneau at oracle.com (olivier.lagneau at oracle.com)
Date: Fri, 17 Oct 2014 19:05:07 +0200
Subject: RFR 8061312: Even more debug output needed
In-Reply-To: <5440EEC0.1040409@oracle.com>
References: <5440EEC0.1040409@oracle.com>
Message-ID: <54414C43.9050900@oracle.com>

Looks good,

Hope you will get enough information to see what happens.

Olivier.

On 17/10/2014 12:26, Jaroslav Bachorik wrote:
> Please, review the following addition to the test debug output
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8061312
> Test Issue : https://bugs.openjdk.java.net/browse/JDK-8059949
> Webrev : http://cr.openjdk.java.net/~jbachorik/8061312/webrev.00
>
> The test 'com/sun/tools/attach/StartManagementAgent.java' is failing
> intermittently with timeouts. This change is to add more debug output
> to enable locating where the test gets stuck.
>
> Thanks,
>
> -JB-

From yumin.qi at oracle.com Sat Oct 18 05:08:50 2014
From: yumin.qi at oracle.com (Yumin Qi)
Date: Fri, 17 Oct 2014 22:08:50 -0700
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError
In-Reply-To: <543EA7A8.4030809@oracle.com>
References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com> <543EA7A8.4030809@oracle.com>
Message-ID: <5441F5E2.4050904@oracle.com>

David, (cc Karen)

I think I understand why the CircularityError is reported in the 'main'
thread. The CircularityError is thrown in TestThread and handled during
class loading; the class being loaded is put on the unresolved list.
Note that we clear the pending exception and return null to the caller,
which then loads the instance class on its next search. No exception is
caught at the Java level in TestThread.

When main ends, we create a JavaThread named 'DestroyJavaVM' and give it
the current thread id, which is the main thread id. Since all JavaThread
objects should be freed when this last JavaThread exits, I have no idea
how the 'DestroyJavaVM' thread saw the exception. From the stack trace,
the call begins in

Shutdown.java:

    /* The preceding static fields are protected by this lock */
    private static class Lock { };
    private static Object lock = new Lock();   //<<<------------------------ line 61

Why does this call go through the agent and invoke transform? At
shutdown time, should requests to the agent be turned down?

Thanks
Yumin

java.lang.Exception: Stack trace
    at java.lang.Thread.dumpStack(Thread.java:1329)
    at ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92)
    at sun.instrument.TransformerManager.transform(TransformerManager.java:188)
    at sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428)
java.lang.Exception: Stack trace
    at java.lang.Thread.dumpStack(Thread.java:1329)
    at ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92)
    at sun.instrument.TransformerManager.transform(TransformerManager.java:188)
    at sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428)
    at java.lang.Shutdown.<clinit>(Shutdown.java:61)

This output in

On 10/15/2014 9:58 AM, Yumin Qi wrote:
> David,
>
> I will take another detail trace to see where the exception begins
> in main thread, it should not thrown in main thread. I only saw it is
> thrown in TestThread, not main, not DestroyJavaVM. If that happens,
> maybe something wrong in vm.
> The output in all 'failed' case (many failed not cause exception
> output, not caught), the main thread got the exception. That is not
> right.
>
> Thanks
> Yumin
>
> On 10/14/2014 5:28 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> On 15/10/2014 4:40 AM, Yumin Qi wrote:
>>> David, Thanks for the comment. See embedded.
>>> >>> >>> On 10/13/2014 7:30 PM, David Holmes wrote: >>>> Hi Yumin, >>>> >>>> jdk9-dev is not the best place for code review requests. >>>> serviceability-dev would be better for this test. >>>> >>>> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>>>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>>>> >>>>> the bug marked as confidential so post the webrev internally. >>>> >>>> Not any more :) >>>> >>> Thanks. I changed to non security related bug. Usually when test >>> failed, >>> a confidential bug is filed. I would like to create bug open if the >>> test >>> is in open part. >>>>> Problem: The test case tries to load a class from the same jar via >>>>> agent >>>>> in the middle of loading another class from the jar via same class >>>>> loader in same thread. The call happens in transform which is a rare >>>>> case --- in middle of loading class, loading another class. The >>>>> result >>>>> is a CircularityError. When first class is in loading, in vm we put >>>>> JarLoader$2 on place holder table, then we start the defineClass, >>>>> which >>>>> calls transform, begins loading the second class so go along the same >>>>> routine for loading JarLoader$2 first, found it already in >>>>> placeholder >>>>> table. A CircularityError is thrown. >>>>> Fix: The test case should not call loading class with same class >>>>> loader >>>>> in same thread from same jar in 'transform' method. I modify it >>>>> loading >>>>> with system class loader and we expect see ClassNotFoundException. >>>>> Detail see bug comments. >>>> >>>> It is not clear to me that the test is incorrect. It is also unclear >>>> why such an old test is now failing - we must have changed something. >>>> And it's unclear whether what the test does with your change is >>>> actually testing what the test wanted to test. 
>>>> >>>> It seems to me that the actual problem in the test is the reference to >>>> the "main" thread ie: >>>> >>>> if (!tName.equals("main")) >>>> >>>> The test knows not to do the loading in the main thread, but has >>>> overlooked the fact that the main thread, upon the end of main() >>>> becomes the DestroyJavaVM thread - and it is that thread which >>>> encounters the ClassCircularityError: >>>> >>>> Starting test with 1000 iterations >>>> Thread 'DestroyJavaVM' has called transform() >>>> >>>> So perhaps the right fix is to expand the above to: >>>> >>>> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >>>> >>>> ? I admit I'm having trouble seeing the full picture in this test. >>>> >>> It is not DestroyJavaVM thread cause CircularityError. It is TestThread >>> cause CircularityError. >> >> Not according to the bug report: >> >> Starting test with 1000 iterationsThread 'DestroyJavaVM' has called >> transform() >> Thread 'DestroyJavaVM' has called transform() >> result=1 >> ----------System.err:(14/920)---------- >> Exception in thread "main" java.lang.ClassCircularityError: >> sun/misc/URLClassPath$JarLoader$2 >> >> This shows that "main" got the CCE. Which in itself is confusing >> given we also report "Thread 'DestroyJavaVM' has called transform()" >> and they are in fact the same thread! >> >> David >> ----- >> >> >>> In TestThread (DestroyJavaVM may cause same I think, but not seen in >>> debug): >>> >>> forName("TestClass2", true, classLoader); <---- the loader is >>> customer loader which is obtained from agent code. >>> -->...... transform(...) >>> -->defineClass(...) >>> -->...... call into vm, we need to load JarLoader$2 >>> since JarLoader$1 used >>> ->resolve_instance_class_or_null >>> // here we create PlaceTableEntry for >>> JarLoader$2, put into place holder table >>> -->...... >>> --->forName("TestClass3", true, >>> classLoader); >>> -->... transform(...) >>> -->defineClass(...) >>> -->...... call into vm >>> again. 
Now JarLoader$2 is not loaded, but it is in placeholder
>>> table, so
>>> throw_circularity_error set and throw.
>>> .......
>>> With custom loader, agent's transform will be called, then it
>>> loads TestClass3, repeat the same steps as loading TestClass2. The
>>> problem is JarLoader$2 has not been loaded yet but in place holder
>>> table
>>> (this is for checking CircularityError), then begins loading
>>> TestClass3,
>>> this is a recursive and embedded case. The non-failed case also saw
>>> CircularityError thrown, but somehow the test case did not fail.
>>> Design
>>> like this will cause call transform in transform which is the reason
>>> CircularityError thrown.
>>>
>>> I have no idea about the original design of the test case, but think
>>> it should do this.
>>>
>>>>
>>>> Looking at your change, don't leave commented out lines in the code:
>>>> 115 // ClassLoader loader =
>>>> ParallelTransformerLoaderAgent.getClassLoader();
>>>> 118 //Class.forName("TestClass" +
>>>> index, true, loader);
>>>>
>>> Will remove
>>>
>>> Thanks
>>> Yumin
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks
>>>>> Yumin
>>>
>

From david.holmes at oracle.com Sat Oct 18 05:54:43 2014
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 18 Oct 2014 15:54:43 +1000
Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError
In-Reply-To: <5441F5E2.4050904@oracle.com>
References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com> <543EA7A8.4030809@oracle.com> <5441F5E2.4050904@oracle.com>
Message-ID: <544200A3.2050608@oracle.com>

Hi Yumin,

Quick response ... when shutdown is initiated the Shutdown class will be
loaded and initialized:

    at java.lang.Shutdown.<clinit>(Shutdown.java:61)

Presumably this static initialization is what triggers the involvement
of the agent to do the transform, and hence encounters the exception.
Though I'm unclear how it still reports "main" as the name when it has now become "DestroyJavaVM" David On 18/10/2014 3:08 PM, Yumin Qi wrote: > David, (cc Karen) > > I think I got why it throws CircularityError in 'main' thread. > The CircularityError thrown in TestThread, which was handled in > classloading, the loading class is put into unresolved list. Note we > clean pending exception and return null to caller, which in the search > next will load the instance class. There is no exception in java level > be caught in TestThread. > When main ended, we create a JavaThread named 'DestroyJavaVM' and > give the thread id the current thread id, which is the main thread id. > Since the All JavaThread object should be freed when this last > JavaThread exit, I have no idea how the 'DestroyJavaVM' thread saw the > exception, from the stack trace, the calling begins with > > ShutDown.java: > > /* The preceding static fields are protected by this lock */ > private static class Lock { }; > private static Object lock = new Lock(); > //<<<------------------------ line 61 > > How come the call via agent and call transform? At shutdown time, do > we need to turn down the request to agent at this time? 
> > Thanks > Yumin > > > java.lang.Exception: Stack trace > at java.lang.Thread.dumpStack(Thread.java:1329) > at > ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) > > at > sun.instrument.TransformerManager.transform(TransformerManager.java:188) > at > sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) > java.lang.Exception: Stack trace > at java.lang.Thread.dumpStack(Thread.java:1329) > at > ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) > > at > sun.instrument.TransformerManager.transform(TransformerManager.java:188) > at > sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) > at java.lang.Shutdown.(Shutdown.java:61) > > This output in > > > > > > On 10/15/2014 9:58 AM, Yumin Qi wrote: >> David, >> >> I will take another detail trace to see where the exception begins >> in main thread, it should not thrown in main thread. I only saw it is >> thrown in TestThread, not main, not DestroyJavaVM. If that happens, >> maybe something wrong in vm. >> The output in all 'failed' case (many failed not cause exception >> output, not caught), the main thread got the exception. That is not >> right. >> >> Thanks >> Yumin >> >> On 10/14/2014 5:28 PM, David Holmes wrote: >>> Hi Yumin, >>> >>> On 15/10/2014 4:40 AM, Yumin Qi wrote: >>>> David, Thanks for the comment. See embedded. >>>> >>>> >>>> On 10/13/2014 7:30 PM, David Holmes wrote: >>>>> Hi Yumin, >>>>> >>>>> jdk9-dev is not the best place for code review requests. >>>>> serviceability-dev would be better for this test. >>>>> >>>>> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>>>>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>>>>> >>>>>> the bug marked as confidential so post the webrev internally. >>>>> >>>>> Not any more :) >>>>> >>>> Thanks. I changed to non security related bug. 
Usually when test >>>> failed, >>>> a confidential bug is filed. I would like to create bug open if the >>>> test >>>> is in open part. >>>>>> Problem: The test case tries to load a class from the same jar via >>>>>> agent >>>>>> in the middle of loading another class from the jar via same class >>>>>> loader in same thread. The call happens in transform which is a rare >>>>>> case --- in middle of loading class, loading another class. The >>>>>> result >>>>>> is a CircularityError. When first class is in loading, in vm we put >>>>>> JarLoader$2 on place holder table, then we start the defineClass, >>>>>> which >>>>>> calls transform, begins loading the second class so go along the same >>>>>> routine for loading JarLoader$2 first, found it already in >>>>>> placeholder >>>>>> table. A CircularityError is thrown. >>>>>> Fix: The test case should not call loading class with same class >>>>>> loader >>>>>> in same thread from same jar in 'transform' method. I modify it >>>>>> loading >>>>>> with system class loader and we expect see ClassNotFoundException. >>>>>> Detail see bug comments. >>>>> >>>>> It is not clear to me that the test is incorrect. It is also unclear >>>>> why such an old test is now failing - we must have changed something. >>>>> And it's unclear whether what the test does with your change is >>>>> actually testing what the test wanted to test. 
>>>>> >>>>> It seems to me that the actual problem in the test is the reference to >>>>> the "main" thread ie: >>>>> >>>>> if (!tName.equals("main")) >>>>> >>>>> The test knows not to do the loading in the main thread, but has >>>>> overlooked the fact that the main thread, upon the end of main() >>>>> becomes the DestroyJavaVM thread - and it is that thread which >>>>> encounters the ClassCircularityError: >>>>> >>>>> Starting test with 1000 iterations >>>>> Thread 'DestroyJavaVM' has called transform() >>>>> >>>>> So perhaps the right fix is to expand the above to: >>>>> >>>>> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >>>>> >>>>> ? I admit I'm having trouble seeing the full picture in this test. >>>>> >>>> It is not DestroyJavaVM thread cause CircularityError. It is TestThread >>>> cause CircularityError. >>> >>> Not according to the bug report: >>> >>> Starting test with 1000 iterationsThread 'DestroyJavaVM' has called >>> transform() >>> Thread 'DestroyJavaVM' has called transform() >>> result=1 >>> ----------System.err:(14/920)---------- >>> Exception in thread "main" java.lang.ClassCircularityError: >>> sun/misc/URLClassPath$JarLoader$2 >>> >>> This shows that "main" got the CCE. Which in itself is confusing >>> given we also report "Thread 'DestroyJavaVM' has called transform()" >>> and they are in fact the same thread! >>> >>> David >>> ----- >>> >>> >>>> In TestThread (DestroyJavaVM may cause same I think, but not seen in >>>> debug): >>>> >>>> forName("TestClass2", true, classLoader); <---- the loader is >>>> customer loader which is obtained from agent code. >>>> -->...... transform(...) >>>> -->defineClass(...) >>>> -->...... call into vm, we need to load JarLoader$2 >>>> since JarLoader$1 used >>>> ->resolve_instance_class_or_null >>>> // here we create PlaceTableEntry for >>>> JarLoader$2, put into place holder table >>>> -->...... >>>> --->forName("TestClass3", true, >>>> classLoader); >>>> -->... transform(...) 
>>>> -->defineClass(...) >>>> -->...... call into vm >>>> again. Now JarLoader$2 is not loaded, but it is in placeholder >>>> table, so >>>> throw_circularity_error set and throw. >>>> ....... >>>> With custom loader, agent's transform will be called, then it >>>> loads TestClass3, repeat the same steps as loading TestClass2. The >>>> problem is JarLoader$2 has not been loaded yet but in place holder >>>> table >>>> (this is for checking CircularityError), then begins loading >>>> TestClass3, >>>> this is a recursive and embedded case. The non-failed case also saw >>>> CircularityError thrown, but somehow the test case did not fail. Design >>>> like this will cause call transform in transform which is the reason >>>> CircularityError thrown. >>>> >>>> I have no idea about the original desin of the test case, but think >>>> it should do this. >>>> >>>>> >>>>> Looking at your change, don't leave commented out lines in the code: >>>>> 115 // ClassLoader loader = >>>>> ParallelTransformerLoaderAgent.getClassLoader(); >>>>> 118 //Class.forName("TestClass" + >>>>> index, true, loader); >>>>> >>>> Will remove >>>> >>>> Thanks >>>> Yumin >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks >>>>>> Yumin * >>>> >> > From yumin.qi at oracle.com Sat Oct 18 06:51:15 2014 From: yumin.qi at oracle.com (Yumin Qi) Date: Fri, 17 Oct 2014 23:51:15 -0700 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <544200A3.2050608@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com> <543EA7A8.4030809@oracle.com> <5441F5E2.4050904@oracle.com> <544200A3.2050608@oracle.com> Message-ID: <54420DE3.3030707@oracle.com> in debugger, it indeed is 'main' thread id: output of loading TestClass1 (which is always in main): 'TestClass1', loader class loader 0x00007ffff2c62780a 'java/net/URLClassLoader', supername 'java/lang/Object' loadInstanceThreadQ 
threads:AllocatedObj(0x00007ffff000b000), superThreadQ threads: defineThreadQ threads:AllocatedObj(0x00007ffff000b000), thread_id is 24590, in debugger (note: gdb attach result)

I checked here:

jni_DestroyJavaVM:

    JNIWrapper("DestroyJavaVM");
    JNIEnv *env;
    JavaVMAttachArgs destroyargs;
    destroyargs.version = CurrentVersion;
    destroyargs.name = (char *)"DestroyJavaVM";
    destroyargs.group = NULL;
    res = vm->AttachCurrentThread((void **)&env, (void *)&destroyargs);
    if (res != JNI_OK) {
      return res;
    }

    // Since this is not a JVM_ENTRY we have to set the thread state manually before entering.
    JavaThread* thread = JavaThread::current();  // <----------------------------- we returned the JavaThread, same thread created in attach
    ThreadStateTransition::transition_from_native(thread, _thread_in_vm);
    if (Threads::destroy_vm()) {
      // Should not change thread state, VM is gone
      vm_created = false;
      res = JNI_OK;
      return res;
    } else {
      ThreadStateTransition::transition_and_fence(thread, _thread_in_vm, _thread_in_native);
      res = JNI_ERR;
      return res;
    }

(gdb) print *(OSThread*)0x7ffff000be70
$3 = {> = { = {_vptr.AllocatedObj = 0x7ffff7ca3870}, }, _start_proc = 0, _start_parm = 0x0, _state = RUNNABLE, _interrupted = 0, _thread_type = -235802127, _pthread_id = 140737326790400, _caller_sigmask = { __val = {0, 140737338666947, 4294967296, 140737344153969, 0, 140737266533664, 140737326787424, 140737338942742, 140737219968624, 140737219968624, 0, 140737219968624, 140737326787472, 140737338949853, 0, 140737338944081}}, sr = { _state = os::SuspendResume::SR_RUNNING}, _siginfo = 0x0, _ucontext = 0x0, _expanding_stack = 0, _alt_sig_stack = 0x0, _startThread_lock = 0x7ffff2c74520, _thread_id = 24590}

Note the thread_id is 24590. How the name is still 'main' in the stack trace is still not known.

Thanks
Yumin

On 10/17/2014 10:54 PM, David Holmes wrote: > Hi Yumin, > > Quick response ...
when shutdown is initiated the Shutdown class will > be loaded and initialized: > > at java.lang.Shutdown.(Shutdown.java:61) > > Presumably this static initialization is what triggers the involvement > of the agent to do the transform, and hence encounters the exception. > Though I'm unclear how it still reports "main" as the name when it has > now become "DestroyJavaVM" > > David > > On 18/10/2014 3:08 PM, Yumin Qi wrote: >> David, (cc Karen) >> >> I think I got why it throws CircularityError in 'main' thread. >> The CircularityError thrown in TestThread, which was handled in >> classloading, the loading class is put into unresolved list. Note we >> clean pending exception and return null to caller, which in the search >> next will load the instance class. There is no exception in java level >> be caught in TestThread. >> When main ended, we create a JavaThread named 'DestroyJavaVM' and >> give the thread id the current thread id, which is the main thread id. >> Since the All JavaThread object should be freed when this last >> JavaThread exit, I have no idea how the 'DestroyJavaVM' thread saw the >> exception, from the stack trace, the calling begins with >> >> ShutDown.java: >> >> /* The preceding static fields are protected by this lock */ >> private static class Lock { }; >> private static Object lock = new Lock(); >> //<<<------------------------ line 61 >> >> How come the call via agent and call transform? At shutdown time, do >> we need to turn down the request to agent at this time? 
From jaroslav.bachorik at oracle.com Mon Oct 20 11:12:32 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 20 Oct 2014 13:12:32 +0200 Subject: RFR 8058506: ThreadMXBeanStateTest throws exception Message-ID: <5444EE20.6090803@oracle.com> Please, review the following test change Issue : https://bugs.openjdk.java.net/browse/JDK-8058506 Webrev: http://cr.openjdk.java.net/~jbachorik/8058506/webrev.00 The test fails intermittently due to the log printing blocking the test thread from time to time, resulting in incorrect data reported by ThreadMXBean.
The solution is to use per-thread non-blocking StringBuilders (wrapped in Formatter instances) and aggregate the log output only after the test is finished. Thanks, -JB- From mandy.chung at oracle.com Tue Oct 21 02:21:17 2014 From: mandy.chung at oracle.com (Mandy Chung) Date: Mon, 20 Oct 2014 19:21:17 -0700 Subject: [PATCH] Return -1 after throwing internal error In-Reply-To: References: Message-ID: <5445C31D.7030106@oracle.com> Hi Xiaoguang, On 10/20/2014 6:31 PM, Xiaoguang Sun wrote: > Hi All, > > I recently discovered some inconsistency in UnixOperatingSystem_md.c > that does not return -1 after throwing internal error. It usually > shouldn't be a problem, but making it more consistent with other code > within the same file shouldn't be a bad idea. I'm including serviceability-dev for your patch as UnixOperatingSystem_md.c belongs to serviceability. Are you working on a clone of jdk9/dev repo? src/solaris/native/com/sun/management/UnixOperatingSystem_md.c looks like jdk7 source. It has been renamed to src/java.management/unix/native/libmanagement/OperatingSystemImpl.c [1]. Can you rebase your patch to the latest jdk9 source and send it to serviceability-dev? Mandy [1] http://hg.openjdk.java.net/jdk9/dev/jdk/ From erik.gahlin at oracle.com Tue Oct 21 13:27:03 2014 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Tue, 21 Oct 2014 15:27:03 +0200 Subject: RFR 8058506: ThreadMXBeanStateTest throws exception In-Reply-To: <5444EE20.6090803@oracle.com> References: <5444EE20.6090803@oracle.com> Message-ID: <54465F27.3080204@oracle.com> Have you considered creating a LogMessage class that keeps the logCntr value and the log message, instead of putting the counter into the log string and parsing it? Seems simpler and easier to understand.
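For illustration only, here is a minimal sketch of the LogMessage idea combined with per-thread buffering. The names `LogMessage` and `logCntr` come from the discussion; `PerThreadLog`, `log` and `collect` are hypothetical and are not taken from the webrev. Each thread appends to its own list, so logging never blocks on another thread or on I/O, and the records are ordered by counter only once the test is over.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not the code under review.
final class PerThreadLog {
    /** One record: the global sequence number and the already formatted text. */
    static final class LogMessage {
        final long counter;
        final String text;

        LogMessage(long counter, String text) {
            this.counter = counter;
            this.text = text;
        }
    }

    // Global ordering of messages across threads; AtomicLong never blocks.
    private static final AtomicLong logCntr = new AtomicLong();

    // Every per-thread buffer is registered here so it can be collected later.
    private static final Queue<List<LogMessage>> BUFFERS = new ConcurrentLinkedQueue<>();

    private static final ThreadLocal<List<LogMessage>> LOCAL = ThreadLocal.withInitial(() -> {
        List<LogMessage> buffer = new ArrayList<>();
        BUFFERS.add(buffer);
        return buffer;
    });

    /** Appends to the calling thread's own buffer; no shared lock, no I/O. */
    static void log(String format, Object... args) {
        LOCAL.get().add(new LogMessage(logCntr.getAndIncrement(), String.format(format, args)));
    }

    /** Merges all buffers by counter. Call only after the logging threads have been joined. */
    static List<String> collect() {
        List<LogMessage> all = new ArrayList<>();
        for (List<LogMessage> buffer : BUFFERS) {
            all.addAll(buffer);
        }
        all.sort(Comparator.comparingLong((LogMessage m) -> m.counter));
        List<String> lines = new ArrayList<>();
        for (LogMessage m : all) {
            lines.add(m.text);
        }
        return lines;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> log("worker says %d", 42));
        worker.start();
        worker.join();
        log("main done");
        collect().forEach(System.out::println);
    }
}
```

Keeping the counter as a field means nothing has to be parsed back out of the message text; the cost is one small object per log call, which is the trade-off discussed in this thread.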
Erik Jaroslav Bachorik skrev 2014-10-20 13:12: > Please, review the following test change > > Issue : https://bugs.openjdk.java.net/browse/JDK-8058506 > Webrev: http://cr.openjdk.java.net/~jbachorik/8058506/webrev.00 > > The test fails intermittently due to the log printing blocking the > test thread from time to time, resulting in incorrect data reported by > ThreadMXBean. > > The solution is to use per-thread non-blocking StringBuilders (wrapped > in Formatter instances) and aggregate the log output only after the > test is finished. > > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Tue Oct 21 17:47:33 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 21 Oct 2014 19:47:33 +0200 Subject: RFR 8058506: ThreadMXBeanStateTest throws exception In-Reply-To: <54465F27.3080204@oracle.com> References: <5444EE20.6090803@oracle.com> <54465F27.3080204@oracle.com> Message-ID: <54469C35.8090005@oracle.com> On 10/21/2014 03:27 PM, Erik Gahlin wrote: > Have you considered creating a LogMessage class that keeps the logCntr > value and the log message, instead of putting the counter into the log > string and parsing it. Yes. And didn't go that way in order to prevent creating a lot of throwaway stringbuilder instances (the Formatter works that way) - but it might (almost certainly) be a premature optimization. I will clean it up and resubmit the request. -JB- > > Seems simpler and easier to understand. > > Erik > > Jaroslav Bachorik skrev 2014-10-20 13:12: >> Please, review the following test change >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8058506 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8058506/webrev.00 >> >> The test fails intermittently due to the log printing blocking the >> test thread from time to time, resulting in incorrect data reported by >> ThreadMXBean. 
>> >> The solution is to use per-thread non-blocking StringBuilders (wrapped >> in Formatter instances) and aggregate the log output only after the >> test is finished. >> >> Thanks, >> >> -JB- > From brendan.d.gregg at gmail.com Wed Oct 22 00:10:31 2014 From: brendan.d.gregg at gmail.com (Brendan Gregg) Date: Tue, 21 Oct 2014 17:10:31 -0700 Subject: system profilers and incomplete stacks In-Reply-To: References: <1F3E1054-9947-43AC-9AD1-350E0174C9A1@oracle.com> <539FD60E.10203@oracle.com> Message-ID: G'Day, I checked the JDK 9 early access releases, but didn't see anything for JDK-6276264. I've also since learned that Twitter has an OpenJDK fork with frame pointers disabled, for the same purpose: stack profiling (using Linux perf_events). Might this be worked on for JDK 9? I can help test. thanks, Brendan On Mon, Jun 16, 2014 at 11:52 PM, Brendan Gregg wrote: > G'Day Serguei, > > On Mon, Jun 16, 2014 at 10:45 PM, serguei.spitsyn at oracle.com > wrote: >> >> Hi Brendan, >> >> We are aware of these issues and work with the Solaris team to fix them in >> JDK 9. >> One is the frame pointer is used by the server compiler as a general >> purpose register on intel. >> Another is about the virtual (or inlined) frames. >> >> There are a couple of related bugs: >> https://bugs.openjdk.java.net/browse/JDK-6617153 >> https://bugs.openjdk.java.net/browse/JDK-6276264 >> >> There can be more issues filed on this. > > > Ah, thanks, it's JDK-6276264. > > As Tom Rodriguez said at the time (2005): "The server VM uses the frame > pointer as an allocatable register and there's no way to turn that off." I > was really hoping there was a way to turn that off, like > -fno-omit-frame-pointer. > > This also means DTrace jstack() has never worked fully. For the applications > I tried it on, 50% of stacks were incomplete. Perhaps it wasn't that bad in > 2005. I've been getting more mileage today from Java profilers. > >> Please, note, that the jstack action is not implemented on Linux yet. 
> > > Linux doesn't have DTrace jstack(), no, but its perf_events does have support > for loading an auxiliary file of symbols, which can be created via a Java agent > for that purpose (eg, https://github.com/jrudolph/perf-map-agent). But that > hasn't been working fully for the same reason - incomplete stacks. > > Brendan > >> >> Thanks, >> Serguei >> >> >> >> On 6/16/14 5:14 PM, Brendan Gregg wrote: >> >> Thanks but no, I'm aware of that bug and workarounds (I'm using the >> LD_AUDIT_64=/usr/lib/dtrace/64/libdtrace_forceload.so workaround, which >> isn't mentioned in the bug comments, but probably should be). That bug is >> about missing symbols, but the stacks shown in that bug still go all the way >> to thread_start. My stacks often don't. >> >> For simple programs, the stacks are complete. But something complex (eg, >> vert.x with event loops), and the stacks are often incomplete, one frame >> only. Very much like what I see with -fomit-frame-pointer, although this is >> hotspot, not gcc. Such incomplete stacks are seen using either DTrace or >> perf_events. >> >> It was suggested to me to email the hotspot developers, because this may >> well be a hotspot optimization they are familiar with. It may also be >> something really obvious, like that the JVM breaks native stacks due to >> optimized frames / green threads / etc, and there is absolutely no way >> around it (no way to disable it). If that's true, it may also mean that the >> DTrace jstack() action has always had this issue. I'm still reading the >> source... >> >> Brendan >> >> >> >> On Mon, Jun 16, 2014 at 4:04 AM, Staffan Larsen >> wrote: >>> >>> I think this is the bug you are looking at: >>> https://bugs.openjdk.java.net/browse/JDK-7187999, but I'll defer to someone >>> else to confirm. >>> >>> /Staffan >>> >>> >>> On 16 jun 2014, at 12:47, Roland Westrelin >>> wrote: >>> >>> Forwarding to serviceability alias where this question belongs I think.
>>> >>> Begin forwarded message: >>> >>> From: Brendan Gregg >>> Subject: system profilers and incomplete stacks >>> Date: June 12, 2014 at 7:15:54 PM GMT+2 >>> To: hotspot-compiler-dev at openjdk.java.net >>> >>> G'Day, >>> >>> Is there a way to run hotspot so that a system profiler (eg, DTrace, or >>> Linux perf_events) can measure complete stacks? I often get incomplete, >>> partial stacks, with one or a few frames only. I'm not worried about symbols >>> right now, what I'd like is to walk stacks all the way down to thread start. >>> >>> I've been browsing the hotspot code, but haven't found out how yet. I >>> suspect it's related to Java optimized frames, and has ditched the frame >>> pointer. I was looking for an equivalent -fno-omit-frame-pointer option. >>> >>> Here's an example: >>> >>> # dtrace -n 'profile-99 /execname == "java"/ { @[jstack(100, 8000)] = >>> count(); }' >>> [...] >>> org/mozilla/javascript/ >>> >>> ScriptableObject.createSlot(Ljava/lang/String;II)Lorg/mozilla/javascript/ScriptableObject$Slot;* >>> 0x884acce8200002da >>> 1 >>> >>> sun/nio/ch/SocketChannelImpl.read(Ljava/nio/ByteBuffer;)I* >>> 0xffffffff20007f4b >>> 1 >>> >>> >>> org/mozilla/javascript/ScriptRuntime.newObjectLiteral([Ljava/lang/Object;[Ljava/lang/Object;[ILorg/mozilla/javascript/Context;Lorg/mozilla/javascript/Scriptable;)Lorg/mozilla/javascript/Scriptable;* >>> 0xa20000041 >>> 1 >>> [...] >>> >>> I see similar incomplete stacks with Linux perf_events. Oracle JDKs from >>> 6 to 8, and OpenJDK. 
>>> >>> thanks, >>> >>> Brendan >>> -- >>> http://www.brendangregg.com >>> >>> >>> >> >> >> >> -- >> http://www.brendangregg.com >> >> > > > > -- > http://www.brendangregg.com From david.holmes at oracle.com Wed Oct 22 02:04:09 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 22 Oct 2014 12:04:09 +1000 Subject: RFR 8056143: interrupted java/lang/management/MemoryMXBean/LowMemoryTest.java leaves running process In-Reply-To: <5440E783.1000005@oracle.com> References: <543CFF1F.3040302@oracle.com> <543DBB81.60409@oracle.com> <543E2745.3000107@oracle.com> <543E2C38.3030601@oracle.com> <543E7CC4.4000504@oracle.com> <543F0DCE.8090107@oracle.com> <5440E783.1000005@oracle.com> Message-ID: <54471099.9010507@oracle.com> Sorry for the delay in getting back to this - I had a long weekend. :) I think this new approach is great! So it is a big Thumbs Up from me! Thanks, David On 17/10/2014 7:55 PM, Jaroslav Bachorik wrote: > On 10/16/2014 02:14 AM, David Holmes wrote: >> On 15/10/2014 11:55 PM, Jaroslav Bachorik wrote: >>> On 10/15/2014 10:11 AM, David Holmes wrote: >>>> On 15/10/2014 5:50 PM, Jaroslav Bachorik wrote: >>>>> On 10/15/2014 02:10 AM, David Holmes wrote: >>>>>> On 14/10/2014 8:46 PM, Jaroslav Bachorik wrote: >>>>>>> Please, review the following test change >>>>>>> >>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8056143 >>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8056143/webrev.00 >>>>>>> >>>>>>> The method jdk.testlibrary.ProcessTools.getOutput(process) waits for >>>>>>> the >>>>>>> given process to finish (process.waitFor()) before grabbing its >>>>>>> outputs. >>>>>>> However, the code does not handle the process.waitFor() being >>>>>>> interrupted correctly - it just goes ahead and tries to obtain the >>>>>>> exit >>>>>>> code which will fail and leave the tested process running. 
>>>>>>> >>>>>>> The correct way is to forcibly destroy the process when >>>>>>> process.waitFor() is interrupted or throws ExecutionException to >>>>>>> make >>>>>>> sure the process has actually exited before checking its exit code. >>>>>> >>>>>> Why is this correct? What gives the thread calling getOutput the >>>>>> right >>>>>> to terminate the target process just because that thread was >>>>>> interrupted >>>>>> while waiting? If the interrupting thread intended the interrupt to >>>>>> mean >>>>>> "forcibly terminate the process and interrupt all threads waiting on >>>>>> it" >>>>>> then that thread should be doing the termination _not_ the one that >>>>>> was >>>>>> interrupted! >>>>> >>>>> Process.waitFor() gets interrupted by a thread unknown to the actual >>>>> test case - probably the JTreg timeout thread. The interrupting thread >>>>> doesn't know that it is supposed to destroy a process. Once JTreg can >>>>> take care of cleaning up process tree upon exit this code wouldn't be >>>>> needed. >>>>> >>>>> I was contemplating adding the check for "null" returned from >>>>> ProcessTools.getOutput() and destroying the process inside the caller >>>>> code - but this would have the same results as doing it in >>>>> ProcessTools.getOutput() with the drawback of duplicating the same >>>>> check >>>>> everywhere ProcessTools.getOutput() would be used. >>>>> >>>>> A silent postcondition of ProcessTools.getOutput() is that the target >>>>> process has finished - and it holds for all the code paths except the >>>>> InterruptedException handler. >>>> >>>> That doesn't mean it is up to getOutput to forcibly terminate the >>>> process. Multi-process cancellation is tricky, and yes eventually jtreg >>>> will handle it. But this seems the wrong place to handle it now. >>>> Part of >>>> the flaw here is that getOutput should itself throw >>>> InterruptedException >>>> so that the caller is forced to deal with this - instead it just >>>> re-asserts the interrupt state.
The caller has to be aware that the >>>> thread can be interrupted and do something appropriate - which may mean >>>> punting to its caller. This is akin to a thread catching >>>> InterruptedException and calling System.exit - it simply is not its job >>>> to make that kind of decision at that level! >>> >>> There is no other decision to make. Not as it is written today. You can >>> call ProcessTools.getOutput() and check whether the result is null and >>> then end the test process. There is no other sensible action. The >>> Process.waitFor() was interrupted you have no data to perform the checks >>> against so the test will fail and as such it should stop any external >>> processes it has started. >>> >>> Yes, I can go through all the tests using ProcessTools.getOutput() and >>> add `if (output == null) process.destroyForcible();` - would this make >>> it a better solution than putting this logic inside >>> ProcessTools.getOutput()? >> >> It would be the correct solution. Hacking it into getOutput() is just a >> convenience. Problem is that none of these tests have given enough >> thought to the cancellation issue and general process management. > > Agreed. My concern was that the test code base would have been littered > with `if (output == null) process.destroyForcible();` checks because > there is no other way to react to the situation when process.waitFor() > is interrupted - at least not in the JTreg context. > > Therefore I put the logic of properly ending the external process to > ProcessTools.executeProcess() method and restricted access to the > constructors of OutputBuffer and OutputAnalyzer to enforce their > creation only via ProcessTools.executeProcess(). > > Also, in order to prevent the started process stdout/stderr overflow I > moved the backround stream pumpers to OutputBuffer so they would be > started ASAP, without waiting for the process to exit (which defeats the > purpose of consuming the attached stdout/stderr streams in backround > anyway). 
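A minimal Java sketch of the guarantee discussed above: the waiting code makes sure the external process is gone if the wait is interrupted, and the interrupt still propagates to the caller. The class and method names (ProcessCleanup, waitForOrDestroy) are hypothetical and are not the jdk.testlibrary API from the webrev. Only Process.waitFor() and Process.destroyForcibly() are real JDK calls.

```java
public final class ProcessCleanup {
    private ProcessCleanup() {}

    /**
     * Hypothetical helper (an assumption for illustration, not ProcessTools):
     * waits for the process to exit and forcibly destroys it if the wait
     * does not complete normally, for example because of an interrupt.
     */
    public static int waitForOrDestroy(Process process) throws InterruptedException {
        boolean finished = false;
        try {
            int exitCode = process.waitFor();
            finished = true;
            return exitCode;
        } finally {
            if (!finished) {
                // The wait was cut short: do not leave an orphan process behind.
                // The InterruptedException still propagates to the caller.
                process.destroyForcibly();
            }
        }
    }
}
```

With this shape the caller is still forced to handle InterruptedException, but never has to remember to destroy the process itself.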
> > With these changes the API user doesn't need to worry about the external > process cleanup anymore. The semantics of ProcessTools.executeProcess() > guarantees that there will be no orphan process hanging about once this > method returns. > > This change is significantly bigger than the previous attempt because it > spans a lot of tests using the OutputAnalyzer but, hopefully, it > addresses David's concerns. > > http://cr.openjdk.java.net/~jbachorik/8056143/webrev.01 > > -JB- > >> >> Sorry. >> >> David >> >>> -JB- >>> >>>> >>>> David >>>> >>>>> -JB- >>>>> >>>>> >>>>> >>>>>> >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -JB- >>>>> >>> > From yumin.qi at oracle.com Wed Oct 22 02:52:47 2014 From: yumin.qi at oracle.com (Yumin Qi) Date: Tue, 21 Oct 2014 19:52:47 -0700 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <544200A3.2050608@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com> <543EA7A8.4030809@oracle.com> <5441F5E2.4050904@oracle.com> <544200A3.2050608@oracle.com> Message-ID: <54471BFF.30303@oracle.com> Hi, David and all, Second webrev here: http://cr.openjdk.java.net/~minqi/8038468/webrev01/ Answer to David's question about 'main' and 'DestroyJavaVM'. I still did not find how when exception printing the stack trace, 'main' was retrieved but, at the moment JavaThread for "DestroyJavaVM' was created, 'main' is not dead. They may exist and with same C thread and id. This may cause we got 'main' not 'DestroyJavaVM'. Loading another class from agent in 'transform' with same custom class loader is not a good design. We already have two threads loading from the agent in parallel, TestClass1 in 'main' and TestClass2 in 'TestThread'. Should avoid loading another class with same agent in 'transform' in nested. 
Thanks Yumin On 10/17/2014 10:54 PM, David Holmes wrote: > Hi Yumin, > > Quick response ... when shutdown is initiated the Shutdown class will > be loaded and initialized: > > at java.lang.Shutdown.(Shutdown.java:61) > > Presumably this static initialization is what triggers the involvement > of the agent to do the transform, and hence encounters the exception. > Though I'm unclear how it still reports "main" as the name when it has > now become "DestroyJavaVM" > > David > > On 18/10/2014 3:08 PM, Yumin Qi wrote: >> David, (cc Karen) >> >> I think I got why it throws CircularityError in 'main' thread. >> The CircularityError thrown in TestThread, which was handled in >> classloading, the loading class is put into unresolved list. Note we >> clean pending exception and return null to caller, which in the search >> next will load the instance class. There is no exception in java level >> be caught in TestThread. >> When main ended, we create a JavaThread named 'DestroyJavaVM' and >> give the thread id the current thread id, which is the main thread id. >> Since the All JavaThread object should be freed when this last >> JavaThread exit, I have no idea how the 'DestroyJavaVM' thread saw the >> exception, from the stack trace, the calling begins with >> >> ShutDown.java: >> >> /* The preceding static fields are protected by this lock */ >> private static class Lock { }; >> private static Object lock = new Lock(); >> //<<<------------------------ line 61 >> >> How come the call via agent and call transform? At shutdown time, do >> we need to turn down the request to agent at this time? 
>> >> Thanks >> Yumin >> >> >> java.lang.Exception: Stack trace >> at java.lang.Thread.dumpStack(Thread.java:1329) >> at >> ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) >> >> >> at >> sun.instrument.TransformerManager.transform(TransformerManager.java:188) >> at >> sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) >> >> java.lang.Exception: Stack trace >> at java.lang.Thread.dumpStack(Thread.java:1329) >> at >> ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) >> >> >> at >> sun.instrument.TransformerManager.transform(TransformerManager.java:188) >> at >> sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) >> >> at java.lang.Shutdown.(Shutdown.java:61) >> >> This output in >> >> >> >> >> >> On 10/15/2014 9:58 AM, Yumin Qi wrote: >>> David, >>> >>> I will take another detail trace to see where the exception begins >>> in main thread, it should not thrown in main thread. I only saw it is >>> thrown in TestThread, not main, not DestroyJavaVM. If that happens, >>> maybe something wrong in vm. >>> The output in all 'failed' case (many failed not cause exception >>> output, not caught), the main thread got the exception. That is not >>> right. >>> >>> Thanks >>> Yumin >>> >>> On 10/14/2014 5:28 PM, David Holmes wrote: >>>> Hi Yumin, >>>> >>>> On 15/10/2014 4:40 AM, Yumin Qi wrote: >>>>> David, Thanks for the comment. See embedded. >>>>> >>>>> >>>>> On 10/13/2014 7:30 PM, David Holmes wrote: >>>>>> Hi Yumin, >>>>>> >>>>>> jdk9-dev is not the best place for code review requests. >>>>>> serviceability-dev would be better for this test. >>>>>> >>>>>> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>>>>>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>>>>>> >>>>>>> the bug marked as confidential so post the webrev internally. 
>>>>>> >>>>>> Not any more :) >>>>>> >>>>> Thanks. I changed to non security related bug. Usually when test >>>>> failed, >>>>> a confidential bug is filed. I would like to create bug open if the >>>>> test >>>>> is in open part. >>>>>>> Problem: The test case tries to load a class from the same jar via >>>>>>> agent >>>>>>> in the middle of loading another class from the jar via same class >>>>>>> loader in same thread. The call happens in transform which is a >>>>>>> rare >>>>>>> case --- in middle of loading class, loading another class. The >>>>>>> result >>>>>>> is a CircularityError. When first class is in loading, in vm we put >>>>>>> JarLoader$2 on place holder table, then we start the defineClass, >>>>>>> which >>>>>>> calls transform, begins loading the second class so go along the >>>>>>> same >>>>>>> routine for loading JarLoader$2 first, found it already in >>>>>>> placeholder >>>>>>> table. A CircularityError is thrown. >>>>>>> Fix: The test case should not call loading class with same class >>>>>>> loader >>>>>>> in same thread from same jar in 'transform' method. I modify it >>>>>>> loading >>>>>>> with system class loader and we expect see ClassNotFoundException. >>>>>>> Detail see bug comments. >>>>>> >>>>>> It is not clear to me that the test is incorrect. It is also unclear >>>>>> why such an old test is now failing - we must have changed >>>>>> something. >>>>>> And it's unclear whether what the test does with your change is >>>>>> actually testing what the test wanted to test. 
>>>>>> >>>>>> It seems to me that the actual problem in the test is the >>>>>> reference to >>>>>> the "main" thread ie: >>>>>> >>>>>> if (!tName.equals("main")) >>>>>> >>>>>> The test knows not to do the loading in the main thread, but has >>>>>> overlooked the fact that the main thread, upon the end of main() >>>>>> becomes the DestroyJavaVM thread - and it is that thread which >>>>>> encounters the ClassCircularityError: >>>>>> >>>>>> Starting test with 1000 iterations >>>>>> Thread 'DestroyJavaVM' has called transform() >>>>>> >>>>>> So perhaps the right fix is to expand the above to: >>>>>> >>>>>> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >>>>>> >>>>>> ? I admit I'm having trouble seeing the full picture in this test. >>>>>> >>>>> It is not DestroyJavaVM thread cause CircularityError. It is >>>>> TestThread >>>>> cause CircularityError. >>>> >>>> Not according to the bug report: >>>> >>>> Starting test with 1000 iterationsThread 'DestroyJavaVM' has called >>>> transform() >>>> Thread 'DestroyJavaVM' has called transform() >>>> result=1 >>>> ----------System.err:(14/920)---------- >>>> Exception in thread "main" java.lang.ClassCircularityError: >>>> sun/misc/URLClassPath$JarLoader$2 >>>> >>>> This shows that "main" got the CCE. Which in itself is confusing >>>> given we also report "Thread 'DestroyJavaVM' has called transform()" >>>> and they are in fact the same thread! >>>> >>>> David >>>> ----- >>>> >>>> >>>>> In TestThread (DestroyJavaVM may cause same I think, but not seen in >>>>> debug): >>>>> >>>>> forName("TestClass2", true, classLoader); <---- the loader is >>>>> customer loader which is obtained from agent code. >>>>> -->...... transform(...) >>>>> -->defineClass(...) >>>>> -->...... call into vm, we need to load >>>>> JarLoader$2 >>>>> since JarLoader$1 used >>>>> ->resolve_instance_class_or_null >>>>> // here we create PlaceTableEntry for >>>>> JarLoader$2, put into place holder table >>>>> -->...... 
>>>>> --->forName("TestClass3", true, >>>>> classLoader); >>>>> -->... transform(...) >>>>> -->defineClass(...) >>>>> -->...... call into vm >>>>> again. Now JarLoader$2 is not loaded, but it is in placeholder >>>>> table, so >>>>> throw_circularity_error set and throw. >>>>> ....... >>>>> With custom loader, agent's transform will be called, then it >>>>> loads TestClass3, repeat the same steps as loading TestClass2. The >>>>> problem is JarLoader$2 has not been loaded yet but in place holder >>>>> table >>>>> (this is for checking CircularityError), then begins loading >>>>> TestClass3, >>>>> this is a recursive and embedded case. The non-failed case also saw >>>>> CircularityError thrown, but somehow the test case did not fail. >>>>> Design >>>>> like this will cause call transform in transform which is the reason >>>>> CircularityError thrown. >>>>> >>>>> I have no idea about the original desin of the test case, but >>>>> think >>>>> it should do this. >>>>> >>>>>> >>>>>> Looking at your change, don't leave commented out lines in the code: >>>>>> 115 // ClassLoader loader = >>>>>> ParallelTransformerLoaderAgent.getClassLoader(); >>>>>> 118 //Class.forName("TestClass" + >>>>>> index, true, loader); >>>>>> >>>>> Will remove >>>>> >>>>> Thanks >>>>> Yumin >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks >>>>>>> Yumin * >>>>> >>> >> From david.holmes at oracle.com Wed Oct 22 03:12:52 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 22 Oct 2014 13:12:52 +1000 Subject: RFR: 8038468: java/lang/instrument/ParallelTransformerLoader.sh fails with ClassCircularityError In-Reply-To: <54471BFF.30303@oracle.com> References: <543C591E.8010602@oracle.com> <543C8ADE.5000309@oracle.com> <543D6E18.9060205@oracle.com> <543DBF9F.5050507@oracle.com> <543EA7A8.4030809@oracle.com> <5441F5E2.4050904@oracle.com> <544200A3.2050608@oracle.com> <54471BFF.30303@oracle.com> Message-ID: <544720B4.6080102@oracle.com> On 22/10/2014 12:52 PM, Yumin Qi wrote: > Hi, David and all, 
> > Second webrev here: http://cr.openjdk.java.net/~minqi/8038468/webrev01/ > > Answer to David's question about 'main' and 'DestroyJavaVM'. I still > did not find how when exception printing the stack trace, 'main' was > retrieved but, at the moment JavaThread for "DestroyJavaVM' was created, > 'main' is not dead. They may exist and with same C thread and id. This > may cause we got 'main' not 'DestroyJavaVM'. They are the same native thread. The "main" thread, upon return from main() (the Java app main()) detaches from the VM and re-attaches as the DestroyJavaVMThread. So I don't see how an exception dump can show the thread name as "main". > Loading another class from agent in 'transform' with same custom > class loader is not a good design. We already have two threads loading > from the agent in parallel, TestClass1 in 'main' and TestClass2 in > 'TestThread'. Should avoid loading another class with same agent in > 'transform' in nested. I'm really not sure we're getting to the bottom of this, but I don't understand the original form and purpose of the test. David > > Thanks > Yumin > > On 10/17/2014 10:54 PM, David Holmes wrote: >> Hi Yumin, >> >> Quick response ... when shutdown is initiated the Shutdown class will >> be loaded and initialized: >> >> at java.lang.Shutdown.(Shutdown.java:61) >> >> Presumably this static initialization is what triggers the involvement >> of the agent to do the transform, and hence encounters the exception. >> Though I'm unclear how it still reports "main" as the name when it has >> now become "DestroyJavaVM" >> >> David >> >> On 18/10/2014 3:08 PM, Yumin Qi wrote: >>> David, (cc Karen) >>> >>> I think I got why it throws CircularityError in 'main' thread. >>> The CircularityError thrown in TestThread, which was handled in >>> classloading, the loading class is put into unresolved list. Note we >>> clean pending exception and return null to caller, which in the search >>> next will load the instance class. 
There is no exception in java level >>> be caught in TestThread. >>> When main ended, we create a JavaThread named 'DestroyJavaVM' and >>> give the thread id the current thread id, which is the main thread id. >>> Since the All JavaThread object should be freed when this last >>> JavaThread exit, I have no idea how the 'DestroyJavaVM' thread saw the >>> exception, from the stack trace, the calling begins with >>> >>> ShutDown.java: >>> >>> /* The preceding static fields are protected by this lock */ >>> private static class Lock { }; >>> private static Object lock = new Lock(); >>> //<<<------------------------ line 61 >>> >>> How come the call via agent and call transform? At shutdown time, do >>> we need to turn down the request to agent at this time? >>> >>> Thanks >>> Yumin >>> >>> >>> java.lang.Exception: Stack trace >>> at java.lang.Thread.dumpStack(Thread.java:1329) >>> at >>> ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) >>> >>> >>> at >>> sun.instrument.TransformerManager.transform(TransformerManager.java:188) >>> at >>> sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) >>> >>> java.lang.Exception: Stack trace >>> at java.lang.Thread.dumpStack(Thread.java:1329) >>> at >>> ParallelTransformerLoaderAgent$TestTransformer.transform(ParallelTransformerLoaderAgent.java:92) >>> >>> >>> at >>> sun.instrument.TransformerManager.transform(TransformerManager.java:188) >>> at >>> sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) >>> >>> at java.lang.Shutdown.(Shutdown.java:61) >>> >>> This output in >>> >>> >>> >>> >>> >>> On 10/15/2014 9:58 AM, Yumin Qi wrote: >>>> David, >>>> >>>> I will take another detail trace to see where the exception begins >>>> in main thread, it should not thrown in main thread. I only saw it is >>>> thrown in TestThread, not main, not DestroyJavaVM. If that happens, >>>> maybe something wrong in vm. 
>>>> The output in all 'failed' case (many failed not cause exception >>>> output, not caught), the main thread got the exception. That is not >>>> right. >>>> >>>> Thanks >>>> Yumin >>>> >>>> On 10/14/2014 5:28 PM, David Holmes wrote: >>>>> Hi Yumin, >>>>> >>>>> On 15/10/2014 4:40 AM, Yumin Qi wrote: >>>>>> David, Thanks for the comment. See embedded. >>>>>> >>>>>> >>>>>> On 10/13/2014 7:30 PM, David Holmes wrote: >>>>>>> Hi Yumin, >>>>>>> >>>>>>> jdk9-dev is not the best place for code review requests. >>>>>>> serviceability-dev would be better for this test. >>>>>>> >>>>>>> On 14/10/2014 8:58 AM, Yumin Qi wrote: >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8038468 >>>>>>>> webrev:*http://cr.openjdk.java.net/~minqi/8038468/webrev00/ >>>>>>>> >>>>>>>> the bug marked as confidential so post the webrev internally. >>>>>>> >>>>>>> Not any more :) >>>>>>> >>>>>> Thanks. I changed to non security related bug. Usually when test >>>>>> failed, >>>>>> a confidential bug is filed. I would like to create bug open if the >>>>>> test >>>>>> is in open part. >>>>>>>> Problem: The test case tries to load a class from the same jar via >>>>>>>> agent >>>>>>>> in the middle of loading another class from the jar via same class >>>>>>>> loader in same thread. The call happens in transform which is a >>>>>>>> rare >>>>>>>> case --- in middle of loading class, loading another class. The >>>>>>>> result >>>>>>>> is a CircularityError. When first class is in loading, in vm we put >>>>>>>> JarLoader$2 on place holder table, then we start the defineClass, >>>>>>>> which >>>>>>>> calls transform, begins loading the second class so go along the >>>>>>>> same >>>>>>>> routine for loading JarLoader$2 first, found it already in >>>>>>>> placeholder >>>>>>>> table. A CircularityError is thrown. >>>>>>>> Fix: The test case should not call loading class with same class >>>>>>>> loader >>>>>>>> in same thread from same jar in 'transform' method. 
I modify it >>>>>>>> loading >>>>>>>> with system class loader and we expect see ClassNotFoundException. >>>>>>>> Detail see bug comments. >>>>>>> >>>>>>> It is not clear to me that the test is incorrect. It is also unclear >>>>>>> why such an old test is now failing - we must have changed >>>>>>> something. >>>>>>> And it's unclear whether what the test does with your change is >>>>>>> actually testing what the test wanted to test. >>>>>>> >>>>>>> It seems to me that the actual problem in the test is the >>>>>>> reference to >>>>>>> the "main" thread ie: >>>>>>> >>>>>>> if (!tName.equals("main")) >>>>>>> >>>>>>> The test knows not to do the loading in the main thread, but has >>>>>>> overlooked the fact that the main thread, upon the end of main() >>>>>>> becomes the DestroyJavaVM thread - and it is that thread which >>>>>>> encounters the ClassCircularityError: >>>>>>> >>>>>>> Starting test with 1000 iterations >>>>>>> Thread 'DestroyJavaVM' has called transform() >>>>>>> >>>>>>> So perhaps the right fix is to expand the above to: >>>>>>> >>>>>>> if (!tName.equals("main") && !tName.equals("DestroyJavaVM")) >>>>>>> >>>>>>> ? I admit I'm having trouble seeing the full picture in this test. >>>>>>> >>>>>> It is not DestroyJavaVM thread cause CircularityError. It is >>>>>> TestThread >>>>>> cause CircularityError. >>>>> >>>>> Not according to the bug report: >>>>> >>>>> Starting test with 1000 iterationsThread 'DestroyJavaVM' has called >>>>> transform() >>>>> Thread 'DestroyJavaVM' has called transform() >>>>> result=1 >>>>> ----------System.err:(14/920)---------- >>>>> Exception in thread "main" java.lang.ClassCircularityError: >>>>> sun/misc/URLClassPath$JarLoader$2 >>>>> >>>>> This shows that "main" got the CCE. Which in itself is confusing >>>>> given we also report "Thread 'DestroyJavaVM' has called transform()" >>>>> and they are in fact the same thread! 
>>>>> >>>>> David >>>>> ----- >>>>> >>>>> >>>>>> In TestThread (DestroyJavaVM may cause same I think, but not seen in >>>>>> debug): >>>>>> >>>>>> forName("TestClass2", true, classLoader); <---- the loader is >>>>>> customer loader which is obtained from agent code. >>>>>> -->...... transform(...) >>>>>> -->defineClass(...) >>>>>> -->...... call into vm, we need to load >>>>>> JarLoader$2 >>>>>> since JarLoader$1 used >>>>>> ->resolve_instance_class_or_null >>>>>> // here we create PlaceTableEntry for >>>>>> JarLoader$2, put into place holder table >>>>>> -->...... >>>>>> --->forName("TestClass3", true, >>>>>> classLoader); >>>>>> -->... transform(...) >>>>>> -->defineClass(...) >>>>>> -->...... call into vm >>>>>> again. Now JarLoader$2 is not loaded, but it is in placeholder >>>>>> table, so >>>>>> throw_circularity_error set and throw. >>>>>> ....... >>>>>> With custom loader, agent's transform will be called, then it >>>>>> loads TestClass3, repeat the same steps as loading TestClass2. The >>>>>> problem is JarLoader$2 has not been loaded yet but in place holder >>>>>> table >>>>>> (this is for checking CircularityError), then begins loading >>>>>> TestClass3, >>>>>> this is a recursive and embedded case. The non-failed case also saw >>>>>> CircularityError thrown, but somehow the test case did not fail. >>>>>> Design >>>>>> like this will cause call transform in transform which is the reason >>>>>> CircularityError thrown. >>>>>> >>>>>> I have no idea about the original desin of the test case, but >>>>>> think >>>>>> it should do this. 
>>>>>> >>>>>>> >>>>>>> Looking at your change, don't leave commented out lines in the code: >>>>>>> 115 // ClassLoader loader = >>>>>>> ParallelTransformerLoaderAgent.getClassLoader(); >>>>>>> 118 //Class.forName("TestClass" + >>>>>>> index, true, loader); >>>>>>> >>>>>> Will remove >>>>>> >>>>>> Thanks >>>>>> Yumin >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Thanks >>>>>>>> Yumin * >>>>>> >>>> >>> > From staffan.larsen at oracle.com Wed Oct 22 06:43:43 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 22 Oct 2014 08:43:43 +0200 Subject: system profilers and incomplete stacks In-Reply-To: References: <1F3E1054-9947-43AC-9AD1-350E0174C9A1@oracle.com> <539FD60E.10203@oracle.com> Message-ID: <542BB193-847F-413F-AE53-3EED02A6B9BE@oracle.com> On 22 okt 2014, at 02:10, Brendan Gregg wrote: > G'Day, > > I checked the JDK 9 early access releases, but didn't see anything for > JDK-6276264. That bug has been closed as a duplicate of https://bugs.openjdk.java.net/browse/JDK-6617153, which is still open. > I've also since learned that Twitter has an OpenJDK fork > with frame pointers disabled, for the same purpose: stack profiling > (using Linux perf_events). Might this be worked on for JDK 9? I would welcome that change - perhaps Twitter can contribute it? Thanks, /Staffan > I can > help test. thanks, > > Brendan > > On Mon, Jun 16, 2014 at 11:52 PM, Brendan Gregg > wrote: >> G'Day Serguei, >> >> On Mon, Jun 16, 2014 at 10:45 PM, serguei.spitsyn at oracle.com >> wrote: >>> >>> Hi Brendan, >>> >>> We are aware of these issues and work with the Solaris team to fix them in >>> JDK 9. >>> One is the frame pointer is used by the server compiler as a general >>> purpose register on intel. >>> Another is about the virtual (or inlined) frames. >>> >>> There are a couple of related bugs: >>> https://bugs.openjdk.java.net/browse/JDK-6617153 >>> https://bugs.openjdk.java.net/browse/JDK-6276264 >>> >>> There can be more issues filed on this. 
>> >> >> Ah, thanks, it's JDK-6276264. >> >> As Tom Rodriguez said at the time (2005): "The server VM uses the frame >> pointer as an allocatable register and there's no way to turn that off." I >> was really hoping there was a way to turn that off, like >> -fno-omit-frame-pointer. >> >> This also means DTrace jstack() has never worked fully. For the applications >> I tried it on, 50% of stacks were incomplete. Perhaps it wasn't that bad in >> 2005. I've been getting more mileage today from Java profilers. >> >>> Please, note, that the jstack action is not implemented on Linux yet. >> >> >> Linux doesn't have DTrace jstack(), no, but its perf_events does has support >> for loading an auxiliary file of symbols, which can created via a Java agent >> for that purpose (eg, https://github.com/jrudolph/perf-map-agent). But that >> hasn't been working fully for the same reason - incomplete stacks. >> >> Brendan >> >>> >>> Thanks, >>> Serguei >>> >>> >>> >>> On 6/16/14 5:14 PM, Brendan Gregg wrote: >>> >>> Thanks but no, I'm aware of that bug and workarounds (I'm using the >>> LD_AUDIT_64=/usr/lib/dtrace/64/libdtrace_forceload.so workaround, which >>> isn't mentioned in the bug comments, but probably should be). That bug is >>> about missing symbols, but the stacks shown in that bug still go all the way >>> to thread_start. My stacks often don't. >>> >>> For simple programs, the stacks are complete. But something complex (eg, >>> vert.x with event loops), and the stacks are often incomplete, one frame >>> only. Very much like what I see with -fomit-frame-pointer, although this is >>> hotspot, not gcc. Such incomplete stacks are seen using either DTrace or >>> perf_events. >>> >>> It was suggested to me to email the hotspot developers, because this may >>> well be a hotspot optimization they are familiar with. 
It may also be >>> something really obvious, like that the JVM breaks native stacks due to >>> optimized frames / green threads / etc, and there is absolutely no way >>> around it (no way to disable it). If that's true, it may also mean that the >>> DTrace jstack() action has always had this issue. I'm still reading the >>> source... >>> >>> Brendan >>> >>> >>> >>> On Mon, Jun 16, 2014 at 4:04 AM, Staffan Larsen >>> wrote: >>>> >>>> I think this is the bug you are looking at: >>>> https://bugs.openjdk.java.net/browse/JDK-7187999, but I'll defer to someone >>>> else to confirm. >>>> >>>> /Staffan >>>> >>>> >>>> On 16 jun 2014, at 12:47, Roland Westrelin >>>> wrote: >>>> >>>> Forwarding to serviceability alias where this question belongs I think. >>>> >>>> Begin forwarded message: >>>> >>>> From: Brendan Gregg >>>> Subject: system profilers and incomplete stacks >>>> Date: June 12, 2014 at 7:15:54 PM GMT+2 >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> >>>> G'Day, >>>> >>>> Is there a way to run hotspot so that a system profiler (eg, DTrace, or >>>> Linux perf_events) can measure complete stacks? I often get incomplete, >>>> partial stacks, with one or a few frames only. I'm not worried about symbols >>>> right now, what I'd like is to walk stacks all the way down to thread start. >>>> >>>> I've been browsing the hotspot code, but haven't found out how yet. I >>>> suspect it's related to Java optimized frames, and has ditched the frame >>>> pointer. I was looking for an equivalent -fno-omit-frame-pointer option. >>>> >>>> Here's an example: >>>> >>>> # dtrace -n 'profile-99 /execname == "java"/ { @[jstack(100, 8000)] = >>>> count(); }' >>>> [...]
>>>> org/mozilla/javascript/ >>>> >>>> ScriptableObject.createSlot(Ljava/lang/String;II)Lorg/mozilla/javascript/ScriptableObject$Slot;* >>>> 0x884acce8200002da >>>> 1 >>>> >>>> sun/nio/ch/SocketChannelImpl.read(Ljava/nio/ByteBuffer;)I* >>>> 0xffffffff20007f4b >>>> 1 >>>> >>>> >>>> org/mozilla/javascript/ScriptRuntime.newObjectLiteral([Ljava/lang/Object;[Ljava/lang/Object;[ILorg/mozilla/javascript/Context;Lorg/mozilla/javascript/Scriptable;)Lorg/mozilla/javascript/Scriptable;* >>>> 0xa20000041 >>>> 1 >>>> [...] >>>> >>>> I see similar incomplete stacks with Linux perf_events. Oracle JDKs from >>>> 6 to 8, and OpenJDK. >>>> >>>> thanks, >>>> >>>> Brendan >>>> -- >>>> http://www.brendangregg.com >>>> >>>> >>>> >>> >>> >>> >>> -- >>> http://www.brendangregg.com >>> >>> >> >> >> >> -- >> http://www.brendangregg.com From serguei.spitsyn at oracle.com Wed Oct 22 07:32:48 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 22 Oct 2014 00:32:48 -0700 Subject: system profilers and incomplete stacks In-Reply-To: References: <1F3E1054-9947-43AC-9AD1-350E0174C9A1@oracle.com> <539FD60E.10203@oracle.com> Message-ID: <54475DA0.8090103@oracle.com> Hi Brendan, We are working with the Solaris on prototyping an assisted approach to resolve the issue. In this approach the jhelper.d (the dtrace jstack action provider on VM side) is assisting the DTrace framework to do the stack walking cooperatively. I've assigned the bug JDK-6617153 to myself and will post updates about the progress. Please, let us know if you have any questions. Thanks, Serguei On 10/21/14 5:10 PM, Brendan Gregg wrote: > G'Day, > > I checked the JDK 9 early access releases, but didn't see anything for > JDK-6276264. I've also since learned that Twitter has an OpenJDK fork > with frame pointers disabled, for the same purpose: stack profiling > (using Linux perf_events). Might this be worked on for JDK 9? I can > help test. 
thanks, > > Brendan > > On Mon, Jun 16, 2014 at 11:52 PM, Brendan Gregg > wrote: >> G'Day Serguei, >> >> On Mon, Jun 16, 2014 at 10:45 PM, serguei.spitsyn at oracle.com >> wrote: >>> Hi Brendan, >>> >>> We are aware of these issues and work with the Solaris team to fix them in >>> JDK 9. >>> One is the frame pointer is used by the server compiler as a general >>> purpose register on intel. >>> Another is about the virtual (or inlined) frames. >>> >>> There are a couple of related bugs: >>> https://bugs.openjdk.java.net/browse/JDK-6617153 >>> https://bugs.openjdk.java.net/browse/JDK-6276264 >>> >>> There can be more issues filed on this. >> >> Ah, thanks, it's JDK-6276264. >> >> As Tom Rodriguez said at the time (2005): "The server VM uses the frame >> pointer as an allocatable register and there's no way to turn that off." I >> was really hoping there was a way to turn that off, like >> -fno-omit-frame-pointer. >> >> This also means DTrace jstack() has never worked fully. For the applications >> I tried it on, 50% of stacks were incomplete. Perhaps it wasn't that bad in >> 2005. I've been getting more mileage today from Java profilers. >> >>> Please, note, that the jstack action is not implemented on Linux yet. >> >> Linux doesn't have DTrace jstack(), no, but its perf_events does have support >> for loading an auxiliary file of symbols, which can be created via a Java agent >> for that purpose (eg, https://github.com/jrudolph/perf-map-agent). But that >> hasn't been working fully for the same reason - incomplete stacks. >> >> Brendan >> >>> Thanks, >>> Serguei >>> >>> >>> >>> On 6/16/14 5:14 PM, Brendan Gregg wrote: >>> >>> Thanks but no, I'm aware of that bug and workarounds (I'm using the >>> LD_AUDIT_64=/usr/lib/dtrace/64/libdtrace_forceload.so workaround, which >>> isn't mentioned in the bug comments, but probably should be). That bug is >>> about missing symbols, but the stacks shown in that bug still go all the way >>> to thread_start. My stacks often don't.
>>> >>> For simple programs, the stacks are complete. But something complex (eg, >>> vert.x with event loops), and the stacks are often incomplete, one frame >>> only. Very much like what I see with -fomit-frame-pointer, although this is >>> hotspot, not gcc. Such incomplete stacks are seen using either DTrace or >>> perf_events. >>> >>> It was suggested to me to email the hotspot developers, because this may >>> well be a hotspot optimization they are familiar with. It may also be >>> something really obvious, like that the JVM breaks native stacks due to >>> optimized frames / green threads / etc, and there is absolutely no way >>> around it (no way to disable it). If that's true, it may also mean that the >>> DTrace jstack() action has always had this issue. I'm still reading the >>> source... >>> >>> Brendan >>> >>> >>> >>> On Mon, Jun 16, 2014 at 4:04 AM, Staffan Larsen >>> wrote: >>>> I think this is the bug you are looking at: >>>> https://bugs.openjdk.java.net/browse/JDK-7187999, but I'll defer to someone >>>> else to confirm. >>>> >>>> /Staffan >>>> >>>> >>>> On 16 jun 2014, at 12:47, Roland Westrelin >>>> wrote: >>>> >>>> Forwarding to serviceability alias where this question belongs I think. >>>> >>>> Begin forwarded message: >>>> >>>> From: Brendan Gregg >>>> Subject: system profilers and incomplete stacks >>>> Date: June 12, 2014 at 7:15:54 PM GMT+2 >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> >>>> G'Day, >>>> >>>> Is there a way to run hotspot so that a system profiler (eg, DTrace, or >>>> Linux perf_events) can measure complete stacks? I often get incomplete, >>>> partial stacks, with one or a few frames only. I'm not worried about symbols >>>> right now, what I'd like is to walk stacks all the way down to thread start. >>>> >>>> I've been browsing the hotspot code, but haven't found out how yet. I >>>> suspect it's related to Java optimized frames, and has ditched the frame >>>> pointer.
I was looking for an equivalent -fno-omit-frame-pointer option. >>>> >>>> Here's an example: >>>> >>>> # dtrace -n 'profile-99 /execname == "java"/ { @[jstack(100, 8000)] = >>>> count(); }' >>>> [...] >>>> org/mozilla/javascript/ >>>> >>>> ScriptableObject.createSlot(Ljava/lang/String;II)Lorg/mozilla/javascript/ScriptableObject$Slot;* >>>> 0x884acce8200002da >>>> 1 >>>> >>>> sun/nio/ch/SocketChannelImpl.read(Ljava/nio/ByteBuffer;)I* >>>> 0xffffffff20007f4b >>>> 1 >>>> >>>> >>>> org/mozilla/javascript/ScriptRuntime.newObjectLiteral([Ljava/lang/Object;[Ljava/lang/Object;[ILorg/mozilla/javascript/Context;Lorg/mozilla/javascript/Scriptable;)Lorg/mozilla/javascript/Scriptable;* >>>> 0xa20000041 >>>> 1 >>>> [...] >>>> >>>> I see similar incomplete stacks with Linux perf_events. Oracle JDKs from >>>> 6 to 8, and OpenJDK. >>>> >>>> thanks, >>>> >>>> Brendan >>>> -- >>>> http://www.brendangregg.com >>>> >>>> >>>> >>> >>> >>> -- >>> http://www.brendangregg.com >>> >>> >> >> >> -- >> http://www.brendangregg.com
From markus.gronlund at oracle.com Wed Oct 22 09:43:49 2014 From: markus.gronlund at oracle.com (Markus Grönlund) Date: Wed, 22 Oct 2014 02:43:49 -0700 (PDT) Subject: RFR(L): 8056049: getProcessCpuLoad() stops working in one process when a different process exits Message-ID:

Greetings,

Kindly asking for reviews for the following changeset.

Bug: https://bugs.openjdk.java.net/browse/JDK-8056049
Webrev: http://cr.openjdk.java.net/~mgronlun/8056049/webrev01/

Description:

The issue is Windows specific. The problem relates to using the Performance Data Helper API (PDH), more specifically to how the "Process" PDH object is used in PDH queries:

// code comment extract

/*
 * Working against the Process object and its related counters is inherently problematic
 * when using the PDH API:
 *
 * For PDH, a process is not primarily identified by its process id,
 * but with a sequential number, for example \Process(java#0), \Process(java#1), ....
 * The really bad part is that this list is reset as soon as one process exits:
 * If \Process(java#1) exits, \Process(java#3) now becomes \Process(java#2) etc.
 *
 * The PDH query api requires a process identifier to be submitted when registering
 * a query, but as soon as the list resets, the query is invalidated (since the name
 * changed).
 *
 * Solution:
 * The #number identifier for a Process query can only decrease after process creation.
 *
 * Therefore we create an array of counter queries for all process object instances
 * up to and including ourselves:
 *
 * Ex. we come in as third process instance (java#2), we then create and register
 * queries for the following Process object instances:
 * java#0, java#1, java#2
 *
 * currentQueryIndexForProcess() keeps track of the current "correct" query
 * (in order to keep this index valid when the list resets from underneath,
 * ensure to call getCurrentQueryIndexForProcess() before every query involving
 * Process object instance data).
 */

I have already fixed this in the VM as of https://bugs.openjdk.java.net/browse/JDK-8019921

In the process of fixing this issue now in the JDK, I realized that the previous implementation of using PDH in the JDK was a bit convoluted - especially if you would like to reuse functionality / add new counters.

Therefore this change also includes an overall rewrite of how the JDK interfaces with the PDH library, a rewrite which (hopefully) improves both readability and extensibility.

I can do a code walkthrough live if anyone is interested to know the exact details of this change.

Testing completed : Testset SVC (includes jdk_instrument, jdk_management, jdk_jmx, jdk_jdi)

Thanks in advance
Markus
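As a rough illustration of the index-tracking idea in the code comment above (this is not the JDK or VM code), the following Python sketch simulates a PDH-style instance list whose indices shift down when a process exits. The names `FakePdh` and `ProcessQuerySet`, and the choice to recognize our own instance by reading back its process id, are assumptions made for the example; the real change works against the native PDH API.

```python
class FakePdh:
    """Toy stand-in for the PDH "Process" object (hypothetical, for illustration).

    Instances of one image name are numbered by position (java#0, java#1, ...),
    so the numbers of later instances shift down when an earlier one exits.
    """

    def __init__(self, pids):
        self._pids = list(pids)

    def exit_process(self, pid):
        self._pids.remove(pid)

    def pid_at(self, index):
        """Return the pid currently behind instance #index, or None if absent."""
        return self._pids[index] if index < len(self._pids) else None


class ProcessQuerySet:
    """Keeps one query slot per instance index 0..initial_index.

    Our own index can only decrease after creation, so the correct slot is
    always somewhere at or below the last index we saw.
    """

    def __init__(self, pdh, my_pid, initial_index):
        self._pdh = pdh
        self._my_pid = my_pid
        self._current = initial_index

    def current_query_index(self):
        # Re-validate before every query: scan downward from the last known index.
        for index in range(self._current, -1, -1):
            if self._pdh.pid_at(index) == self._my_pid:
                self._current = index
                return index
        raise LookupError("own process instance not found")


if __name__ == "__main__":
    pdh = FakePdh([100, 200, 300])           # we are pid 300, i.e. java#2
    queries = ProcessQuerySet(pdh, my_pid=300, initial_index=2)
    print(queries.current_query_index())
    pdh.exit_process(200)                     # an earlier instance exits
    print(queries.current_query_index())
```

Because the index is re-validated on every call, a caller never has to be told that another process exited; it only has to ask for the current index before each query, as the comment recommends.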
From volker.simonis at gmail.com Wed Oct 22 20:29:39 2014 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 23 Oct 2014 00:29:39 +0400 Subject: system profilers and incomplete stacks In-Reply-To: <54475DA0.8090103@oracle.com> References: <1F3E1054-9947-43AC-9AD1-350E0174C9A1@oracle.com> <539FD60E.10203@oracle.com> <54475DA0.8090103@oracle.com> Message-ID: Hi, I think the main problem here is not only that compiled code uses the frame pointer as a general purpose register, but the fact that even the VM itself currently cannot reliably take a stack trace of a thread at any arbitrary PC. An external sampling profiler like perf can interrupt the VM at any place (e.g. while the VM sets up a new frame or while inside an adapter). So while keeping the frame pointer alive will definitely help to improve the situation, I'm not sure it will help in every situation. Regards, Volker On 10/22/14, serguei.spitsyn at oracle.com wrote: > Hi Brendan, > > We are working with the Solaris team on prototyping an assisted approach to > resolve the issue. > In this approach the jhelper.d (the dtrace jstack action provider on VM > side) is assisting > the DTrace framework to do the stack walking cooperatively. > > I've assigned the bug JDK-6617153 to myself and will post updates about > the progress. > Please, let us know if you have any questions. > > Thanks, > Serguei > > > On 10/21/14 5:10 PM, Brendan Gregg wrote: >> G'Day, >> >> I checked the JDK 9 early access releases, but didn't see anything for >> JDK-6276264. I've also since learned that Twitter has an OpenJDK fork >> with frame pointers disabled, for the same purpose: stack profiling >> (using Linux perf_events). Might this be worked on for JDK 9? I can >> help test.
thanks, >> >> Brendan >> >> On Mon, Jun 16, 2014 at 11:52 PM, Brendan Gregg >> wrote: >>> G'Day Serguei, >>> >>> On Mon, Jun 16, 2014 at 10:45 PM, serguei.spitsyn at oracle.com >>> wrote: >>>> Hi Brendan, >>>> >>>> We are aware of these issues and work with the Solaris team to fix them >>>> in >>>> JDK 9. >>>> One is the frame pointer is used by the server compiler as a general >>>> purpose register on intel. >>>> Another is about the virtual (or inlined) frames. >>>> >>>> There are a couple of related bugs: >>>> https://bugs.openjdk.java.net/browse/JDK-6617153 >>>> https://bugs.openjdk.java.net/browse/JDK-6276264 >>>> >>>> There can be more issues filed on this. >>> >>> Ah, thanks, it's JDK-6276264. >>> >>> As Tom Rodriguez said at the time (2005): "The server VM uses the frame >>> pointer as an allocatable register and there's no way to turn that off." >>> I >>> was really hoping there was a way to turn that off, like >>> -fno-omit-frame-pointer. >>> >>> This also means DTrace jstack() has never worked fully. For the >>> applications >>> I tried it on, 50% of stacks were incomplete. Perhaps it wasn't that bad >>> in >>> 2005. I've been getting more mileage today from Java profilers. >>> >>>> Please, note, that the jstack action is not implemented on Linux yet. >>> >>> Linux doesn't have DTrace jstack(), no, but its perf_events does have >>> support >>> for loading an auxiliary file of symbols, which can be created via a Java >>> agent >>> for that purpose (eg, https://github.com/jrudolph/perf-map-agent). But >>> that >>> hasn't been working fully for the same reason - incomplete stacks. >>> >>> Brendan >>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> >>>> On 6/16/14 5:14 PM, Brendan Gregg wrote: >>>> >>>> Thanks but no, I'm aware of that bug and workarounds (I'm using the >>>> LD_AUDIT_64=/usr/lib/dtrace/64/libdtrace_forceload.so workaround, which >>>> isn't mentioned in the bug comments, but probably should be).
That bug >>>> is >>>> about missing symbols, but the stacks shown in that bug still go all the >>>> way >>>> to thread_start. My stacks often don't. >>>> >>>> For simple programs, the stacks are complete. But something complex >>>> (eg, >>>> vert.x with event loops), and the stacks are often incomplete, one >>>> frame >>>> only. Very much like what I see with -fomit-frame-pointer, although this >>>> is >>>> hotspot, not gcc. Such incomplete stacks are seen using either DTrace >>>> or >>>> perf_events. >>>> >>>> It was suggested to me to email the hotspot developers, because this >>>> may >>>> well be a hotspot optimization they are familiar with. It may also be >>>> something really obvious, like that the JVM breaks native stacks due to >>>> optimized frames / green threads / etc, and there is absolutely no way >>>> around it (no way to disable it). If that's true, it may also mean that >>>> the >>>> DTrace jstack() action has always had this issue. I'm still reading the >>>> source... >>>> >>>> Brendan >>>> >>>> >>>> >>>> On Mon, Jun 16, 2014 at 4:04 AM, Staffan Larsen >>>> wrote: >>>>> I think this is the bug you are looking at: >>>>> https://bugs.openjdk.java.net/browse/JDK-7187999, but I'll defer to >>>>> someone >>>>> else to confirm. >>>>> >>>>> /Staffan >>>>> >>>>> >>>>> On 16 jun 2014, at 12:47, Roland Westrelin >>>>> >>>>> wrote: >>>>> >>>>> Forwarding to serviceability alias where this question belongs I >>>>> think. >>>>> >>>>> Begin forwarded message: >>>>> >>>>> From: Brendan Gregg >>>>> Subject: system profilers and incomplete stacks >>>>> Date: June 12, 2014 at 7:15:54 PM GMT+2 >>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>> >>>>> G'Day, >>>>> >>>>> Is there a way to run hotspot so that a system profiler (eg, DTrace, >>>>> or >>>>> Linux perf_events) can measure complete stacks? I often get >>>>> incomplete, >>>>> partial stacks, with one or a few frames only.
I'm not worried about >>>>> symbols >>>>> right now, what I'd like is to walk stacks all the way down to thread >>>>> start. >>>>> >>>>> I've been browsing the hotspot code, but haven't found out how yet. I >>>>> suspect it's related to Java optimized frames, and has ditched the >>>>> frame >>>>> pointer. I was looking for an equivalent -fno-omit-frame-pointer >>>>> option. >>>>> >>>>> Here's an example: >>>>> >>>>> # dtrace -n 'profile-99 /execname == "java"/ { @[jstack(100, 8000)] = >>>>> count(); }' >>>>> [...] >>>>> org/mozilla/javascript/ >>>>> >>>>> ScriptableObject.createSlot(Ljava/lang/String;II)Lorg/mozilla/javascript/ScriptableObject$Slot;* >>>>> 0x884acce8200002da >>>>> 1 >>>>> >>>>> >>>>> sun/nio/ch/SocketChannelImpl.read(Ljava/nio/ByteBuffer;)I* >>>>> 0xffffffff20007f4b >>>>> 1 >>>>> >>>>> >>>>> org/mozilla/javascript/ScriptRuntime.newObjectLiteral([Ljava/lang/Object;[Ljava/lang/Object;[ILorg/mozilla/javascript/Context;Lorg/mozilla/javascript/Scriptable;)Lorg/mozilla/javascript/Scriptable;* >>>>> 0xa20000041 >>>>> 1 >>>>> [...] >>>>> >>>>> I see similar incomplete stacks with Linux perf_events. Oracle JDKs >>>>> from >>>>> 6 to 8, and OpenJDK. 
>>>>> >>>>> thanks, >>>>> >>>>> Brendan >>>>> -- >>>>> http://www.brendangregg.com >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> http://www.brendangregg.com >>>> >>>> >>> >>> >>> -- >>> http://www.brendangregg.com > > From jaroslav.bachorik at oracle.com Thu Oct 23 09:45:29 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 23 Oct 2014 11:45:29 +0200 Subject: RFR 8058506: ThreadMXBeanStateTest throws exception In-Reply-To: <54469C35.8090005@oracle.com> References: <5444EE20.6090803@oracle.com> <54465F27.3080204@oracle.com> <54469C35.8090005@oracle.com> Message-ID: <5448CE39.704@oracle.com> On 10/21/2014 07:47 PM, Jaroslav Bachorik wrote: > On 10/21/2014 03:27 PM, Erik Gahlin wrote: >> Have you considered creating a LogMessage class that keeps the logCntr >> value and the log message, instead of putting the counter into the log >> string and parsing it. > > Yes. And didn't go that way in order to prevent creating a lot of > throwaway stringbuilder instances (the Formatter works that way) - but > it might (almost certainly) be a premature optimization. I will clean it > up and resubmit the request. http://cr.openjdk.java.net/~jbachorik/8058506/webrev.01 I moved the log collection logic to a separate class available in the testlib now. The code is much more concise now. -JB- > > -JB- > >> >> Seems simpler and easier to understand. >> >> Erik >> >> Jaroslav Bachorik skrev 2014-10-20 13:12: >>> Please, review the following test change >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-8058506 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/8058506/webrev.00 >>> >>> The test fails intermittently due to the log printing blocking the >>> test thread from time to time, resulting in incorrect data reported by >>> ThreadMXBean. >>> >>> The solution is to use per-thread non-blocking StringBuilders (wrapped >>> in Formatter instances) and aggregate the log output only after the >>> test is finished. 
>>> >>> Thanks, >>> >>> -JB- >> > From erik.gahlin at oracle.com Thu Oct 23 13:31:25 2014 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 23 Oct 2014 15:31:25 +0200 Subject: RFR 8058506: ThreadMXBeanStateTest throws exception In-Reply-To: <5448CE39.704@oracle.com> References: <5444EE20.6090803@oracle.com> <54465F27.3080204@oracle.com> <54469C35.8090005@oracle.com> <5448CE39.704@oracle.com> Message-ID: <5449032D.5070208@oracle.com> Looks good! Erik Jaroslav Bachorik skrev 2014-10-23 11:45: > On 10/21/2014 07:47 PM, Jaroslav Bachorik wrote: >> On 10/21/2014 03:27 PM, Erik Gahlin wrote: >>> Have you considered creating a LogMessage class that keeps the logCntr >>> value and the log message, instead of putting the counter into the log >>> string and parsing it. >> >> Yes. And didn't go that way in order to prevent creating a lot of >> throwaway stringbuilder instances (the Formatter works that way) - but >> it might (almost certainly) be a premature optimization. I will clean it >> up and resubmit the request. > > http://cr.openjdk.java.net/~jbachorik/8058506/webrev.01 > > I moved the log collection logic to a separate class available in the > testlib now. The code is much more concise now. > > -JB- > >> >> -JB- >> >>> >>> Seems simpler and easier to understand. >>> >>> Erik >>> >>> Jaroslav Bachorik skrev 2014-10-20 13:12: >>>> Please, review the following test change >>>> >>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8058506 >>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8058506/webrev.00 >>>> >>>> The test fails intermittently due to the log printing blocking the >>>> test thread from time to time, resulting in incorrect data reported by >>>> ThreadMXBean. >>>> >>>> The solution is to use per-thread non-blocking StringBuilders (wrapped >>>> in Formatter instances) and aggregate the log output only after the >>>> test is finished. 
>>>> >>>> Thanks, >>>> >>>> -JB- >>> >> > From peter.allwin at oracle.com Fri Oct 24 13:39:41 2014 From: peter.allwin at oracle.com (Peter Allwin) Date: Fri, 24 Oct 2014 15:39:41 +0200 Subject: RFR: 8024055: serviceability/attach/AttachWithStalePidFile.java createJavaPidFile() fails Message-ID: <74D4DD20-76FD-4E21-A3FC-29D6D7CC4A9E@oracle.com> Hello! This patch fixes two intermittent issues seen over the past year: a) Possible failure where an existing pid-file is not owned by the test user b) Race during startup where we try to attach to the target before it's ready (removed arbitrary 5sec sleep) Bug: https://bugs.openjdk.java.net/browse/JDK-8024055 Webrev: http://cr.openjdk.java.net/~allwin/8024055/webrev.00/ Tested locally on my Mac. Thanks! /peter From jaroslav.bachorik at oracle.com Fri Oct 24 16:44:14 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 24 Oct 2014 18:44:14 +0200 Subject: RFR Message-ID: <544A81DE.8060009@oracle.com> Please, review this change to the test library. Issue : https://bugs.openjdk.java.net/browse/JDK-8062070 Webrev: http://cr.openjdk.java.net/~jbachorik/8062070/webrev.00 The test started failing after the process.destroyForcibly() was added to ProcessTools.executeProcess() to make sure there were no orphaned processes once the test was finished. The destruction call was placed in the finally block and was called for running or terminated processes indiscriminately. It turns out that Process.get*Stream() methods will get confused when calling Process.destroy/Forcibly/() on an already exited process - the streams will get closed and any attempt to read or write to them will end with a SocketException. The failure is timing related - when the stream manages to buffer data before destroying the process from another thread the test passes. Otherwise it just fails.
The solution is to forcibly close the external process (started by ProcessTools.executeProcess()) only in cases when eg. Process.waitFor() throws an exception - moving it from the finally block to the catch block. Thanks, -JB- From jaroslav.bachorik at oracle.com Fri Oct 24 17:06:54 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 24 Oct 2014 19:06:54 +0200 Subject: RFR 8062070: com/sun/jdi/DoubleAgentTest.java.DoubleAgentTest fails intermittently after 8056143 Message-ID: <544A872E.8050301@oracle.com> Please, review this change to the test library. Issue : https://bugs.openjdk.java.net/browse/JDK-8062070 Webrev: http://cr.openjdk.java.net/~jbachorik/8062070/webrev.00 The test started failing after the process.destroyForcibly() was added to ProcessTools.executeProcess() to make sure there were no orphaned processes once the test was finished. The destruction call was placed to the finally block and was called for running or terminated processes indiscriminately. It turns out that Process.get*Stream() methods will get confused when calling Process.destroy/Forcibly/() on an already exited process - the streams will get closed and any attempt to read or write to them will end with a SocketException. The failure is timing related - when the stream manages to buffer data before destroying the process from another thread the test passes. Otherwise it just fails. The solution is to forcibly close the external process (started by ProcessTools.executeProcess()) only in cases when eg. Process.waitFor() throws an exception - moving it from the finally block to the catch block. Thanks, -JB- From daniel.daugherty at oracle.com Fri Oct 24 23:06:47 2014 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 24 Oct 2014 17:06:47 -0600 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <542BC45C.8080408@oracle.com> References: <542BC45C.8080408@oracle.com> Message-ID: <544ADB87.7080909@oracle.com> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: > Hello! > > The tests that continue to fail with wrong exit codes suggest that the > fix for JDK-8057744 wasn't sufficient. > Here's another proposal, which expands the synchronized portion of the > code. > It is proposed to make the exiting process wait for the threads that > have already started exiting. > This should help to make sure that no thread is executing any > potentially racy code concurrently with the exiting process. > > BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 > WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ Finally got a chance to look at the official version of fix. Thumbs up! src/os/windows/vm/os_windows.cpp No comments. Dan P.S. We had another sighting of an exit_code == 60115 test failure this past week so while your previous fix greatly reduced the odds of this race, I'm looking forward to seeing this new version in action... > > Comments, suggestion are welcome! > > Sincerely yours, > Ivan From ivan.gerasimov at oracle.com Sat Oct 25 18:23:36 2014 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Sat, 25 Oct 2014 22:23:36 +0400 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544ADB87.7080909@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> Message-ID: <544BEAA8.5010600@oracle.com> On 25.10.2014 3:06, Daniel D. Daugherty wrote: > On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >> Hello! >> >> The tests that continue to fail with wrong exit codes suggest that >> the fix for JDK-8057744 wasn't sufficient. >> Here's another proposal, which expands the synchronized portion of >> the code. 
>> It is proposed to make the exiting process wait for the threads that >> have already started exiting. >> This should help to make sure that no thread is executing any >> potentially racy code concurrently with the exiting process. >> >> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ > > Finally got a chance to look at the official version of fix. > > Thumbs up! > > src/os/windows/vm/os_windows.cpp > No comments. > Thank you Daniel! I assume the change needs the second hotspot reviewer? What would be the best time for pushing this fix? Sincerely yours, Ivan > Dan > > P.S. > We had another sighting of an exit_code == 60115 test failure > this past week so while your previous fix greatly reduced the > odds of this race, I'm looking forward to seeing this new > version in action... > > > >> >> Comments, suggestion are welcome! >> >> Sincerely yours, >> Ivan > > > From daniel.daugherty at oracle.com Sun Oct 26 15:01:29 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Sun, 26 Oct 2014 09:01:29 -0600 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544BEAA8.5010600@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> Message-ID: <544D0CC9.3040903@oracle.com> On 10/25/14 12:23 PM, Ivan Gerasimov wrote: > > On 25.10.2014 3:06, Daniel D. Daugherty wrote: >> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>> Hello! >>> >>> The tests that continue to fail with wrong exit codes suggest that >>> the fix for JDK-8057744 wasn't sufficient. >>> Here's another proposal, which expands the synchronized portion of >>> the code. >>> It is proposed to make the exiting process wait for the threads that >>> have already started exiting. >>> This should help to make sure that no thread is executing any >>> potentially racy code concurrently with the exiting process. 
>>> >>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >> >> Finally got a chance to look at the official version of fix. >> >> Thumbs up! >> >> src/os/windows/vm/os_windows.cpp >> No comments. >> > Thank you Daniel! > > I assume the change needs the second hotspot reviewer? Yes, HotSpot changes always need two reviewers. David Holmes chimed in on this thread. You should ask him if he can be counted as a reviewer. > What would be the best time for pushing this fix? Let's go for Wednesday again so we have a full week of testing to evaluate this latest tweak. Dan > > Sincerely yours, > Ivan > >> Dan >> >> P.S. >> We had another sighting of an exit_code == 60115 test failure >> this past week so while your previous fix greatly reduced the >> odds of this race, I'm looking forward to seeing this new >> version in action... >> >> >> >>> >>> Comments, suggestion are welcome! >>> >>> Sincerely yours, >>> Ivan >> >> >> > From ivan.gerasimov at oracle.com Sun Oct 26 15:15:32 2014 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Sun, 26 Oct 2014 19:15:32 +0400 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544D0CC9.3040903@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> Message-ID: <544D1014.10600@oracle.com> David, would you approve this fix? Sincerely yours, Ivan On 26.10.2014 19:01, Daniel D. Daugherty wrote: > On 10/25/14 12:23 PM, Ivan Gerasimov wrote: >> >> On 25.10.2014 3:06, Daniel D. Daugherty wrote: >>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>>> Hello! >>>> >>>> The tests that continue to fail with wrong exit codes suggest that >>>> the fix for JDK-8057744 wasn't sufficient. >>>> Here's another proposal, which expands the synchronized portion of >>>> the code. 
>>>> It is proposed to make the exiting process wait for the threads >>>> that have already started exiting. >>>> This should help to make sure that no thread is executing any >>>> potentially racy code concurrently with the exiting process. >>>> >>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >>> >>> Finally got a chance to look at the official version of fix. >>> >>> Thumbs up! >>> >>> src/os/windows/vm/os_windows.cpp >>> No comments. >>> >> Thank you Daniel! >> >> I assume the change needs the second hotspot reviewer? > > Yes, HotSpot changes always need two reviewers. David Holmes > chimed in on this thread. You should ask him if he can be > counted as a reviewer. > > >> What would be the best time for pushing this fix? > > Let's go for Wednesday again so we have a full week of testing > to evaluate this latest tweak. > > Dan > > >> >> Sincerely yours, >> Ivan >> >>> Dan >>> >>> P.S. >>> We had another sighting of an exit_code == 60115 test failure >>> this past week so while your previous fix greatly reduced the >>> odds of this race, I'm looking forward to seeing this new >>> version in action... >>> >>> >>> >>>> >>>> Comments, suggestion are welcome! >>>> >>>> Sincerely yours, >>>> Ivan >>> >>> >>> >> > > > From david.holmes at oracle.com Sun Oct 26 22:53:03 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 27 Oct 2014 08:53:03 +1000 Subject: RFR 8062070: com/sun/jdi/DoubleAgentTest.java.DoubleAgentTest fails intermittently after 8056143 In-Reply-To: <544A872E.8050301@oracle.com> References: <544A872E.8050301@oracle.com> Message-ID: <544D7B4F.6000201@oracle.com> On 25/10/2014 3:06 AM, Jaroslav Bachorik wrote: > Please, review this change to the test library. 
> > Issue : https://bugs.openjdk.java.net/browse/JDK-8062070 > Webrev: http://cr.openjdk.java.net/~jbachorik/8062070/webrev.00 > > The test started failing after the process.destroyForcibly() was added > to ProcessTools.executeProcess() to make sure there were no orphaned > processes once the test was finished. The destruction call was placed to > the finally block and was called for running or terminated processes > indiscriminately. > > It turns out that Process.get*Stream() methods will get confused when > calling Process.destroy/Forcibly/() on an already exited process - the > streams will get closed and any attempt to read or write to them will > end with a SocketException. > > The failure is timing related - when the stream manages to buffer data > before destroying the process from another thread the test passes. > Otherwise it just fails. > > The solution is to forcibly close the external process (started by > ProcessTools.executeProcess()) only in cases when eg. Process.waitFor() > throws an exception - moving it from the finally block to the catch block. I can't help but think we are simply shifting the failure window around. But the proof of this is in the testing. Reviewed. David > Thanks, > > -JB- From david.holmes at oracle.com Sun Oct 26 23:36:43 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 27 Oct 2014 09:36:43 +1000 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544D1014.10600@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> Message-ID: <544D858B.5050308@oracle.com> On 27/10/2014 1:15 AM, Ivan Gerasimov wrote: > David, would you approve this fix? Sorry Ivan I'm having trouble following the logic this time - could you add some comments about what we are checking at each step. Also we seem to exit while still holding the critical section - how does that work? 
Thanks, David > Sincerely yours, > Ivan > > On 26.10.2014 19:01, Daniel D. Daugherty wrote: >> On 10/25/14 12:23 PM, Ivan Gerasimov wrote: >>> >>> On 25.10.2014 3:06, Daniel D. Daugherty wrote: >>>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>>>> Hello! >>>>> >>>>> The tests that continue to fail with wrong exit codes suggest that >>>>> the fix for JDK-8057744 wasn't sufficient. >>>>> Here's another proposal, which expands the synchronized portion of >>>>> the code. >>>>> It is proposed to make the exiting process wait for the threads >>>>> that have already started exiting. >>>>> This should help to make sure that no thread is executing any >>>>> potentially racy code concurrently with the exiting process. >>>>> >>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >>>> >>>> Finally got a chance to look at the official version of fix. >>>> >>>> Thumbs up! >>>> >>>> src/os/windows/vm/os_windows.cpp >>>> No comments. >>>> >>> Thank you Daniel! >>> >>> I assume the change needs the second hotspot reviewer? >> >> Yes, HotSpot changes always need two reviewers. David Holmes >> chimed in on this thread. You should ask him if he can be >> counted as a reviewer. >> >> >>> What would be the best time for pushing this fix? >> >> Let's go for Wednesday again so we have a full week of testing >> to evaluate this latest tweak. >> >> Dan >> >> >>> >>> Sincerely yours, >>> Ivan >>> >>>> Dan >>>> >>>> P.S. >>>> We had another sighting of an exit_code == 60115 test failure >>>> this past week so while your previous fix greatly reduced the >>>> odds of this race, I'm looking forward to seeing this new >>>> version in action... >>>> >>>> >>>> >>>>> >>>>> Comments, suggestion are welcome! 
>>>>> >>>>> Sincerely yours, >>>>> Ivan >>>> >>>> >>>> >>> >> >> >> > From david.holmes at oracle.com Sun Oct 26 23:48:23 2014 From: david.holmes at oracle.com (David Holmes) Date: Mon, 27 Oct 2014 09:48:23 +1000 Subject: RFR: 8024055: serviceability/attach/AttachWithStalePidFile.java createJavaPidFile() fails In-Reply-To: <74D4DD20-76FD-4E21-A3FC-29D6D7CC4A9E@oracle.com> References: <74D4DD20-76FD-4E21-A3FC-29D6D7CC4A9E@oracle.com> Message-ID: <544D8847.5040101@oracle.com> Hi Peter, On 24/10/2014 11:39 PM, Peter Allwin wrote: > Hello! > > This patch fixes two intermittent issues seen over the past year: > > a) Possible failure where an existing pid-file is not owned by the > test user > b) Race during startup where we try to attach to the target before > it's ready (removed arbitrary 5sec sleep) > > Bug: https://bugs.openjdk.java.net/browse/JDK-8024055 > Webrev: http://cr.openjdk.java.net/~allwin/8024055/webrev.00/ test/serviceability/attach/AttachWithStalePidFile.java Why use an Executor and a Future instead of simply processing the process output directly? David > Tested locally on my Mac. > > Thanks! > /peter From ivan.gerasimov at oracle.com Mon Oct 27 08:35:42 2014 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Mon, 27 Oct 2014 12:35:42 +0400 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544D858B.5050308@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> Message-ID: <544E03DE.4000308@oracle.com> On 27.10.2014 3:36, David Holmes wrote: > On 27/10/2014 1:15 AM, Ivan Gerasimov wrote: >> David, would you approve this fix? > > Sorry Ivan I'm having trouble following the logic this time - could > you add some comments about what we are checking at each step. Yes, sure.
The main idea is to make the thread that ends the process wait for the threads that have already started exiting. Thus, we have an array for storing the thread handles.
Any thread that is on the thread-exit path first tries to remove the completed threads from the array (to keep the list smaller), and then adds its own handle to the end of the array.
The thread that is on the process-exit path calls exit (or _exit) while still owning the critical section. This way we make sure no other threads execute any exit-related code at the same time.

Here's a typical scenario:
1) First thread that decided to end itself calls exit_process_or_thread() -- let's assume it is on thread-exit path. Initializes the critical section.
2) Grabs the ownership of the crit. section
3) The list of thread handles is initially empty, so the thread adds a duplicate of its handle to the array.
4) Releases the crit. section
5) Calls _endthreadex() to terminate itself
6) Another thread enters exit_process_or_thread() -- let it be on thread-exit path as well.
7) Grabs the ownership of the crit. section
8) In a loop checks if any previously ended thread has completed. Here we call WaitForSingleObject with zero timeout, so we don't block. All the handles of completed threads are closed.
9) If there is a free slot in the array, the thread adds its handle to the end
10) If the array is full (which is very unlikely), the thread waits for ANY thread to complete, and then adds itself to the array.
11) Releases the crit. section
12) Calls _endthreadex() to terminate itself
13) Some thread enters exit_process_or_thread() in order to end the whole process.
14) Grabs the ownership of the crit. section
15) Waits on all the threads that have added their handles to the array (typically there will be only one such thread handle). Since the ownership of the critical section is held, no other threads will execute any exit-related code at this time.
16) Once all the threads from the list have completed, the thread closes the handles and calls exit() (or _exit()), holding the crit. section ownership.

We're done.

Error handling: in case of errors, we report them and proceed with exiting as usual.
- If initialization of the critical section fails, we just call the corresponding exit routine.
- If waiting for an exiting thread to complete fails, we close its handle as if it had completed.
- If waiting for any thread to complete within the time-out fails (array is full), we close all the handles and continue as if no threads had exited before.
- If we can't duplicate the handle, we ignore it (don't add it to the array), so no one will wait for it later.
- If the thread on the process-exit path fails to wait for the threads to complete within the time-out, we proceed to the exit anyway.

None of these errors should happen during normal execution, but if they do, we still try to end threads/process the way it's done now.
In this latter case, we are at risk of observing a race condition.
However, the chances of this happening are much lower, and in addition we'll have a warning message to analyze.

Possible bottlenecks.
1) All the threads have to obtain the ownership of the critical section, which effectively serializes all the exiting threads. However, this doesn't appear to make things much slower, as all the threads already do a similar thing in _endthreadex().
2) Normally, the threads don't block while holding ownership of the crit. section. A block can only happen if there's no free slot in the array of handles. This can only happen if MAX_EXIT_HANDLES (== 16) threads have just called _endthreadex(), and none of them has completed.
3) When the thread on the process-exit path waits for all the exiting threads to complete, a time-out of 1 second is specified. If any of those threads does not complete, the application can be delayed at exit.
However, we don't block forever, and the delay can only be observed upon a failure.

> Also we seem to exit while still holding the critical section - how
> does that work?
>
Right.
We make the thread on the process-exit path call exit() from within the critical section block.
This way it is ensured that no other exit-related code is executed at the same moment, and a race is avoided.

Sincerely yours,
Ivan

> Thanks,
> David
>
>> Sincerely yours,
>> Ivan
>>
>> On 26.10.2014 19:01, Daniel D. Daugherty wrote:
>>> On 10/25/14 12:23 PM, Ivan Gerasimov wrote:
>>>>
>>>> On 25.10.2014 3:06, Daniel D. Daugherty wrote:
>>>>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote:
>>>>>> Hello!
>>>>>>
>>>>>> The tests that continue to fail with wrong exit codes suggest that
>>>>>> the fix for JDK-8057744 wasn't sufficient.
>>>>>> Here's another proposal, which expands the synchronized portion of
>>>>>> the code.
>>>>>> It is proposed to make the exiting process wait for the threads
>>>>>> that have already started exiting.
>>>>>> This should help to make sure that no thread is executing any
>>>>>> potentially racy code concurrently with the exiting process.
>>>>>>
>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533
>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/
>>>>>
>>>>> Finally got a chance to look at the official version of fix.
>>>>>
>>>>> Thumbs up!
>>>>>
>>>>> src/os/windows/vm/os_windows.cpp
>>>>> No comments.
>>>>>
>>>> Thank you Daniel!
>>>>
>>>> I assume the change needs the second hotspot reviewer?
>>>
>>> Yes, HotSpot changes always need two reviewers. David Holmes
>>> chimed in on this thread. You should ask him if he can be
>>> counted as a reviewer.
>>>
>>>
>>>> What would be the best time for pushing this fix?
>>>
>>> Let's go for Wednesday again so we have a full week of testing
>>> to evaluate this latest tweak.
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>>> Dan
>>>>>
>>>>> P.S.
>>>>> We had another sighting of an exit_code == 60115 test failure >>>>> this past week so while your previous fix greatly reduced the >>>>> odds of this race, I'm looking forward to seeing this new >>>>> version in action... >>>>> >>>>> >>>>> >>>>>> >>>>>> Comments, suggestion are welcome! >>>>>> >>>>>> Sincerely yours, >>>>>> Ivan >>>>> >>>>> >>>>> >>>> >>> >>> >>> >> > > From jaroslav.bachorik at oracle.com Mon Oct 27 09:31:09 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 27 Oct 2014 10:31:09 +0100 Subject: RFR: 8024055: serviceability/attach/AttachWithStalePidFile.java createJavaPidFile() fails In-Reply-To: <74D4DD20-76FD-4E21-A3FC-29D6D7CC4A9E@oracle.com> References: <74D4DD20-76FD-4E21-A3FC-29D6D7CC4A9E@oracle.com> Message-ID: <544E10DD.5070706@oracle.com> Hi Peter, On 10/24/2014 03:39 PM, Peter Allwin wrote: > Hello! > > This patch fixes two intermittent issues seen over the past year: > > a) Possible failure where an existing pid-file is not owned by the > test user > b) Race during startup where we try to attach to the target before > it's ready (removed arbitrary 5sec sleep) > > Bug: https://bugs.openjdk.java.net/browse/JDK-8024055 > Webrev: http://cr.openjdk.java.net/~allwin/8024055/webrev.00/ test/serviceability/attach/AttachWithStalePidFile.java --- Couldn't you use ProcessTools.startProcess(name, processBuilder, readyPredicate) to start the test process and make sure it prints a "ready" line before continuing? test/serviceability/attach/AttachWithStalePidFileTarget.java --- Instead of waiting here for a really long time you could block on reading from stdin. The driver application would then just send a shutdown message over a pipe when it is safe for the test application to die. -JB- > > Tested locally on my Mac. > > Thanks! 
> /peter From alex.schenkman at oracle.com Mon Oct 27 14:52:56 2014 From: alex.schenkman at oracle.com (Alex Schenkman) Date: Mon, 27 Oct 2014 15:52:56 +0100 Subject: RFR: JDK-8062137, JDK-8062136 Message-ID: <544E5C48.2070302@oracle.com> Please review these two excluded tests. http://cr.openjdk.java.net/~miauno/8062136_8062137/webrev.00/ Thank you! -- Alex Schenkman Java VM SQE Stockholm From mattias.tobiasson at oracle.com Mon Oct 27 15:21:58 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Mon, 27 Oct 2014 08:21:58 -0700 (PDT) Subject: RFR 8061960: TestDaemonThread.java regularly fails due to exceeded timeout Message-ID: Hi, Could someone please review this simple fix? The current version times out because the timeout option is at the wrong position on the command line. Bug: https://bugs.openjdk.java.net/browse/JDK-8061960 Webrev: http://cr.openjdk.java.net/~miauno/8060165/webrev.00/ Thanks, Mattias From jaroslav.bachorik at oracle.com Mon Oct 27 17:44:19 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 27 Oct 2014 18:44:19 +0100 Subject: RFR 8061960: TestDaemonThread.java regularly fails due to exceeded timeout In-Reply-To: References: Message-ID: <544E8473.9070707@oracle.com> Thumbs up! I wonder whether there are still some other tests with similarly displaced switches... -JB- On 10/27/2014 04:21 PM, Mattias Tobiasson wrote: > Hi, > Could someone please review this simple fix. > The current version times out, because the timeout option is at the wrong position on the command line. 
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061960 > > Webrev: > http://cr.openjdk.java.net/~miauno/8060165/webrev.00/ > > Thanks, > Mattias > From jaroslav.bachorik at oracle.com Mon Oct 27 17:46:30 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 27 Oct 2014 18:46:30 +0100 Subject: RFR: JDK-8062137, JDK-8062136 In-Reply-To: <544E5C48.2070302@oracle.com> References: <544E5C48.2070302@oracle.com> Message-ID: <544E84F6.3070204@oracle.com> Looks good! -JB- On 10/27/2014 03:52 PM, Alex Schenkman wrote: > Please review two these excluded tests. > > http://cr.openjdk.java.net/~miauno/8062136_8062137/webrev.00/ > > > Thank you! > > -- > Alex Schenkman > Java VM SQE Stockholm > From jaroslav.bachorik at oracle.com Mon Oct 27 18:00:27 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 27 Oct 2014 19:00:27 +0100 Subject: RFR 8062070: com/sun/jdi/DoubleAgentTest.java.DoubleAgentTest fails intermittently after 8056143 In-Reply-To: <544D7B4F.6000201@oracle.com> References: <544A872E.8050301@oracle.com> <544D7B4F.6000201@oracle.com> Message-ID: <544E883B.4020703@oracle.com> On 10/26/2014 11:53 PM, David Holmes wrote: > On 25/10/2014 3:06 AM, Jaroslav Bachorik wrote: >> Please, review this change to the test library. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8062070 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8062070/webrev.00 >> >> The test started failing after the process.destroyForcibly() was added >> to ProcessTools.executeProcess() to make sure there were no orphaned >> processes once the test was finished. The destruction call was placed to >> the finally block and was called for running or terminated processes >> indiscriminately. >> >> It turns out that Process.get*Stream() methods will get confused when >> calling Process.destroy/Forcibly/() on an already exited process - the >> streams will get closed and any attempt to read or write to them will >> end with a SocketException. 
>> >> The failure is timing related - when the stream manages to buffer data >> before destroying the process from another thread the test passes. >> Otherwise it just fails. >> >> The solution is to forcibly close the external process (started by >> ProcessTools.executeProcess()) only in cases when eg. Process.waitFor() >> throws an exception - moving it from the finally block to the catch >> block. > > I can't help but think we are simply shifting the failure window around. > But the proof of this is in the testing. I don't know whether we can do any better while keeping the post-condition of having the process exited before leaving this method. But with this change we would hit this problem in cases when the test would be already failing anyway. I ran JPRT on all the available platforms and the test didn't fail once. Thanks for the reviews. -JB- > > Reviewed. > > David > > >> Thanks, >> >> -JB- From david.holmes at oracle.com Tue Oct 28 03:06:13 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 28 Oct 2014 13:06:13 +1000 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <544E03DE.4000308@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> Message-ID: <544F0825.7040905@oracle.com> Thanks for the explanation Ivan, but I don't see any comments added to the code. So this latest fix all hinges on whether the part of the exit logic that corrupts the process exit value, happens before or after the logic that will cause a thread waiting on a terminating thread's handle to unblock. If after then we still have the race - but at least the exiting thread has a slight head start over the process terminating thread. 
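For readers following the scheme under review, here is a minimal Java model of its bookkeeping, offered as a sketch and not as the actual os_windows.cpp change. It assumes a plain Java lock in place of the Windows critical section, Thread.isAlive() in place of WaitForSingleObject with a zero timeout, and Thread.join(timeout) in place of the timed wait on thread handles. The class ExitCoordinator and its methods are hypothetical names. A Java model cannot reproduce the Windows-specific ordering question raised in this message; it only shows who waits for whom.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical Java model of the wait-before-exit scheme; not HotSpot code.
class ExitCoordinator {
    static final int MAX_EXIT_HANDLES = 16;   // array size quoted in the thread
    static final long TIMEOUT_MS = 1000;      // the 1 second time-out

    private final Object lock = new Object(); // plays the role of the critical section
    private final List<Thread> exiting = new ArrayList<>();

    // Thread-exit path: a thread calls this as its last action before returning.
    void registerExitingThread() throws InterruptedException {
        synchronized (lock) {
            // Non-blocking sweep of threads that have already completed.
            exiting.removeIf(t -> !t.isAlive());
            // List full: wait (bounded) until any registered thread completes.
            long deadline = System.currentTimeMillis() + TIMEOUT_MS;
            while (exiting.size() >= MAX_EXIT_HANDLES
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(1);
                exiting.removeIf(t -> !t.isAlive());
            }
            if (exiting.size() >= MAX_EXIT_HANDLES) {
                exiting.clear(); // give up and continue, as in the error handling notes
            }
            exiting.add(Thread.currentThread());
        }
    }

    // Process-exit path: wait for the registered threads, then run the exit
    // action while still holding the lock.
    void exitProcess(Runnable exitAction) throws InterruptedException {
        synchronized (lock) {
            long deadline = System.currentTimeMillis() + TIMEOUT_MS;
            for (Thread t : exiting) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // timed out: proceed to the exit anyway
                }
                t.join(remaining);
            }
            exiting.clear();
            exitAction.run();
        }
    }

    // Small scenario: three workers register and end, then the "process" exits.
    static boolean runDemo() {
        ExitCoordinator c = new ExitCoordinator();
        CountDownLatch registered = new CountDownLatch(3);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            Thread t = new Thread(() -> {
                try {
                    c.registerExitingThread();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                registered.countDown();
            });
            workers.add(t);
            t.start();
        }
        boolean[] allFinished = {false};
        try {
            registered.await();
            c.exitProcess(() ->
                allFinished[0] = workers.stream().noneMatch(Thread::isAlive));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return allFinished[0];
    }

    public static void main(String[] args) {
        System.out.println("registered threads finished before exit: " + runDemo());
    }
}
```

The exit action is passed in as a Runnable so the model can be exercised without terminating the JVM; a real caller would pass something like `() -> System.exit(code)`.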
Thanks, David On 27/10/2014 6:35 PM, Ivan Gerasimov wrote: > > On 27.10.2014 3:36, David Holmes wrote: >> On 27/10/2014 1:15 AM, Ivan Gerasimov wrote: >>> David, would you approve this fix? >> >> Sorry Ivan I'm having trouble following the logic this time - could >> you add some comments about what we are checking at each step. > > Yes, sure. > > The main idea is to make the thread that ends the process wait for the > threads that had finished so far. > Thus, we have an array for storing the thread handles. > Any thread that is on thread-exit path, first tries to remove the > completed threads from the array (to keep the list smaller), and then > adds its own handle to the end of the array. > The thread that is on process-exit path, calls exit (or _exit), while > still owning the critical section. > This way we make sure, no other threads execute any exit-related code at > the same time. > > Here's a typical scenario: > 1) First thread that decided to end itself calls > exit_process_or_thread() -- let's assume it is on thread-exit path. > Initializes the critical section. > 2) Grabs the ownership of the crit. section > 3) The list of thread handles is initially empty, so the thread adds a > duplicate of its handle to the array. > 4) Releases the crit. section > 5) Calls _endthreadex() to terminate itself > > 6) Another thread enters exit_process_or_thread() -- let it be on > thread-exit path as well. > 7) Grabs the ownership of the crit. section > 8) In a loop checks if any previously ended thread has completed. > Here we call WaitForSingleObject with zero timeout, so we don't block. > All the handles of completed threads are closed. > 9) If there's is a free slot in the array, the thread adds its handle to > the end > 10) If the array is full (which is very unlikely), the thread waits for > ANY thread to complete, and then adds itself to the array. > 11) Releases the crit. 
section > 12) Calls _endthreadex() to terminate itself > > 13) Some thread enters exit_process_or_thread() in order to end the > whole process. > 14) Grabs the ownership of the crit. section > 15) Waits on all the threads that have added their handles to the array > (typically there will be only one such thread handle). > Since the ownership of the critical section is held, no other threads > will execute any exit-related code at this time. > 16) Once all the threads from the list have completed, the thread closes > the handles and calls exit() (or _exit()), holding the crit. section > ownership. > > We're done. > > Error handling: in a case of errors, we report them, and proceed with > exiting as usual. > - If initialization of critical section fails, we'll just call the > corresponding exit routine. > - If we failed, waiting for an exiting thread to complete, close its > handle as if it has completed. > - If we failed, waiting for any thread to complete withing a time-out > (array is full), close all the handles and continue as if there were no > threads exited before. > - If we couldn't duplicate the handle, ignore it (don't add it to the > array), so no one will wait for it later. > - If the thread on the process-exit path failed to wait for the threads > to complete withing the time-out, proceed to the exit anyway. > > All these errors should never happen during normal execution, but if > they do, we still try to end threads/process in a way it's done now. > In this, later case, we are at risk of observing a race condition. > However, the chances of this happening are much lesser, and in addition > we'll have a waring message to analyze. > > Possible bottlenecks. > 1) All the threads have to obtain the ownership of the critical section, > which effectively serializes all the exiting threads. > However, this doesn't appear to make things too much slower, as all the > threads already do similar thing in _endthreadex(). 
> 2) Normally, the threads don't block having ownership of the crit. section. > The block can only happen if there's no free slot in the array of handles. > This can only happen if MAX_EXIT_HANDLES (== 16) threads have just > called _endthreadex(), and none of them completed. > 3) When the thread at process-exit path waits for all the exiting > threads to complete, the time-out of 1 second is specified. > If any of those threads do not complete, this can lead to that the > application is delayed at the exit. > However, we don't block forever, and the delay can only be observed upon > a failure. > > >> Also we seem to exit while still holding the critical section - how >> does that work? >> > Right. > We make the thread at the process-exit path call exit() from withing > critical section block. > This way it is ensured no other exit-related code is executed at the > same moment, and a race is avoided. > > Sincerely yours, > Ivan > >> Thanks, >> David >> >>> Sincerely yours, >>> Ivan >>> >>> On 26.10.2014 19:01, Daniel D. Daugherty wrote: >>>> On 10/25/14 12:23 PM, Ivan Gerasimov wrote: >>>>> >>>>> On 25.10.2014 3:06, Daniel D. Daugherty wrote: >>>>>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>>>>>> Hello! >>>>>>> >>>>>>> The tests that continue to fail with wrong exit codes suggest that >>>>>>> the fix for JDK-8057744 wasn't sufficient. >>>>>>> Here's another proposal, which expands the synchronized portion of >>>>>>> the code. >>>>>>> It is proposed to make the exiting process wait for the threads >>>>>>> that have already started exiting. >>>>>>> This should help to make sure that no thread is executing any >>>>>>> potentially racy code concurrently with the exiting process. >>>>>>> >>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >>>>>> >>>>>> Finally got a chance to look at the official version of fix. >>>>>> >>>>>> Thumbs up! 
>>>>>> >>>>>> src/os/windows/vm/os_windows.cpp >>>>>> No comments. >>>>>> >>>>> Thank you Daniel! >>>>> >>>>> I assume the change needs the second hotspot reviewer? >>>> >>>> Yes, HotSpot changes always need two reviewers. David Holmes >>>> chimed in on this thread. You should ask him if he can be >>>> counted as a reviewer. >>>> >>>> >>>>> What would be the best time for pushing this fix? >>>> >>>> Let's go for Wednesday again so we have a full week of testing >>>> to evaluate this latest tweak. >>>> >>>> Dan >>>> >>>> >>>>> >>>>> Sincerely yours, >>>>> Ivan >>>>> >>>>>> Dan >>>>>> >>>>>> P.S. >>>>>> We had another sighting of an exit_code == 60115 test failure >>>>>> this past week so while your previous fix greatly reduced the >>>>>> odds of this race, I'm looking forward to seeing this new >>>>>> version in action... >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Comments, suggestion are welcome! >>>>>>> >>>>>>> Sincerely yours, >>>>>>> Ivan >>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>>> >>> >> >> > From mattias.tobiasson at oracle.com Tue Oct 28 11:01:33 2014 From: mattias.tobiasson at oracle.com (Mattias Tobiasson) Date: Tue, 28 Oct 2014 04:01:33 -0700 (PDT) Subject: RFR 8061960: TestDaemonThread.java regularly fails due to exceeded timeout Message-ID: <5bea6f43-7c9d-4811-8a79-617ea83d6e25@default> Thanks for the review Jaroslav! Could you please sponsor this patch and submit it? repository /jdk9/dev/jdk I have made a simple grep search in /jdk/test/... and /hotspot/test/... and have not found any other test with the same error. Mattias ----- Original Message ----- From: jaroslav.bachorik at oracle.com To: serviceability-dev at openjdk.java.net Sent: Monday, October 27, 2014 6:45:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna Subject: Re: RFR 8061960: TestDaemonThread.java regularly fails due to exceeded timeout Thumbs up! I wonder whether there are still some other tests with similarly displaced switches... 
-JB- On 10/27/2014 04:21 PM, Mattias Tobiasson wrote: > Hi, > Could someone please review this simple fix. > The current version times out, because the timeout option is at the wrong position on the command line. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8061960 > > Webrev: > http://cr.openjdk.java.net/~miauno/8060165/webrev.00/ > > Thanks, > Mattias > -------------- next part -------------- A non-text attachment was scrubbed... Name: 8061960.diff Type: text/x-patch Size: 917 bytes Desc: not available URL: From alex.schenkman at oracle.com Tue Oct 28 14:21:27 2014 From: alex.schenkman at oracle.com (Alex Schenkman) Date: Tue, 28 Oct 2014 15:21:27 +0100 Subject: Fwd: RFR: JDK-8062137, JDK-8062136 In-Reply-To: <544E5C48.2070302@oracle.com> References: <544E5C48.2070302@oracle.com> Message-ID: <544FA667.5050001@oracle.com> Jaroslav, Could you please push the attached changes for me? Thank you! -------- Original Message -------- Subject: RFR: JDK-8062137, JDK-8062136 Date: Mon, 27 Oct 2014 15:52:56 +0100 From: Alex Schenkman Organization: Oracle Svenska AB To: serviceability-dev at openjdk.java.net Please review two these excluded tests. http://cr.openjdk.java.net/~miauno/8062136_8062137/webrev.00/ Thank you! -- Alex Schenkman Java VM SQE Stockholm -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 8062136-8062137.patch Type: text/x-patch Size: 1121 bytes Desc: not available URL:

From serguei.spitsyn at oracle.com Wed Oct 29 01:11:30 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 28 Oct 2014 18:11:30 -0700
Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
Message-ID: <54503EC2.6070601@oracle.com>

Please, review the fix for:
https://bugs.openjdk.java.net/browse/JDK-6988950

Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/

Summary:

The failing scenario:
The debugger and the debuggee are well aware that a VM shutdown has been started in the target process.
The debugger at this point is not expected to send any commands to the JDWP agent.
However, the JDI layer (debugger side) and the jdwp agent (debuggee side) are not in sync with the consumer layers.

One reason is that the test debugger does not invoke the JDI method VirtualMachine.dispose().
Another reason is that the debugger and the debuggee processes are not easy to synchronize in general.

As a result the following steps are possible:
- The test debugger sends a 'quit' command to the test debuggee
- The debuggee is normally exiting
- The jdwp backend reports (over the jdwp protocol) an anonymous class unload event
- The JDI InternalEventHandler thread handles the ClassUnloadEvent event
- The InternalEventHandler wants to uncache the matching reference type. If there is more than one class with the same host class signature, it can't distinguish them, and so it deletes all references and re-retrieves them (see tracing below):
  MY_TRACE: JDI: VirtualMachineImpl.retrieveClassesBySignature: sig=Ljava/lang/invoke/LambdaForm$DMH;
- The jdwp backend debugLoop_run() gets the command from JDI and calls the functions classesForSignature() and classStatus() recursively.
- The classStatus() makes a call to the JVMTI GetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE
- As a result the jdwp backend reports the JVMTI error to the JDI, and so, the test fails

For details, see the analysis in the bug report closed as a dup of the bug 6988950:
https://bugs.openjdk.java.net/browse/JDK-8024865

Some similar cases can be found in the two bug reports (6988950 and 8024865) describing this issue.

The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as it is normal at the VM shutdown.
The original jdwp backend implementation had a similar approach for the raw monitor functions.
They use the ignore_vm_death() to work around the JVMTI_ERROR_WRONG_PHASE errors.
For reference, please, see the file: src/share/back/util.c

Testing: Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests

Thanks,
Serguei

From sergei.kovalev at oracle.com Wed Oct 29 08:45:37 2014
From: sergei.kovalev at oracle.com (Sergei Kovalev)
Date: Wed, 29 Oct 2014 11:45:37 +0300
Subject: RFR(S): 8060707: jdwp accept invalid address ':'
In-Reply-To: <206DB8DD-AAC1-4AE1-A8B3-52EFD750A7C2@oracle.com>
References: <544F788B.9050605@oracle.com> <544F8FA3.9050403@oracle.com> <544FA49A.3060700@oracle.com> <544FC9BD.7080106@oracle.com> <544FCE84.7080602@oracle.com> <206DB8DD-AAC1-4AE1-A8B3-52EFD750A7C2@oracle.com>
Message-ID: <5450A931.40406@oracle.com>

Hi Team,

Could you please review the following patch that fixes the tests.

Bug: https://bugs.openjdk.java.net/browse/JDK-8060707
Webrev: http://cr.openjdk.java.net/~iignatyev/skovalev/8060707/webrev.00/

Problem:
A test failed because of the wrong assumption that stderr always starts with the "ERROR" string in case of an error.

Cause:
On some embedded platforms (ARM-SFLT), when we provide the VM option "-XX:NativeMemoryTracking=detail", the VM prints out a warning first: "/NMT detail is not supported on this platform. Using NMT summary instead./" and then prints the error message.
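
The failure mode described in the Problem and Cause paragraphs can be shown with a small Java sketch. The sample stderr text, the class name StderrErrorCheck and its helper methods are made up for illustration; only the warning text and the "^ERROR: transport error " pattern come from this message, and whether the actual test uses Pattern.MULTILINE is an assumption. With that flag, ^ matches at the start of every line, so the error line is found even when a warning is printed first.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not the code of the actual test.
class StderrErrorCheck {
    // Made-up sample: a platform warning followed by the real error line.
    static final String SAMPLE_STDERR =
        "NMT detail is not supported on this platform. Using NMT summary instead.\n"
        + "ERROR: transport error (details elided)\n";

    // Fragile check: assumes the error is the very first thing on stderr.
    static boolean startsWithError(String stderr) {
        return stderr.startsWith("ERROR");
    }

    // Tolerant check: MULTILINE makes ^ match at the start of each line.
    static final Pattern ERROR_LINE =
        Pattern.compile("^ERROR: transport error ", Pattern.MULTILINE);

    static boolean containsErrorLine(String stderr) {
        return ERROR_LINE.matcher(stderr).find();
    }

    public static void main(String[] args) {
        System.out.println("fragile check:  " + startsWithError(SAMPLE_STDERR));
        System.out.println("tolerant check: " + containsErrorLine(SAMPLE_STDERR));
    }
}
```

On the sample above the fragile check misses the error because of the leading warning, while the line-anchored search still finds it.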

Solution:
Look up the "^ERROR: transport error " pattern in the whole message, not only at the very beginning.

Testing done:
The test was launched manually on the host where it initially failed. Different sets of VM options were used to get different kinds of output.

--
With best regards,
Sergei

From alex.schenkman at oracle.com Wed Oct 29 08:50:07 2014
From: alex.schenkman at oracle.com (Alex Schenkman)
Date: Wed, 29 Oct 2014 09:50:07 +0100
Subject: RFR: JDK-8062135; serviceability/threads/TestFalseDeadLock.java should be quarantined
Message-ID: <5450AA3F.4030309@oracle.com>

Please review this excluded test:
http://cr.openjdk.java.net/~miauno/8062135/webrev.00/

Thank you!

--
Alex Schenkman
Java VM SQE Stockholm

From jaroslav.bachorik at oracle.com Wed Oct 29 08:50:48 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 29 Oct 2014 09:50:48 +0100
Subject: RFR(S): 8060707: jdwp accept invalid address ':'
In-Reply-To: <5450A931.40406@oracle.com>
References: <544F788B.9050605@oracle.com> <544F8FA3.9050403@oracle.com> <544FA49A.3060700@oracle.com> <544FC9BD.7080106@oracle.com> <544FCE84.7080602@oracle.com> <206DB8DD-AAC1-4AE1-A8B3-52EFD750A7C2@oracle.com> <5450A931.40406@oracle.com>
Message-ID: <5450AA68.6010508@oracle.com>

Looks good!

-JB-

On 10/29/2014 09:45 AM, Sergei Kovalev wrote:
> Hi Team,
>
> Could you please review following patch that fix the tests.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8060707
> Webrev: http://cr.openjdk.java.net/~iignatyev/skovalev/8060707/webrev.00/
>
> Problem:
> A test failed because of wrong assumption that stderr always starts with
> "ERROR" string in case of error
>
> Cause:
> On some embedded platforms (ARM-SFLT) in case we provide VM option:
> "-XX:NativeMemoryTracking=detail" VM print out a warning first: "/NMT
> detail is not supported on this platform.
Using NMT summary instead./"
> ant then printout error message.
>
> Solution:
> Look up "^ERROR: transport error " pattern in whole message, not only in
> the very begin.
>
> Testing done:
> Test launched manually on the host where it initially failed. Used
> different set of VM options to get different kind of output.
>
> --
> With best regards,
> Sergei
>

From ivan.gerasimov at oracle.com Wed Oct 29 09:24:25 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Wed, 29 Oct 2014 12:24:25 +0300
Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win]
In-Reply-To: <544F0825.7040905@oracle.com>
References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> <544F0825.7040905@oracle.com>
Message-ID: <5450B249.2020603@oracle.com>

Thanks David for the comments!

On 28.10.2014 6:06, David Holmes wrote:
> Thanks for the explanation Ivan, but I don't see any comments added to
> the code.
>
I've updated the webrev to include comments that explain the logic of the workaround:
http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/

Would you please take a look to see if they look sensible?

> So this latest fix all hinges on whether the part of the exit logic
> that corrupts the process exit value, happens before or after the
> logic that will cause a thread waiting on a terminating thread's
> handle to unblock. If after then we still have the race - but at least
> the exiting thread has a slight head start over the process
> terminating thread.
>
I have a couple of thoughts about this:

First, the thread that ends the process waits for all the exiting threads to complete.
By the time it returns from WaitForMultipleObjects, those exiting threads aren't running anymore, so the race is avoided (unless it's a race with the scheduler).

Second, the previous attempt, which relied on abandoning mutexes, reduced the frequency of this bug's occurrences.
The latest proposal extends the portion of synchronized code (I've checked that abandoning the mutex owned by an exiting thread normally happens before WaitFor(... exiting thread...) returns).
Thus, if the race happened during this time window, it should be eliminated now.

Sincerely yours,
Ivan

> Thanks,
> David
>
> On 27/10/2014 6:35 PM, Ivan Gerasimov wrote:
>>
>> On 27.10.2014 3:36, David Holmes wrote:
>>> On 27/10/2014 1:15 AM, Ivan Gerasimov wrote:
>>>> David, would you approve this fix?
>>>
>>> Sorry Ivan I'm having trouble following the logic this time - could
>>> you add some comments about what we are checking at each step.
>>
>> Yes, sure.
>>
>> The main idea is to make the thread that ends the process wait for the
>> threads that had finished so far.
>> Thus, we have an array for storing the thread handles.
>> Any thread that is on thread-exit path, first tries to remove the
>> completed threads from the array (to keep the list smaller), and then
>> adds its own handle to the end of the array.
>> The thread that is on process-exit path, calls exit (or _exit), while
>> still owning the critical section.
>> This way we make sure, no other threads execute any exit-related code at
>> the same time.
>>
>> Here's a typical scenario:
>> 1) First thread that decided to end itself calls
>> exit_process_or_thread() -- let's assume it is on thread-exit path.
>> Initializes the critical section.
>> 2) Grabs the ownership of the crit. section
>> 3) The list of thread handles is initially empty, so the thread adds a
>> duplicate of its handle to the array.
>> 4) Releases the crit. section
>> 5) Calls _endthreadex() to terminate itself
>>
>> 6) Another thread enters exit_process_or_thread() -- let it be on
>> thread-exit path as well.
>> 7) Grabs the ownership of the crit. section
>> 8) In a loop checks if any previously ended thread has completed.
>> Here we call WaitForSingleObject with zero timeout, so we don't block. >> All the handles of completed threads are closed. >> 9) If there's is a free slot in the array, the thread adds its handle to >> the end >> 10) If the array is full (which is very unlikely), the thread waits for >> ANY thread to complete, and then adds itself to the array. >> 11) Releases the crit. section >> 12) Calls _endthreadex() to terminate itself >> >> 13) Some thread enters exit_process_or_thread() in order to end the >> whole process. >> 14) Grabs the ownership of the crit. section >> 15) Waits on all the threads that have added their handles to the array >> (typically there will be only one such thread handle). >> Since the ownership of the critical section is held, no other threads >> will execute any exit-related code at this time. >> 16) Once all the threads from the list have completed, the thread closes >> the handles and calls exit() (or _exit()), holding the crit. section >> ownership. >> >> We're done. >> >> Error handling: in a case of errors, we report them, and proceed with >> exiting as usual. >> - If initialization of critical section fails, we'll just call the >> corresponding exit routine. >> - If we failed, waiting for an exiting thread to complete, close its >> handle as if it has completed. >> - If we failed, waiting for any thread to complete withing a time-out >> (array is full), close all the handles and continue as if there were no >> threads exited before. >> - If we couldn't duplicate the handle, ignore it (don't add it to the >> array), so no one will wait for it later. >> - If the thread on the process-exit path failed to wait for the threads >> to complete withing the time-out, proceed to the exit anyway. >> >> All these errors should never happen during normal execution, but if >> they do, we still try to end threads/process in a way it's done now. >> In this, later case, we are at risk of observing a race condition. 
>> However, the chances of this happening are much lesser, and in addition >> we'll have a waring message to analyze. >> >> Possible bottlenecks. >> 1) All the threads have to obtain the ownership of the critical section, >> which effectively serializes all the exiting threads. >> However, this doesn't appear to make things too much slower, as all the >> threads already do similar thing in _endthreadex(). >> 2) Normally, the threads don't block having ownership of the crit. >> section. >> The block can only happen if there's no free slot in the array of >> handles. >> This can only happen if MAX_EXIT_HANDLES (== 16) threads have just >> called _endthreadex(), and none of them completed. >> 3) When the thread at process-exit path waits for all the exiting >> threads to complete, the time-out of 1 second is specified. >> If any of those threads do not complete, this can lead to that the >> application is delayed at the exit. >> However, we don't block forever, and the delay can only be observed upon >> a failure. >> >> >>> Also we seem to exit while still holding the critical section - how >>> does that work? >>> >> Right. >> We make the thread at the process-exit path call exit() from withing >> critical section block. >> This way it is ensured no other exit-related code is executed at the >> same moment, and a race is avoided. >> >> Sincerely yours, >> Ivan >> >>> Thanks, >>> David >>> >>>> Sincerely yours, >>>> Ivan >>>> >>>> On 26.10.2014 19:01, Daniel D. Daugherty wrote: >>>>> On 10/25/14 12:23 PM, Ivan Gerasimov wrote: >>>>>> >>>>>> On 25.10.2014 3:06, Daniel D. Daugherty wrote: >>>>>>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>>>>>>> Hello! >>>>>>>> >>>>>>>> The tests that continue to fail with wrong exit codes suggest that >>>>>>>> the fix for JDK-8057744 wasn't sufficient. >>>>>>>> Here's another proposal, which expands the synchronized portion of >>>>>>>> the code. 
>>>>>>>> It is proposed to make the exiting process wait for the threads >>>>>>>> that have already started exiting. >>>>>>>> This should help to make sure that no thread is executing any >>>>>>>> potentially racy code concurrently with the exiting process. >>>>>>>> >>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >>>>>>> >>>>>>> Finally got a chance to look at the official version of fix. >>>>>>> >>>>>>> Thumbs up! >>>>>>> >>>>>>> src/os/windows/vm/os_windows.cpp >>>>>>> No comments. >>>>>>> >>>>>> Thank you Daniel! >>>>>> >>>>>> I assume the change needs the second hotspot reviewer? >>>>> >>>>> Yes, HotSpot changes always need two reviewers. David Holmes >>>>> chimed in on this thread. You should ask him if he can be >>>>> counted as a reviewer. >>>>> >>>>> >>>>>> What would be the best time for pushing this fix? >>>>> >>>>> Let's go for Wednesday again so we have a full week of testing >>>>> to evaluate this latest tweak. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Sincerely yours, >>>>>> Ivan >>>>>> >>>>>>> Dan >>>>>>> >>>>>>> P.S. >>>>>>> We had another sighting of an exit_code == 60115 test failure >>>>>>> this past week so while your previous fix greatly reduced the >>>>>>> odds of this race, I'm looking forward to seeing this new >>>>>>> version in action... >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Comments, suggestion are welcome! 
>>>>>>>> >>>>>>>> Sincerely yours, >>>>>>>> Ivan >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >> > > From dmitry.samersoff at oracle.com Wed Oct 29 10:06:25 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 29 Oct 2014 13:06:25 +0300 Subject: RFR(S): 8060707: jdwp accept invalid address ':' In-Reply-To: <5450A931.40406@oracle.com> References: <544F788B.9050605@oracle.com> <544F8FA3.9050403@oracle.com> <544FA49A.3060700@oracle.com> <544FC9BD.7080106@oracle.com> <544FCE84.7080602@oracle.com> <206DB8DD-AAC1-4AE1-A8B3-52EFD750A7C2@oracle.com> <5450A931.40406@oracle.com> Message-ID: <5450BC21.9050101@oracle.com> Looks good to me. -Dmitry On 2014-10-29 11:45, Sergei Kovalev wrote: > Hi Team, > > Could you please review the following patch that fixes the test. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8060707 > Webrev: http://cr.openjdk.java.net/~iignatyev/skovalev/8060707/webrev.00/ > > Problem: > A test failed because of the wrong assumption that stderr always starts with > the "ERROR" string in case of an error. > > Cause: > On some embedded platforms (ARM-SFLT), when the VM option > "-XX:NativeMemoryTracking=detail" is provided, the VM prints a warning first: "/NMT > detail is not supported on this platform. Using NMT summary instead./" > and then prints the error message. > > Solution: > Look up the "^ERROR: transport error " pattern in the whole message, not only at > the very beginning. > > Testing done: > The test was launched manually on the host where it initially failed, using > different sets of VM options to get different kinds of output. > > -- > With best regards, > Sergei > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources.
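The solution described in the review request above (searching for the "^ERROR: transport error " pattern anywhere in stderr rather than only at its start) could look roughly like the sketch below. This is not the code from the webrev; the class and method names are made up for illustration, and the text after "transport error " is a placeholder. The only assumption about the platform behaviour is the one stated in the message: a warning line may precede the error line.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the actual test.
final class TransportErrorCheck {
    // MULTILINE makes '^' match at the start of every line,
    // not only at the start of the whole stderr text.
    private static final Pattern TRANSPORT_ERROR =
            Pattern.compile("^ERROR: transport error ", Pattern.MULTILINE);

    // Returns true if any line of stderr begins with the expected error prefix.
    static boolean hasTransportError(String stderr) {
        return TRANSPORT_ERROR.matcher(stderr).find();
    }

    public static void main(String[] args) {
        // Warning text as quoted in the message; the error details are a placeholder.
        String stderr =
                "NMT detail is not supported on this platform. Using NMT summary instead.\n"
                + "ERROR: transport error <details elided>\n";
        System.out.println(hasTransportError(stderr));
    }
}
```

A plain startsWith("ERROR") check would reject the output above because the warning comes first, which matches the failure described for ARM-SFLT.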
From david.holmes at oracle.com Wed Oct 29 10:15:18 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 29 Oct 2014 20:15:18 +1000 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <5450B249.2020603@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> <544F0825.7040905@oracle.com> <5450B249.2020603@oracle.com> Message-ID: <5450BE36.5000405@oracle.com> HI Ivan, On 29/10/2014 7:24 PM, Ivan Gerasimov wrote: > Thanks David for the comments! > > On 28.10.2014 6:06, David Holmes wrote: >> Thanks for the explanation Ivan, but I don't see any comments added to >> the code. >> > I've updated the webrev to include the comment that explain the logic of > the workaround > http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/ > > Would you please take a look to see if they look sensible? Great improvement thanks! One further suggestion - after line 3731 add a short overview eg: // Basic approach: // - Each exiting thread registers its intent to exit and then does so. // - A thread trying to terminate the process must wait for all // threads currently exiting to complete their exit Typo: 3802 // _endthreadex() comleted. >> So this latest fix all hinges on whether the part of the exit logic >> that corrupts the process exit value, happens before or after the >> logic that will cause a thread waiting on a terminating thread's >> handle to unblock. If after then we still have the race - but at least >> the exiting thread has a slight head start over the process >> terminating thread. >> > > I have a couple of thought about this: > > First, the thread that ends the process waits for all the exiting > threads to complete. 
By the time it returns from WaitForMultipleObjects, > those exiting threads aren't running anymore, so the race is avoided > (unless it's the race with scheduler). That's not quite true and the key part of my point. At some point during the thread termination process the exiting thread has to signal any waiting thread - and it is still running at the point. We don't know whether the action that causes interference with the process termination happens before or after the signalling is done. Anyway this narrows the window even further even if it may not close it completely. Cheers, David > Second, the previous attempt, which utilized abandoning mutexes, reduced > the frequency of this bug occurrences. > The latest proposal extends the portion of synchronized code (I've > checked that abandoning the mutex owned by an exiting thread normally > happens before WaitFor(... exiting thread...) returns). > Thus, if the race had happened during this time window, it should be > eliminated now. > > Sincerely yours, > Ivan > >> Thanks, >> David >> >> On 27/10/2014 6:35 PM, Ivan Gerasimov wrote: >>> >>> On 27.10.2014 3:36, David Holmes wrote: >>>> On 27/10/2014 1:15 AM, Ivan Gerasimov wrote: >>>>> David, would you approve this fix? >>>> >>>> Sorry Ivan I'm having trouble following the logic this time - could >>>> you add some comments about what we are checking at each step. >>> >>> Yes, sure. >>> >>> The main idea is to make the thread that ends the process wait for the >>> threads that had finished so far. >>> Thus, we have an array for storing the thread handles. >>> Any thread that is on thread-exit path, first tries to remove the >>> completed threads from the array (to keep the list smaller), and then >>> adds its own handle to the end of the array. >>> The thread that is on process-exit path, calls exit (or _exit), while >>> still owning the critical section. >>> This way we make sure, no other threads execute any exit-related code at >>> the same time. 
>>> >>> Here's a typical scenario: >>> 1) First thread that decided to end itself calls >>> exit_process_or_thread() -- let's assume it is on thread-exit path. >>> Initializes the critical section. >>> 2) Grabs the ownership of the crit. section >>> 3) The list of thread handles is initially empty, so the thread adds a >>> duplicate of its handle to the array. >>> 4) Releases the crit. section >>> 5) Calls _endthreadex() to terminate itself >>> >>> 6) Another thread enters exit_process_or_thread() -- let it be on >>> thread-exit path as well. >>> 7) Grabs the ownership of the crit. section >>> 8) In a loop checks if any previously ended thread has completed. >>> Here we call WaitForSingleObject with zero timeout, so we don't block. >>> All the handles of completed threads are closed. >>> 9) If there's is a free slot in the array, the thread adds its handle to >>> the end >>> 10) If the array is full (which is very unlikely), the thread waits for >>> ANY thread to complete, and then adds itself to the array. >>> 11) Releases the crit. section >>> 12) Calls _endthreadex() to terminate itself >>> >>> 13) Some thread enters exit_process_or_thread() in order to end the >>> whole process. >>> 14) Grabs the ownership of the crit. section >>> 15) Waits on all the threads that have added their handles to the array >>> (typically there will be only one such thread handle). >>> Since the ownership of the critical section is held, no other threads >>> will execute any exit-related code at this time. >>> 16) Once all the threads from the list have completed, the thread closes >>> the handles and calls exit() (or _exit()), holding the crit. section >>> ownership. >>> >>> We're done. >>> >>> Error handling: in a case of errors, we report them, and proceed with >>> exiting as usual. >>> - If initialization of critical section fails, we'll just call the >>> corresponding exit routine. 
>>> - If we failed, waiting for an exiting thread to complete, close its >>> handle as if it has completed. >>> - If we failed, waiting for any thread to complete withing a time-out >>> (array is full), close all the handles and continue as if there were no >>> threads exited before. >>> - If we couldn't duplicate the handle, ignore it (don't add it to the >>> array), so no one will wait for it later. >>> - If the thread on the process-exit path failed to wait for the threads >>> to complete withing the time-out, proceed to the exit anyway. >>> >>> All these errors should never happen during normal execution, but if >>> they do, we still try to end threads/process in a way it's done now. >>> In this, later case, we are at risk of observing a race condition. >>> However, the chances of this happening are much lesser, and in addition >>> we'll have a waring message to analyze. >>> >>> Possible bottlenecks. >>> 1) All the threads have to obtain the ownership of the critical section, >>> which effectively serializes all the exiting threads. >>> However, this doesn't appear to make things too much slower, as all the >>> threads already do similar thing in _endthreadex(). >>> 2) Normally, the threads don't block having ownership of the crit. >>> section. >>> The block can only happen if there's no free slot in the array of >>> handles. >>> This can only happen if MAX_EXIT_HANDLES (== 16) threads have just >>> called _endthreadex(), and none of them completed. >>> 3) When the thread at process-exit path waits for all the exiting >>> threads to complete, the time-out of 1 second is specified. >>> If any of those threads do not complete, this can lead to that the >>> application is delayed at the exit. >>> However, we don't block forever, and the delay can only be observed upon >>> a failure. >>> >>> >>>> Also we seem to exit while still holding the critical section - how >>>> does that work? >>>> >>> Right. 
>>> We make the thread at the process-exit path call exit() from withing >>> critical section block. >>> This way it is ensured no other exit-related code is executed at the >>> same moment, and a race is avoided. >>> >>> Sincerely yours, >>> Ivan >>> >>>> Thanks, >>>> David >>>> >>>>> Sincerely yours, >>>>> Ivan >>>>> >>>>> On 26.10.2014 19:01, Daniel D. Daugherty wrote: >>>>>> On 10/25/14 12:23 PM, Ivan Gerasimov wrote: >>>>>>> >>>>>>> On 25.10.2014 3:06, Daniel D. Daugherty wrote: >>>>>>>> On 10/1/14 3:07 AM, Ivan Gerasimov wrote: >>>>>>>>> Hello! >>>>>>>>> >>>>>>>>> The tests that continue to fail with wrong exit codes suggest that >>>>>>>>> the fix for JDK-8057744 wasn't sufficient. >>>>>>>>> Here's another proposal, which expands the synchronized portion of >>>>>>>>> the code. >>>>>>>>> It is proposed to make the exiting process wait for the threads >>>>>>>>> that have already started exiting. >>>>>>>>> This should help to make sure that no thread is executing any >>>>>>>>> potentially racy code concurrently with the exiting process. >>>>>>>>> >>>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 >>>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ >>>>>>>> >>>>>>>> Finally got a chance to look at the official version of fix. >>>>>>>> >>>>>>>> Thumbs up! >>>>>>>> >>>>>>>> src/os/windows/vm/os_windows.cpp >>>>>>>> No comments. >>>>>>>> >>>>>>> Thank you Daniel! >>>>>>> >>>>>>> I assume the change needs the second hotspot reviewer? >>>>>> >>>>>> Yes, HotSpot changes always need two reviewers. David Holmes >>>>>> chimed in on this thread. You should ask him if he can be >>>>>> counted as a reviewer. >>>>>> >>>>>> >>>>>>> What would be the best time for pushing this fix? >>>>>> >>>>>> Let's go for Wednesday again so we have a full week of testing >>>>>> to evaluate this latest tweak. >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Sincerely yours, >>>>>>> Ivan >>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> P.S. 
>>>>>>>> We had another sighting of an exit_code == 60115 test failure >>>>>>>> this past week so while your previous fix greatly reduced the >>>>>>>> odds of this race, I'm looking forward to seeing this new >>>>>>>> version in action... >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Comments, suggestion are welcome! >>>>>>>>> >>>>>>>>> Sincerely yours, >>>>>>>>> Ivan >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>> >> >> > From ivan.gerasimov at oracle.com Wed Oct 29 10:56:29 2014 From: ivan.gerasimov at oracle.com (Ivan Gerasimov) Date: Wed, 29 Oct 2014 13:56:29 +0300 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <5450BE36.5000405@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> <544F0825.7040905@oracle.com> <5450B249.2020603@oracle.com> <5450BE36.5000405@oracle.com> Message-ID: <5450C7DD.9030600@oracle.com> On 29.10.2014 13:15, David Holmes wrote: > > Great improvement thanks! One further suggestion - after line 3731 add > a short overview eg: > > // Basic approach: > // - Each exiting thread registers its intent to exit and then does so. > // - A thread trying to terminate the process must wait for all > // threads currently exiting to complete their exit > > > Typo: > 3802 // _endthreadex() comleted. > Thanks! Fixed the typo and added the comment. The updated webrev is at the same location: http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/ > >>> >> >> I have a couple of thought about this: >> >> First, the thread that ends the process waits for all the exiting >> threads to complete. By the time it returns from WaitForMultipleObjects, >> those exiting threads aren't running anymore, so the race is avoided >> (unless it's the race with scheduler). 
> > That's not quite true and the key part of my point. At some point > during the thread termination process the exiting thread has to signal > any waiting thread - and it is still running at the point. We don't > know whether the action that causes interference with the process > termination happens before or after the signalling is done. > I was thinking if the dedicated scheduler thread can do the signalling. Otherwise, some poorly behaving thread could die without updating its status, leaving other threads waiting for its completion forever. Though, I don't know how it's actually done. I've read in several places (for example, a comment at the bottom of [1]), that waiting for a thread to complete with WaitForXXX() function ensures the exit code of that thread is set. In particular, GetExitCodeThread() should not return STILL_ACTIVE for that thread anymore. This lets me hope that the code, which sets the exit code (which is potentially racy) has already been executed by the time WaitForXXX() returns. [1] http://msdn.microsoft.com/en-us/library/windows/desktop/ms683190(v=vs.85).aspx > Anyway this narrows the window even further even if it may not close > it completely. > That's my hope too :) Otherwise, I'll be (almost) out of ideas how to work this issue around. Sincerely yours, Ivan From staffan.larsen at oracle.com Wed Oct 29 11:52:52 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 29 Oct 2014 12:52:52 +0100 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <54503EC2.6070601@oracle.com> References: <54503EC2.6070601@oracle.com> Message-ID: This looks good - and thanks for the detailed description! My only comment is that you chose to ignore the error in only some of the methods - I assume these were the methods where you have encountered a problem in. Should we write a comment about that, or should we ignore the error in all methods? 
/Staffan On 29 okt 2014, at 02:11, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6988950 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ > > > Summary: > > The failing scenario: > The debugger and the debuggee are well aware a VM shutdown has been started in the target process. > The debugger at this point is not expected to send any commands to the JDWP agent. > However, the JDI layer (debugger side) and the jdwp agent (debuggee side) > are not in sync with the consumer layers. > > One reason is because the test debugger does not invoke the JDI method VirtualMachine.dispose(). > Another reason is that the Debugger and the debuggee processes are uneasy to sync in general. > > As a result the following steps are possible: > - The test debugger sends a 'quit' command to the test debuggee > - The debuggee is normally exiting > - The jdwp backend reports (over the jdwp protocol) an anonymous class unload event > - The JDI InternalEventHandler thread handles the ClassUnloadEvent event > - The InternalEventHandler wants to uncache the matching reference type. > If there is more than one class with the same host class signature, it can't distinguish them, > and so, deletes all references and re-retrieves them again (see tracing below): > MY_TRACE: JDI: VirtualMachineImpl.retrieveClassesBySignature: sig=Ljava/lang/invoke/LambdaForm$DMH; > - The jdwp backend debugLoop_run() gets the command from JDI and calls the functions > classesForSignature() and classStatus() recursively. 
> - The classStatus() makes a call to the JVMTI GetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE > - As a result the jdwp backend reports the JVMTI error to the JDI, and so, the test fails > > For details, see the analysis in bug report closed as a dup of the bug 6988950: > https://bugs.openjdk.java.net/browse/JDK-8024865 > > Some similar cases can be found in the two bug reports (6988950 and 8024865) describing this issue. > > The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as it is normal at the VM shutdown. > The original jdwp backend implementation had a similar approach for the raw monitor functions. > Threy use the ignore_vm_death() to workaround the JVMTI_ERROR_WRONG_PHASE errors. > For reference, please, see the file: src/share/back/util.c > > > Testing: > Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests > > > Thanks, > Serguei > From staffan.larsen at oracle.com Wed Oct 29 11:56:31 2014 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Wed, 29 Oct 2014 12:56:31 +0100 Subject: RFR: JDK-8062135; serviceability/threads/TestFalseDeadLock.java should be quarantined In-Reply-To: <5450AA3F.4030309@oracle.com> References: <5450AA3F.4030309@oracle.com> Message-ID: Looks good! Thanks, /Staffan On 29 okt 2014, at 09:50, Alex Schenkman wrote: > Please review this exluded test: > > http://cr.openjdk.java.net/~miauno/8062135/webrev.00/ > > Thank you! > -- > Alex Schenkman > Java VM SQE Stockholm
From david.holmes at oracle.com Wed Oct 29 12:23:17 2014 From: david.holmes at oracle.com (David Holmes) Date: Wed, 29 Oct 2014 22:23:17 +1000 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <5450C7DD.9030600@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> <544F0825.7040905@oracle.com> <5450B249.2020603@oracle.com> <5450BE36.5000405@oracle.com> <5450C7DD.9030600@oracle.com> Message-ID: <5450DC35.4090104@oracle.com> Thanks Ivan - good to go. David On 29/10/2014 8:56 PM, Ivan Gerasimov wrote: > > On 29.10.2014 13:15, David Holmes wrote: >> >> Great improvement thanks! One further suggestion - after line 3731 add >> a short overview eg: >> >> // Basic approach: >> // - Each exiting thread registers its intent to exit and then does so. >> // - A thread trying to terminate the process must wait for all >> // threads currently exiting to complete their exit >> >> >> Typo: >> 3802 // _endthreadex() comleted. >> > Thanks! Fixed the typo and added the comment. > The updated webrev is at the same location: > http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/ > >> >>>> >>> >>> I have a couple of thought about this: >>> >>> First, the thread that ends the process waits for all the exiting >>> threads to complete. By the time it returns from WaitForMultipleObjects, >>> those exiting threads aren't running anymore, so the race is avoided >>> (unless it's the race with scheduler). >> >> That's not quite true and the key part of my point. At some point >> during the thread termination process the exiting thread has to signal >> any waiting thread - and it is still running at the point. We don't >> know whether the action that causes interference with the process >> termination happens before or after the signalling is done.
>> > I was thinking if the dedicated scheduler thread can do the signalling. > Otherwise, some poorly behaving thread could die without updating its > status, leaving other threads waiting for its completion forever. > Though, I don't know how it's actually done. > > I've read in several places (for example, a comment at the bottom of > [1]), that waiting for a thread to complete with WaitForXXX() function > ensures the exit code of that thread is set. In particular, > GetExitCodeThread() should not return STILL_ACTIVE for that thread anymore. > > This lets me hope that the code, which sets the exit code (which is > potentially racy) has already been executed by the time WaitForXXX() > returns. > > [1] > http://msdn.microsoft.com/en-us/library/windows/desktop/ms683190(v=vs.85).aspx > > >> Anyway this narrows the window even further even if it may not close >> it completely. >> > > That's my hope too :) > Otherwise, I'll be (almost) out of ideas how to work this issue around. > > Sincerely yours, > Ivan > > From alex.schenkman at oracle.com Wed Oct 29 13:33:44 2014 From: alex.schenkman at oracle.com (Alex Schenkman) Date: Wed, 29 Oct 2014 14:33:44 +0100 Subject: RFR: JDK-8062135; serviceability/threads/TestFalseDeadLock.java should be quarantined In-Reply-To: References: <5450AA3F.4030309@oracle.com> Message-ID: <5450ECB8.3060806@oracle.com> Could you push it for me, please? Thanks! On 2014-10-29 12:56, Staffan Larsen wrote: > Looks good! > > Thanks, > /Staffan > > > On 29 okt 2014, at 09:50, Alex Schenkman > wrote: > >> Please review this exluded test: >> >> http://cr.openjdk.java.net/~miauno/8062135/webrev.00/ >> >> >> Thank you! >> -- >> Alex Schenkman >> Java VM SQE Stockholm > -- Alex Schenkman Java VM SQE Stockholm -------------- next part -------------- A non-text attachment was scrubbed...
Name: 8062135.patch Type: text/x-patch Size: 798 bytes Desc: not available From daniel.daugherty at oracle.com Wed Oct 29 13:46:37 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 29 Oct 2014 07:46:37 -0600 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <54503EC2.6070601@oracle.com> References: <54503EC2.6070601@oracle.com> Message-ID: <5450EFBD.8060504@oracle.com> Serguei, Do you have a scenario for the non-anonymous class case? This bug (6988950) has been around much longer than anonymous classes... Dan On 10/28/14 7:11 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6988950 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ > > > > Summary: > > The failing scenario: > The debugger and the debuggee are well aware a VM shutdown has > been started in the target process. > The debugger at this point is not expected to send any commands > to the JDWP agent. > However, the JDI layer (debugger side) and the jdwp agent > (debuggee side) > are not in sync with the consumer layers. > > One reason is because the test debugger does not invoke the JDI > method VirtualMachine.dispose(). > Another reason is that the Debugger and the debuggee processes > are uneasy to sync in general. > > As a result the following steps are possible: > - The test debugger sends a 'quit' command to the test debuggee > - The debuggee is normally exiting > - The jdwp backend reports (over the jdwp protocol) an > anonymous class unload event > - The JDI InternalEventHandler thread handles the > ClassUnloadEvent event > - The InternalEventHandler wants to uncache the matching > reference type.
> If there is more than one class with the same host class > signature, it can't distinguish them, > and so, deletes all references and re-retrieves them again > (see tracing below): > MY_TRACE: JDI: > VirtualMachineImpl.retrieveClassesBySignature: > sig=Ljava/lang/invoke/LambdaForm$DMH; > - The jdwp backend debugLoop_run() gets the command from JDI > and calls the functions > classesForSignature() and classStatus() recursively. > - The classStatus() makes a call to the JVMTI GetClassStatus() > and gets the JVMTI_ERROR_WRONG_PHASE > - As a result the jdwp backend reports the JVMTI error to the > JDI, and so, the test fails > > For details, see the analysis in bug report closed as a dup of > the bug 6988950: > https://bugs.openjdk.java.net/browse/JDK-8024865 > > Some similar cases can be found in the two bug reports (6988950 > and 8024865) describing this issue. > > The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as > it is normal at the VM shutdown. > The original jdwp backend implementation had a similar approach > for the raw monitor functions. > Threy use the ignore_vm_death() to workaround the > JVMTI_ERROR_WRONG_PHASE errors. > For reference, please, see the file: src/share/back/util.c > > > Testing: > Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests > > > Thanks, > Serguei > From jaroslav.bachorik at oracle.com Wed Oct 29 13:50:04 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 29 Oct 2014 14:50:04 +0100 Subject: RFR 8061616: HotspotDiagnosticMXBean.getVMOption() throws IllegalArgumentException for flags of type double Message-ID: <5450F08C.5050804@oracle.com> Please review the following change Issue : https://bugs.openjdk.java.net/browse/JDK-8061616 Webrev : (jdk): http://cr.openjdk.java.net/~jbachorik/8061616/jdk/webrev.00/ (hotspot): http://cr.openjdk.java.net/~jbachorik/8061616/hotspot/webrev.00/ Currently the double values for VM flags are not supported in HotspotDiagnosticMXBean. 
This change is about implementing this support similarly as it is done for eg. long values. Thanks, -JB- From dmitry.samersoff at oracle.com Wed Oct 29 13:59:32 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 29 Oct 2014 16:59:32 +0300 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <54503EC2.6070601@oracle.com> References: <54503EC2.6070601@oracle.com> Message-ID: <5450F2C4.9020604@oracle.com> Serguei, What happens in a caller function if we ignore the error? e. g. getMethodClass has the code: ... error = methodClass(method, &clazz); if ( error != JVMTI_ERROR_NONE ) { EXIT_ERROR(error,"Can't get jclass for a methodID, invalid?"); return NULL; } return clazz; after the fix it probably will return NULL. Is it correct? -Dmitry On 2014-10-29 04:11, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6988950 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ > > > > Summary: > > The failing scenario: > The debugger and the debuggee are well aware a VM shutdown has been > started in the target process. > The debugger at this point is not expected to send any commands to > the JDWP agent. > However, the JDI layer (debugger side) and the jdwp agent (debuggee > side) > are not in sync with the consumer layers. > > One reason is because the test debugger does not invoke the JDI > method VirtualMachine.dispose(). > Another reason is that the Debugger and the debuggee processes are > uneasy to sync in general. > > As a result the following steps are possible: > - The test debugger sends a 'quit' command to the test debuggee > - The debuggee is normally exiting > - The jdwp backend reports (over the jdwp protocol) an anonymous > class unload event > - The JDI InternalEventHandler thread handles the > ClassUnloadEvent event > - The InternalEventHandler wants to uncache the matching > reference type. 
> If there is more than one class with the same host class > signature, it can't distinguish them, > and so, deletes all references and re-retrieves them again (see > tracing below): > MY_TRACE: JDI: VirtualMachineImpl.retrieveClassesBySignature: > sig=Ljava/lang/invoke/LambdaForm$DMH; > - The jdwp backend debugLoop_run() gets the command from JDI and > calls the functions > classesForSignature() and classStatus() recursively. > - The classStatus() makes a call to the JVMTI GetClassStatus() > and gets the JVMTI_ERROR_WRONG_PHASE > - As a result the jdwp backend reports the JVMTI error to the > JDI, and so, the test fails > > For details, see the analysis in bug report closed as a dup of the > bug 6988950: > https://bugs.openjdk.java.net/browse/JDK-8024865 > > Some similar cases can be found in the two bug reports (6988950 and > 8024865) describing this issue. > > The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as > it is normal at the VM shutdown. > The original jdwp backend implementation had a similar approach for > the raw monitor functions. > Threy use the ignore_vm_death() to workaround the > JVMTI_ERROR_WRONG_PHASE errors. > For reference, please, see the file: src/share/back/util.c > > > Testing: > Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests > > > Thanks, > Serguei > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From serguei.spitsyn at oracle.com Wed Oct 29 14:09:14 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 29 Oct 2014 07:09:14 -0700 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5450EFBD.8060504@oracle.com> References: <54503EC2.6070601@oracle.com> <5450EFBD.8060504@oracle.com> Message-ID: <5450F50A.7070903@oracle.com> Dan, I do not have a scenario for non-anonymous class case while they should exist. 
The anonymous scenario was well reproducible in a range of the jdk8 ea builds. However, it is not as such anymore. I was not able to reproduce it in jdk9, nor 8u40. But I believe the shutdown race issue is still there and can hit anytime. Thanks, Serguei On 10/29/14 6:46 AM, Daniel D. Daugherty wrote: > Serguei, > > Do you have a scenario for the non-anonymous class case? This bug > (6988950) has been around much longer than anonymous classes... > > Dan > > > On 10/28/14 7:11 PM, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6988950 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >> >> >> >> Summary: >> >> The failing scenario: >> The debugger and the debuggee are well aware a VM shutdown has >> been started in the target process. >> The debugger at this point is not expected to send any commands >> to the JDWP agent. >> However, the JDI layer (debugger side) and the jdwp agent >> (debuggee side) >> are not in sync with the consumer layers. >> >> One reason is because the test debugger does not invoke the JDI >> method VirtualMachine.dispose(). >> Another reason is that the Debugger and the debuggee processes >> are uneasy to sync in general. >> >> As a result the following steps are possible: >> - The test debugger sends a 'quit' command to the test debuggee >> - The debuggee is normally exiting >> - The jdwp backend reports (over the jdwp protocol) an >> anonymous class unload event >> - The JDI InternalEventHandler thread handles the >> ClassUnloadEvent event >> - The InternalEventHandler wants to uncache the matching >> reference type. 
>> If there is more than one class with the same host class >> signature, it can't distinguish them, >> and so, deletes all references and re-retrieves them again >> (see tracing below): >> MY_TRACE: JDI: >> VirtualMachineImpl.retrieveClassesBySignature: >> sig=Ljava/lang/invoke/LambdaForm$DMH; >> - The jdwp backend debugLoop_run() gets the command from JDI >> and calls the functions >> classesForSignature() and classStatus() recursively. >> - The classStatus() makes a call to the JVMTI GetClassStatus() >> and gets the JVMTI_ERROR_WRONG_PHASE >> - As a result the jdwp backend reports the JVMTI error to the >> JDI, and so, the test fails >> >> For details, see the analysis in bug report closed as a dup of >> the bug 6988950: >> https://bugs.openjdk.java.net/browse/JDK-8024865 >> >> Some similar cases can be found in the two bug reports (6988950 >> and 8024865) describing this issue. >> >> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error >> as it is normal at the VM shutdown. >> The original jdwp backend implementation had a similar approach >> for the raw monitor functions. >> Threy use the ignore_vm_death() to workaround the >> JVMTI_ERROR_WRONG_PHASE errors. >> For reference, please, see the file: src/share/back/util.c >> >> >> Testing: >> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >> >> >> Thanks, >> Serguei >> > From serguei.spitsyn at oracle.com Wed Oct 29 14:14:16 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 29 Oct 2014 07:14:16 -0700 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: References: <54503EC2.6070601@oracle.com> Message-ID: <5450F638.70503@oracle.com> Staffan, Thank you for the review! I agree that maybe it'd make sense to ignore in all functions. But I'm trying to be careful - let's see if this approach works first. It is a good idea to add a comment. 
Thanks, Serguei On 10/29/14 4:52 AM, Staffan Larsen wrote: > This looks good - and thanks for the detailed description! > > My only comment is that you chose to ignore the error in only some of the methods - I assume these were the methods where you have encountered a problem in. Should we write a comment about that, or should we ignore the error in all methods? > > /Staffan > > On 29 okt 2014, at 02:11, serguei.spitsyn at oracle.com wrote: > >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6988950 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >> >> >> Summary: >> >> The failing scenario: >> The debugger and the debuggee are well aware a VM shutdown has been started in the target process. >> The debugger at this point is not expected to send any commands to the JDWP agent. >> However, the JDI layer (debugger side) and the jdwp agent (debuggee side) >> are not in sync with the consumer layers. >> >> One reason is because the test debugger does not invoke the JDI method VirtualMachine.dispose(). >> Another reason is that the Debugger and the debuggee processes are uneasy to sync in general. >> >> As a result the following steps are possible: >> - The test debugger sends a 'quit' command to the test debuggee >> - The debuggee is normally exiting >> - The jdwp backend reports (over the jdwp protocol) an anonymous class unload event >> - The JDI InternalEventHandler thread handles the ClassUnloadEvent event >> - The InternalEventHandler wants to uncache the matching reference type. 
>> If there is more than one class with the same host class signature, it can't distinguish them, >> and so, deletes all references and re-retrieves them again (see tracing below): >> MY_TRACE: JDI: VirtualMachineImpl.retrieveClassesBySignature: sig=Ljava/lang/invoke/LambdaForm$DMH; >> - The jdwp backend debugLoop_run() gets the command from JDI and calls the functions >> classesForSignature() and classStatus() recursively. >> - The classStatus() makes a call to the JVMTI GetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE >> - As a result the jdwp backend reports the JVMTI error to the JDI, and so, the test fails >> >> For details, see the analysis in bug report closed as a dup of the bug 6988950: >> https://bugs.openjdk.java.net/browse/JDK-8024865 >> >> Some similar cases can be found in the two bug reports (6988950 and 8024865) describing this issue. >> >> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as it is normal at the VM shutdown. >> The original jdwp backend implementation had a similar approach for the raw monitor functions. >> Threy use the ignore_vm_death() to workaround the JVMTI_ERROR_WRONG_PHASE errors. >> For reference, please, see the file: src/share/back/util.c >> >> >> Testing: >> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >> >> >> Thanks, >> Serguei >> From serguei.spitsyn at oracle.com Wed Oct 29 14:35:30 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 29 Oct 2014 07:35:30 -0700 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5450F2C4.9020604@oracle.com> References: <54503EC2.6070601@oracle.com> <5450F2C4.9020604@oracle.com> Message-ID: <5450FB32.4020602@oracle.com> Dmitry, Yes, it will return NULL if the case of JVMTI_ERROR_WRONG_PHASE error. It is Ok for the getMethodClass() as the NULL return value listed in the comment. It seems to be Ok for other two uses of the methodClass(). Thanks! 
Serguei On 10/29/14 6:59 AM, Dmitry Samersoff wrote: > Serguei, > > What happens in a caller function if we ignore the error? > > e. g. getMethodClass has the code: > ... > > error = methodClass(method, &clazz); > if ( error != JVMTI_ERROR_NONE ) { > EXIT_ERROR(error,"Can't get jclass for a methodID, invalid?"); > return NULL; > } > return clazz; > > after the fix it probably will return NULL. Is it correct? > > -Dmitry > > On 2014-10-29 04:11, serguei.spitsyn at oracle.com wrote: >> Please, review the fix for: >> https://bugs.openjdk.java.net/browse/JDK-6988950 >> >> >> Open webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >> >> >> >> Summary: >> >> The failing scenario: >> The debugger and the debuggee are well aware a VM shutdown has been >> started in the target process. >> The debugger at this point is not expected to send any commands to >> the JDWP agent. >> However, the JDI layer (debugger side) and the jdwp agent (debuggee >> side) >> are not in sync with the consumer layers. >> >> One reason is because the test debugger does not invoke the JDI >> method VirtualMachine.dispose(). >> Another reason is that the Debugger and the debuggee processes are >> uneasy to sync in general. >> >> As a result the following steps are possible: >> - The test debugger sends a 'quit' command to the test debuggee >> - The debuggee is normally exiting >> - The jdwp backend reports (over the jdwp protocol) an anonymous >> class unload event >> - The JDI InternalEventHandler thread handles the >> ClassUnloadEvent event >> - The InternalEventHandler wants to uncache the matching >> reference type. 
>> If there is more than one class with the same host class >> signature, it can't distinguish them, >> and so, deletes all references and re-retrieves them again (see >> tracing below): >> MY_TRACE: JDI: VirtualMachineImpl.retrieveClassesBySignature: >> sig=Ljava/lang/invoke/LambdaForm$DMH; >> - The jdwp backend debugLoop_run() gets the command from JDI and >> calls the functions >> classesForSignature() and classStatus() recursively. >> - The classStatus() makes a call to the JVMTI GetClassStatus() >> and gets the JVMTI_ERROR_WRONG_PHASE >> - As a result the jdwp backend reports the JVMTI error to the >> JDI, and so, the test fails >> >> For details, see the analysis in bug report closed as a dup of the >> bug 6988950: >> https://bugs.openjdk.java.net/browse/JDK-8024865 >> >> Some similar cases can be found in the two bug reports (6988950 and >> 8024865) describing this issue. >> >> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as >> it is normal at the VM shutdown. >> The original jdwp backend implementation had a similar approach for >> the raw monitor functions. >> Threy use the ignore_vm_death() to workaround the >> JVMTI_ERROR_WRONG_PHASE errors. >> For reference, please, see the file: src/share/back/util.c >> >> >> Testing: >> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >> >> >> Thanks, >> Serguei >> > From serguei.spitsyn at oracle.com Wed Oct 29 14:39:17 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 29 Oct 2014 07:39:17 -0700 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5450FB32.4020602@oracle.com> References: <54503EC2.6070601@oracle.com> <5450F2C4.9020604@oracle.com> <5450FB32.4020602@oracle.com> Message-ID: <5450FC15.4080807@oracle.com> On 10/29/14 7:35 AM, serguei.spitsyn at oracle.com wrote: > Dmitry, > > Yes, it will return NULL if the case of JVMTI_ERROR_WRONG_PHASE error. 
A typo above: "if the case" => "in the case" Thanks, Serguei > It is Ok for the getMethodClass() as the NULL return value listed in > the comment. > It seems to be Ok for other two uses of the methodClass(). > > Thanks! > Serguei > > On 10/29/14 6:59 AM, Dmitry Samersoff wrote: >> Serguei, >> >> What happens in a caller function if we ignore the error? >> >> e. g. getMethodClass has the code: >> ... >> >> error = methodClass(method, &clazz); >> if ( error != JVMTI_ERROR_NONE ) { >> EXIT_ERROR(error,"Can't get jclass for a methodID, invalid?"); >> return NULL; >> } >> return clazz; >> >> after the fix it probably will return NULL. Is it correct? >> >> -Dmitry >> >> On 2014-10-29 04:11, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>> >>> >>> >>> >>> Summary: >>> >>> The failing scenario: >>> The debugger and the debuggee are well aware a VM shutdown has >>> been >>> started in the target process. >>> The debugger at this point is not expected to send any >>> commands to >>> the JDWP agent. >>> However, the JDI layer (debugger side) and the jdwp agent >>> (debuggee >>> side) >>> are not in sync with the consumer layers. >>> >>> One reason is because the test debugger does not invoke the JDI >>> method VirtualMachine.dispose(). >>> Another reason is that the Debugger and the debuggee processes >>> are >>> uneasy to sync in general. >>> >>> As a result the following steps are possible: >>> - The test debugger sends a 'quit' command to the test debuggee >>> - The debuggee is normally exiting >>> - The jdwp backend reports (over the jdwp protocol) an >>> anonymous >>> class unload event >>> - The JDI InternalEventHandler thread handles the >>> ClassUnloadEvent event >>> - The InternalEventHandler wants to uncache the matching >>> reference type. 
>>> If there is more than one class with the same host class >>> signature, it can't distinguish them, >>> and so, deletes all references and re-retrieves them again >>> (see >>> tracing below): >>> MY_TRACE: JDI: >>> VirtualMachineImpl.retrieveClassesBySignature: >>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>> - The jdwp backend debugLoop_run() gets the command from JDI >>> and >>> calls the functions >>> classesForSignature() and classStatus() recursively. >>> - The classStatus() makes a call to the JVMTI GetClassStatus() >>> and gets the JVMTI_ERROR_WRONG_PHASE >>> - As a result the jdwp backend reports the JVMTI error to the >>> JDI, and so, the test fails >>> >>> For details, see the analysis in bug report closed as a dup of >>> the >>> bug 6988950: >>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>> >>> Some similar cases can be found in the two bug reports >>> (6988950 and >>> 8024865) describing this issue. >>> >>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as >>> it is normal at the VM shutdown. >>> The original jdwp backend implementation had a similar >>> approach for >>> the raw monitor functions. >>> Threy use the ignore_vm_death() to workaround the >>> JVMTI_ERROR_WRONG_PHASE errors. >>> For reference, please, see the file: src/share/back/util.c >>> >>> >>> Testing: >>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>> >>> >>> Thanks, >>> Serguei >>> >> > From dmitry.samersoff at oracle.com Wed Oct 29 14:50:53 2014 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Wed, 29 Oct 2014 17:50:53 +0300 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5450FB32.4020602@oracle.com> References: <54503EC2.6070601@oracle.com> <5450F2C4.9020604@oracle.com> <5450FB32.4020602@oracle.com> Message-ID: <5450FECD.3010504@oracle.com> Serguei, Thank you! Fix looks good for me. 
-Dmitry On 2014-10-29 17:35, serguei.spitsyn at oracle.com wrote: > Dmitry, > > Yes, it will return NULL if the case of JVMTI_ERROR_WRONG_PHASE error. > It is Ok for the getMethodClass() as the NULL return value listed in the > comment. > It seems to be Ok for other two uses of the methodClass(). > > Thanks! > Serguei > > On 10/29/14 6:59 AM, Dmitry Samersoff wrote: >> Serguei, >> >> What happens in a caller function if we ignore the error? >> >> e. g. getMethodClass has the code: >> ... >> >> error = methodClass(method, &clazz); >> if ( error != JVMTI_ERROR_NONE ) { >> EXIT_ERROR(error,"Can't get jclass for a methodID, invalid?"); >> return NULL; >> } >> return clazz; >> >> after the fix it probably will return NULL. Is it correct? >> >> -Dmitry >> >> On 2014-10-29 04:11, serguei.spitsyn at oracle.com wrote: >>> Please, review the fix for: >>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>> >>> >>> Open webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>> >>> >>> >>> >>> Summary: >>> >>> The failing scenario: >>> The debugger and the debuggee are well aware a VM shutdown has >>> been >>> started in the target process. >>> The debugger at this point is not expected to send any commands to >>> the JDWP agent. >>> However, the JDI layer (debugger side) and the jdwp agent >>> (debuggee >>> side) >>> are not in sync with the consumer layers. >>> >>> One reason is because the test debugger does not invoke the JDI >>> method VirtualMachine.dispose(). >>> Another reason is that the Debugger and the debuggee processes are >>> uneasy to sync in general. 
>>> >>> As a result the following steps are possible: >>> - The test debugger sends a 'quit' command to the test debuggee >>> - The debuggee is normally exiting >>> - The jdwp backend reports (over the jdwp protocol) an anonymous >>> class unload event >>> - The JDI InternalEventHandler thread handles the >>> ClassUnloadEvent event >>> - The InternalEventHandler wants to uncache the matching >>> reference type. >>> If there is more than one class with the same host class >>> signature, it can't distinguish them, >>> and so, deletes all references and re-retrieves them again >>> (see >>> tracing below): >>> MY_TRACE: JDI: >>> VirtualMachineImpl.retrieveClassesBySignature: >>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>> - The jdwp backend debugLoop_run() gets the command from JDI and >>> calls the functions >>> classesForSignature() and classStatus() recursively. >>> - The classStatus() makes a call to the JVMTI GetClassStatus() >>> and gets the JVMTI_ERROR_WRONG_PHASE >>> - As a result the jdwp backend reports the JVMTI error to the >>> JDI, and so, the test fails >>> >>> For details, see the analysis in bug report closed as a dup of the >>> bug 6988950: >>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>> >>> Some similar cases can be found in the two bug reports (6988950 >>> and >>> 8024865) describing this issue. >>> >>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as >>> it is normal at the VM shutdown. >>> The original jdwp backend implementation had a similar approach >>> for >>> the raw monitor functions. >>> Threy use the ignore_vm_death() to workaround the >>> JVMTI_ERROR_WRONG_PHASE errors. >>> For reference, please, see the file: src/share/back/util.c >>> >>> >>> Testing: >>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>> >>> >>> Thanks, >>> Serguei >>> >> > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. 
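
[Editorial sketch] The approach discussed in this thread, treating JVMTI_ERROR_WRONG_PHASE as benign during VM shutdown and letting the caller see a NULL result, can be illustrated roughly as follows. This is a minimal, self-contained sketch and not the code from the webrev. The values 0 and 112 mirror JVMTI_ERROR_NONE and JVMTI_ERROR_WRONG_PHASE (112 appears in the subject line); everything else is an assumption made for illustration: the `sketch_` types, the `vm_dead` flag, the `fatal` out-parameter standing in for EXIT_ERROR, and the body given here to `ignore_wrong_phase()`, whose real implementation is not shown in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for JVMTI error codes. 0 and 112 mirror JVMTI_ERROR_NONE and
 * JVMTI_ERROR_WRONG_PHASE; SKETCH_ERROR_OTHER is an arbitrary value used
 * only to represent "some other failure". */
typedef enum {
    SKETCH_ERROR_NONE = 0,
    SKETCH_ERROR_OTHER = 1,
    SKETCH_ERROR_WRONG_PHASE = 112
} sketch_error;

/* Stand-in for jclass, so the sketch compiles without jvmti.h. */
typedef const char *sketch_class;

/* Assumed behavior: map WRONG_PHASE to "no error", pass everything else on. */
sketch_error ignore_wrong_phase(sketch_error error) {
    return (error == SKETCH_ERROR_WRONG_PHASE) ? SKETCH_ERROR_NONE : error;
}

/* Hypothetical stand-in for the JVMTI-backed lookup. When vm_dead is
 * non-zero it behaves like a call made after VM shutdown has started:
 * no class is produced and the WRONG_PHASE error is filtered out. */
sketch_error method_class(int vm_dead, sketch_class *out) {
    assert(out != NULL);
    *out = NULL;
    if (vm_dead) {
        return ignore_wrong_phase(SKETCH_ERROR_WRONG_PHASE);
    }
    *out = "LExample;";
    return SKETCH_ERROR_NONE;
}

/* Caller in the shape of the quoted getMethodClass() snippet. *fatal is set
 * where the real code would invoke EXIT_ERROR. */
sketch_class get_method_class(int vm_dead, int *fatal) {
    sketch_class clazz = NULL;
    sketch_error error = method_class(vm_dead, &clazz);
    *fatal = (error != SKETCH_ERROR_NONE);
    if (*fatal) {
        return NULL;
    }
    return clazz;   /* NULL, without a fatal error, during shutdown */
}
```

In this sketch a lookup during shutdown yields NULL without taking the fatal path, which matches the answer given to Dmitry's question. The cost of the design is that every caller has to tolerate a NULL result.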
From serguei.spitsyn at oracle.com Wed Oct 29 14:54:16 2014 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 29 Oct 2014 07:54:16 -0700 Subject: RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112) In-Reply-To: <5450FECD.3010504@oracle.com> References: <54503EC2.6070601@oracle.com> <5450F2C4.9020604@oracle.com> <5450FB32.4020602@oracle.com> <5450FECD.3010504@oracle.com> Message-ID: <5450FF98.7000902@oracle.com> Thanks, Dmitry! I'm still in a process of checking the consequences of the ignore_wrong_phase() calls. Thanks, Serguei On 10/29/14 7:50 AM, Dmitry Samersoff wrote: > Serguei, > > Thank you! > > Fix looks good for me. > > -Dmitry > > On 2014-10-29 17:35, serguei.spitsyn at oracle.com wrote: >> Dmitry, >> >> Yes, it will return NULL if the case of JVMTI_ERROR_WRONG_PHASE error. >> It is Ok for the getMethodClass() as the NULL return value listed in the >> comment. >> It seems to be Ok for other two uses of the methodClass(). >> >> Thanks! >> Serguei >> >> On 10/29/14 6:59 AM, Dmitry Samersoff wrote: >>> Serguei, >>> >>> What happens in a caller function if we ignore the error? >>> >>> e. g. getMethodClass has the code: >>> ... >>> >>> error = methodClass(method, &clazz); >>> if ( error != JVMTI_ERROR_NONE ) { >>> EXIT_ERROR(error,"Can't get jclass for a methodID, invalid?"); >>> return NULL; >>> } >>> return clazz; >>> >>> after the fix it probably will return NULL. Is it correct? >>> >>> -Dmitry >>> >>> On 2014-10-29 04:11, serguei.spitsyn at oracle.com wrote: >>>> Please, review the fix for: >>>> https://bugs.openjdk.java.net/browse/JDK-6988950 >>>> >>>> >>>> Open webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ >>>> >>>> >>>> >>>> >>>> Summary: >>>> >>>> The failing scenario: >>>> The debugger and the debuggee are well aware a VM shutdown has >>>> been >>>> started in the target process. 
>>>> The debugger at this point is not expected to send any commands to >>>> the JDWP agent. >>>> However, the JDI layer (debugger side) and the jdwp agent >>>> (debuggee >>>> side) >>>> are not in sync with the consumer layers. >>>> >>>> One reason is because the test debugger does not invoke the JDI >>>> method VirtualMachine.dispose(). >>>> Another reason is that the Debugger and the debuggee processes are >>>> uneasy to sync in general. >>>> >>>> As a result the following steps are possible: >>>> - The test debugger sends a 'quit' command to the test debuggee >>>> - The debuggee is normally exiting >>>> - The jdwp backend reports (over the jdwp protocol) an anonymous >>>> class unload event >>>> - The JDI InternalEventHandler thread handles the >>>> ClassUnloadEvent event >>>> - The InternalEventHandler wants to uncache the matching >>>> reference type. >>>> If there is more than one class with the same host class >>>> signature, it can't distinguish them, >>>> and so, deletes all references and re-retrieves them again >>>> (see >>>> tracing below): >>>> MY_TRACE: JDI: >>>> VirtualMachineImpl.retrieveClassesBySignature: >>>> sig=Ljava/lang/invoke/LambdaForm$DMH; >>>> - The jdwp backend debugLoop_run() gets the command from JDI and >>>> calls the functions >>>> classesForSignature() and classStatus() recursively. >>>> - The classStatus() makes a call to the JVMTI GetClassStatus() >>>> and gets the JVMTI_ERROR_WRONG_PHASE >>>> - As a result the jdwp backend reports the JVMTI error to the >>>> JDI, and so, the test fails >>>> >>>> For details, see the analysis in bug report closed as a dup of the >>>> bug 6988950: >>>> https://bugs.openjdk.java.net/browse/JDK-8024865 >>>> >>>> Some similar cases can be found in the two bug reports (6988950 >>>> and >>>> 8024865) describing this issue. >>>> >>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as >>>> it is normal at the VM shutdown. 
>>>> The original jdwp backend implementation had a similar approach >>>> for >>>> the raw monitor functions. >>>> Threy use the ignore_vm_death() to workaround the >>>> JVMTI_ERROR_WRONG_PHASE errors. >>>> For reference, please, see the file: src/share/back/util.c >>>> >>>> >>>> Testing: >>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests >>>> >>>> >>>> Thanks, >>>> Serguei >>>> > From daniel.daugherty at oracle.com Wed Oct 29 15:10:54 2014 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 29 Oct 2014 09:10:54 -0600 Subject: RFR 8059533: (process) Make exiting process wait for exiting threads [win] In-Reply-To: <5450C7DD.9030600@oracle.com> References: <542BC45C.8080408@oracle.com> <544ADB87.7080909@oracle.com> <544BEAA8.5010600@oracle.com> <544D0CC9.3040903@oracle.com> <544D1014.10600@oracle.com> <544D858B.5050308@oracle.com> <544E03DE.4000308@oracle.com> <544F0825.7040905@oracle.com> <5450B249.2020603@oracle.com> <5450BE36.5000405@oracle.com> <5450C7DD.9030600@oracle.com> Message-ID: <5451037E.70807@oracle.com> On 10/29/14 4:56 AM, Ivan Gerasimov wrote: > > On 29.10.2014 13:15, David Holmes wrote: >> >> Great improvement thanks! One further suggestion - after line 3731 >> add a short overview eg: >> >> // Basic approach: >> // - Each exiting thread registers its intent to exit and then does so. >> // - A thread trying to terminate the process must wait for all >> // threads currently exiting to complete their exit >> >> >> Typo: >> 3802 // _endthreadex() comleted. >> > Thanks! Fixed the typo and added the comment. > The updated webrev is at the same location: > http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/ src/os/windows/vm/os_windows.cpp No comments. Thumbs up! Thanks for adding more comments. Dan > >> >>>> >>> >>> I have a couple of thought about this: >>> >>> First, the thread that ends the process waits for all the exiting >>> threads to complete. 
By the time it returns from >>> WaitForMultipleObjects, >>> those exiting threads aren't running anymore, so the race is avoided >>> (unless it's the race with scheduler). >> >> That's not quite true and the key part of my point. At some point >> during the thread termination process the exiting thread has to >> signal any waiting thread - and it is still running at the point. We >> don't know whether the action that causes interference with the >> process termination happens before or after the signalling is done. >> > I was thinking if the dedicated scheduler thread can do the > signalling. Otherwise, some poorly behaving thread could die without > updating its status, leaving other threads waiting for its completion > forever. Though, I don't know how it's actually done. > > I've read in several places (for example, a comment at the bottom of > [1]), that waiting for a thread to complete with WaitForXXX() function > ensures the exit code of that thread is set. In particular, > GetExitCodeThread() should not return STILL_ACTIVE for that thread > anymore. > > This lets me hope that the code, which sets the exit code (which is > potentially racy) has already been executed by the time WaitForXXX() > returns. > > [1] > http://msdn.microsoft.com/en-us/library/windows/desktop/ms683190(v=vs.85).aspx > >> Anyway this narrows the window even further even if it may not close >> it completely. >> > > That's my hope too :) > Otherwise, I'll be (almost) out of ideas how to work this issue around. 
>
> Sincerely yours,
> Ivan
>

From mandy.chung at oracle.com Wed Oct 29 19:34:19 2014
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 29 Oct 2014 12:34:19 -0700
Subject: RFR 8061616: HotspotDiagnosticMXBean.getVMOption() throws IllegalArgumentException for flags of type double
In-Reply-To: <5450F08C.5050804@oracle.com>
References: <5450F08C.5050804@oracle.com>
Message-ID: <5451413B.2080807@oracle.com>

On 10/29/2014 6:50 AM, Jaroslav Bachorik wrote:
> Please review the following change
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8061616
> Webrev :
> (jdk): http://cr.openjdk.java.net/~jbachorik/8061616/jdk/webrev.00/
> (hotspot):
> http://cr.openjdk.java.net/~jbachorik/8061616/hotspot/webrev.00/
>

share/classes/sun/management/HotSpotDiagnostic.java
   line 105-110, 117-122: you can replace with
      new IllegalArgumentException(String, Throwable cause)
   The original code was written before that IllegalArgumentException
   constructor was added.

test/com/sun/management/HotSpotDiagnosticMXBean/GetDoubleVMOption.java
   you can take out line 28

Looks okay otherwise.

Mandy

From serguei.spitsyn at oracle.com Thu Oct 30 01:05:37 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 29 Oct 2014 18:05:37 -0700
Subject: 2-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <54503EC2.6070601@oracle.com>
References: <54503EC2.6070601@oracle.com>
Message-ID: <54518EE1.9040208@oracle.com>

The updated webrev:
  http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/

The changes are:
  - added a comment recommended by Staffan
  - removed the ignore_wrong_phase() call from function classSignature()

The classSignature() function is called in 16 places.
Most of them do not tolerate NULL in place of the returned signature and will crash.
I'm not comfortable fixing all the occurrences now, and suggest returning to this
issue after gaining experience with more failure cases, which are still expected.
The failure with the classSignature() involved was observed only once in the nightly and should be extremely rare reproducible. I'll file a placeholder bug if necessary. Thanks, Serguei On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote: > Please, review the fix for: > https://bugs.openjdk.java.net/browse/JDK-6988950 > > > Open webrev: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ > > > > Summary: > > The failing scenario: > The debugger and the debuggee are well aware a VM shutdown has > been started in the target process. > The debugger at this point is not expected to send any commands > to the JDWP agent. > However, the JDI layer (debugger side) and the jdwp agent > (debuggee side) > are not in sync with the consumer layers. > > One reason is because the test debugger does not invoke the JDI > method VirtualMachine.dispose(). > Another reason is that the Debugger and the debuggee processes > are uneasy to sync in general. > > As a result the following steps are possible: > - The test debugger sends a 'quit' command to the test debuggee > - The debuggee is normally exiting > - The jdwp backend reports (over the jdwp protocol) an > anonymous class unload event > - The JDI InternalEventHandler thread handles the > ClassUnloadEvent event > - The InternalEventHandler wants to uncache the matching > reference type. > If there is more than one class with the same host class > signature, it can't distinguish them, > and so, deletes all references and re-retrieves them again > (see tracing below): > MY_TRACE: JDI: > VirtualMachineImpl.retrieveClassesBySignature: > sig=Ljava/lang/invoke/LambdaForm$DMH; > - The jdwp backend debugLoop_run() gets the command from JDI > and calls the functions > classesForSignature() and classStatus() recursively. 
>        - The classStatus() makes a call to the JVMTI GetClassStatus()
>          and gets the JVMTI_ERROR_WRONG_PHASE
>        - As a result the jdwp backend reports the JVMTI error to the
>          JDI, and so, the test fails
>
>      For details, see the analysis in bug report closed as a dup of
>      the bug 6988950:
>      https://bugs.openjdk.java.net/browse/JDK-8024865
>
>      Some similar cases can be found in the two bug reports (6988950
>      and 8024865) describing this issue.
>
>      The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error as
>      it is normal at the VM shutdown.
>      The original jdwp backend implementation had a similar approach
>      for the raw monitor functions.
>      They use the ignore_vm_death() to work around the
>      JVMTI_ERROR_WRONG_PHASE errors.
>      For reference, please, see the file: src/share/back/util.c
>
>
> Testing:
>    Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>
>
> Thanks,
> Serguei
>

From dmitry.samersoff at oracle.com  Thu Oct 30 11:16:50 2014
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Thu, 30 Oct 2014 14:16:50 +0300
Subject: 2-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <54518EE1.9040208@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com>
Message-ID: <54521E22.3000504@oracle.com>

Serguei,

Looks good to me!

-Dmitry

On 2014-10-30 04:05, serguei.spitsyn at oracle.com wrote:
> The updated webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
>
>
> The changes are:
>   - added a comment recommended by Staffan
>   - removed the ignore_wrong_phase() call from function classSignature()
>
> The classSignature() function is called in 16 places.
> Most of them do not tolerate the NULL in place of returned signature and
> will crash.
> I'm not comfortable to fix all the occurrences now and suggest to return
> to this issue after gaining experience with more failure cases that are
> still expected.
> The failure with the classSignature() involved was observed only once in
> the nightly and should be extremely rare reproducible.
> I'll file a placeholder bug if necessary.
>
> Thanks,
> Serguei

-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

From serguei.spitsyn at oracle.com  Thu Oct 30 19:07:53 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 30 Oct 2014 12:07:53 -0700
Subject: 2-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <54521E22.3000504@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <54521E22.3000504@oracle.com>
Message-ID: <54528C89.3060104@oracle.com>

Thanks, Dmitry!
Serguei

On 10/30/14 4:16 AM, Dmitry Samersoff wrote:
> Serguei,
>
> Looks good to me!
>
> -Dmitry

From serguei.spitsyn at oracle.com  Thu Oct 30 20:35:27 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 30 Oct 2014 13:35:27 -0700
Subject: 2-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <545294C0.4010204@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com> <54521E22.3000504@oracle.com> <54528C33.7000004@oracle.com> <545294C0.4010204@oracle.com>
Message-ID: <5452A10F.2080104@oracle.com>

As we started this discussion I've added the open mailing lists back. :)

On 10/30/14 12:42 PM, Daniel D. Daugherty wrote:
> On 10/30/14 1:06 PM, serguei.spitsyn at oracle.com wrote:
>> Staffan and Dan,
>>
>> Do you have anything to say?
>
> It feels like we're suppressing a symptom rather than dealing with
> the underlying cause. However, I haven't been close to this bug
> for years so my memories are rusty here.

I agree.
It feels that not all aspects of the shutdown sequence were designed
equally well in the JDI + jdwp agent.
There can be shutdown races between the debugger + JDI and the
debuggee + jdwp agent.
I do not see patterns in the code to recognize the shutdown and bail out
gracefully.

>
> One question:
>
> line 1051: if (debugInit_isInitComplete() && error == JVMTI_ERROR_WRONG_PHASE) {
>     The debugInit_isInitComplete() check means that we only do
>     this suppression in the live phase or later, right? Perhaps
>     we should do this only when we are post live phase...

Agreed. That is exactly the case.
In the normal case, debugInit_isInitComplete() returns true after the
VM_INIT event has been received.
An agent flag can force the agent initialization to be postponed until an
Exception event is received.
In all cases, the initialization happens in the live phase or later
(I'm not sure it can happen in the dead phase).
Encountering the JVMTI WRONG_PHASE error means the VM has entered the
VM_DEAD phase.
This must be a signal to start an agent shutdown.
At this point, I'm not ready to redesign this in the agent.
This fix is only a workaround for nightly stabilization.

>
> Maybe a flag set at the beginning of the VMDeath event handler
> would be better.

There is already a global flag: gdata->vmDead
I've already tried to use it, but it did not work for me.
Let me check it more.

>
>
>> Is it Ok to push this?
>
> I'm OK with it, but I'm just one voice...

Thanks!
You raised good points.

>
>
>> Dan, should I count on you as a reviewer?
>
> Yes, I've reviewed the changes at this point.

Ok.

>
>
>> I will also need to backport this to 8u40.
>
> You might want to let this bake for a couple of weeks first...

Sure.

Thanks,
Serguei

>
> Dan
>
>
>>
>> Thanks!
>> Serguei

From david.holmes at oracle.com  Thu Oct 30 22:56:37 2014
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 31 Oct 2014 08:56:37 +1000
Subject: RFR 8062116: JVMTI GetClassMethods is Slow
In-Reply-To: 
References: 
Message-ID: <5452C225.8080001@oracle.com>

Adding serviceability as they own JVMTI.

David

On 31/10/2014 3:02 AM, Jeremy Manson wrote:
> There's a significant regression in the speed of JVMTI GetClassMethods in
> JDK8. I've tracked this down to allocation of jmethodids in a tight loop.
> The issue can be addressed by preallocating enough space for all of the
> jmethodids when starting the operation and not iterating over all of the
> existing jmethodids when you allocate a new one.
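
The preallocation idea in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: `id_table`, `add_scanning`, `add_reserved` and `fill` are hypothetical names and are not HotSpot's JNIMethodBlock code, and the `probes` counter exists only to make the cost visible. The scanning variant walks every existing entry on each insert; the reserved variant grows the table once and then appends at the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ID table; not HotSpot's JNIMethodBlock. */
typedef struct {
    void  **slots;   /* slot i holds an opaque method handle, or NULL if free */
    size_t  len;     /* number of slots in use */
    size_t  cap;     /* number of slots allocated */
    size_t  probes;  /* slots inspected so far, for illustration only */
} id_table;

/* Grow the backing array once so that at least `extra` more IDs fit. */
static int id_table_reserve(id_table *t, size_t extra) {
    size_t need = t->len + extra;
    if (need <= t->cap) return 0;
    void **p = realloc(t->slots, need * sizeof *p);
    if (p == NULL) return -1;
    t->slots = p;
    t->cap = need;
    return 0;
}

/* Slow path: inspect every existing slot for a free one, then append. */
static int add_scanning(id_table *t, void *method) {
    for (size_t i = 0; i < t->len; i++) {
        t->probes++;
        if (t->slots[i] == NULL) { t->slots[i] = method; return 0; }
    }
    if (id_table_reserve(t, 1) != 0) return -1;
    t->slots[t->len++] = method;
    return 0;
}

/* Fast path: the caller reserved space up front, so append at the tail. */
static int add_reserved(id_table *t, void *method) {
    if (t->len == t->cap) return -1;   /* caller did not reserve enough */
    t->slots[t->len++] = method;
    return 0;
}

/* Fill a fresh table with n dummy entries and return the probe count. */
static size_t fill(size_t n, int preallocate) {
    static int dummy;                  /* stands in for a method */
    id_table t = {0};
    int rc = preallocate ? id_table_reserve(&t, n) : 0;
    assert(rc == 0);
    for (size_t i = 0; i < n; i++) {
        rc = preallocate ? add_reserved(&t, &dummy) : add_scanning(&t, &dummy);
        assert(rc == 0);
    }
    size_t probes = t.probes;
    free(t.slots);
    return probes;
}
```

In this toy model, `fill(1000, 0)` inspects 499500 slots in total while `fill(1000, 1)` inspects none, which is the quadratic versus linear difference the report describes. Whether the real patch is structured this way is not something this sketch claims.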
>
> A patch is here:
>
> http://cr.openjdk.java.net/~jmanson/8062116/webrev.00/
>
> A reproducible test case can be found here:
>
> http://cr.openjdk.java.net/~jmanson/8062116/repro/
>
> It's a benchmark, though: I have no idea how to turn it into a test.
>
> For whoever reviews it: can you explain to me why it is okay that this code
> reuses jmethodIDs (in JNIMethodBlock::add_method)? I can imagine a lot of
> problems stemming from accidental reuse.
>
> Jeremy
>

From ivan.gerasimov at oracle.com  Fri Oct 31 14:39:11 2014
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Fri, 31 Oct 2014 17:39:11 +0300
Subject: RFR 8062647: Wrong indentation of arguments of annotated methods
Message-ID: <54539F0F.6030704@oracle.com>

Hello everybody!

I noticed that the javadoc tool may produce the doc with misaligned
arguments of annotated methods in the 'method details' section.
For example:
http://jre.us.oracle.com/java/re/jdk/9/nightly/latest/docs/api/java/util/EnumSet.html#of-E-E...-

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8062647
WEBREV: http://cr.openjdk.java.net/~igerasim/8062647/0/webrev/

Sincerely yours,
Ivan

From david.buck at oracle.com  Fri Oct 31 15:58:03 2014
From: david.buck at oracle.com (david buck)
Date: Sat, 01 Nov 2014 00:58:03 +0900
Subject: RFR 8060169: Update the Crash Reporting URL in the Java crash log
Message-ID: <5453B18B.4020203@oracle.com>

Hi!

Please approve this very simple update to the URL written in the crash
log (hs_err_pid.log) file.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8060169
WEBREV: http://cr.openjdk.java.net/~dbuck/8060169/webrev.000/

Cheers,
-Buck

From daniel.daugherty at oracle.com  Fri Oct 31 16:04:03 2014
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 31 Oct 2014 10:04:03 -0600
Subject: RFR 8060169: Update the Crash Reporting URL in the Java crash log
In-Reply-To: <5453B18B.4020203@oracle.com>
References: <5453B18B.4020203@oracle.com>
Message-ID: <5453B2F3.5000503@oracle.com>

On 10/31/14 9:58 AM, david buck wrote:
> Hi!
>
> Please approve this very simple update to the URL written in the crash
> log (hs_err_pid.log) file.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8060169
> WEBREV: http://cr.openjdk.java.net/~dbuck/8060169/webrev.000/

src/share/vm/runtime/arguments.cpp
    No comments.

Thumbs up.

Dan

>
> Cheers,
> -Buck
>
>

From ron.durbin at oracle.com  Fri Oct 31 16:15:46 2014
From: ron.durbin at oracle.com (Ron Durbin)
Date: Fri, 31 Oct 2014 09:15:46 -0700 (PDT)
Subject: RFR 8060169: Update the Crash Reporting URL in the Java crash log
In-Reply-To: <5453B2F3.5000503@oracle.com>
References: <5453B18B.4020203@oracle.com> <5453B2F3.5000503@oracle.com>
Message-ID: <55ada4cf-8534-44bb-8e73-38743cfcbd2e@default>

I agree with the change and give it my thumbs up.

> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: Friday, October 31, 2014 10:04 AM
> To: david buck
> Cc: serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR 8060169: Update the Crash Reporting URL in the Java crash log
>
> On 10/31/14 9:58 AM, david buck wrote:
> > Hi!
> >
> > Please approve this very simple update to the URL written in the crash
> > log (hs_err_pid.log) file.
> >
> > BUGURL: https://bugs.openjdk.java.net/browse/JDK-8060169
> > WEBREV: http://cr.openjdk.java.net/~dbuck/8060169/webrev.000/
>
> src/share/vm/runtime/arguments.cpp
>      No comments.
>
> Thumbs up.
>
> Dan
>
>
> >
> > Cheers,
> > -Buck
> >
> >
>

From shanliang.jiang at oracle.com  Fri Oct 31 16:44:12 2014
From: shanliang.jiang at oracle.com (shanliang)
Date: Fri, 31 Oct 2014 17:44:12 +0100
Subject: Code review: 8046192 Eliminate SNMP dependencies to the internal APIs from open jdk modules
Message-ID: <5453BC5C.6000804@oracle.com>

Hi,

The fix is to remove unnecessary exports for the jdk.snmp module.

bug: https://bugs.openjdk.java.net/browse/JDK-8046192
webrev: http://cr.openjdk.java.net/~sjiang/JDK-8046192/00/

Thanks,
Shanliang

From poonam.bajaj at oracle.com  Fri Oct 31 16:58:42 2014
From: poonam.bajaj at oracle.com (Poonam Bajaj)
Date: Fri, 31 Oct 2014 09:58:42 -0700
Subject: RFR 8060169: Update the Crash Reporting URL in the Java crash log
In-Reply-To: <5453B18B.4020203@oracle.com>
References: <5453B18B.4020203@oracle.com>
Message-ID: <5453BFC2.4040005@oracle.com>

Hi David,

On 10/31/2014 8:58 AM, david buck wrote:
> Hi!
>
> Please approve this very simple update to the URL written in the crash
> log (hs_err_pid.log) file.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8060169
> WEBREV: http://cr.openjdk.java.net/~dbuck/8060169/webrev.000/
>

The changes look good.

Thanks,
Poonam

> Cheers,
> -Buck

From serguei.spitsyn at oracle.com  Fri Oct 31 21:07:00 2014
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 31 Oct 2014 14:07:00 -0700
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
In-Reply-To: <54518EE1.9040208@oracle.com>
References: <54503EC2.6070601@oracle.com> <54518EE1.9040208@oracle.com>
Message-ID: <5453F9F4.20309@oracle.com>

This is the 3rd round of review for:
https://bugs.openjdk.java.net/browse/JDK-6988950

New webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/

Summary

For the failing scenario, please refer to the 1st-round RFR below.

I've found what is missing in the jdwp agent shutdown and decided to
switch from a workaround to a real fix.
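
A minimal sketch of the kind of synchronization such a fix can rely on, assuming the race is between a command that is still executing and the VM death callback. Everything here (`vm_death_lock`, `handle_command`, `on_vm_death`, the result codes) is hypothetical and is not taken from the webrev; a pthread mutex stands in for whatever monitor the agent actually uses. The point is that the dead flag is checked and the command body runs under the same lock that the death callback takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical result codes; not the JDWP or JVMTI constants. */
typedef enum { CMD_OK, CMD_VM_DEAD } cmd_result;

/* Hypothetical agent state. The lock covers the flag and the whole
 * command body, not just the flag check. */
static pthread_mutex_t vm_death_lock = PTHREAD_MUTEX_INITIALIZER;
static bool vm_dead = false;
static int commands_run = 0;

/* Stand-in for work that calls into the VM and is valid only while alive. */
static void do_vm_work(void) {
    commands_run++;
}

/* Command path: check the flag and do the work under one lock, so the
 * death callback cannot slip in between the check and the VM calls. */
cmd_result handle_command(void) {
    cmd_result rc = CMD_VM_DEAD;
    pthread_mutex_lock(&vm_death_lock);
    if (!vm_dead) {
        do_vm_work();
        rc = CMD_OK;
    }
    pthread_mutex_unlock(&vm_death_lock);
    return rc;
}

/* Death callback: blocks until any in-flight command has finished,
 * then marks the VM dead so that later commands are refused. */
void on_vm_death(void) {
    pthread_mutex_lock(&vm_death_lock);
    vm_dead = true;
    pthread_mutex_unlock(&vm_death_lock);
}

/* Single-threaded walk through the two states. */
void demo(void) {
    assert(handle_command() == CMD_OK);       /* VM alive: command runs */
    on_vm_death();
    assert(handle_command() == CMD_VM_DEAD);  /* VM dead: command refused */
}
```

A flag check without the lock would still let the callback run after the check and before the VM calls; holding the lock across both is what closes that window in this sketch.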
The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
The agent debugLoop_run() has a guard against the VM shutdown:

        } else if (gdata->vmDead &&
                   ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
            /* Protect the VM from calls while dead.
             * VirtualMachine cmdSet quietly ignores some cmds
             * after VM death, so, it sends it's own errors.
             */
            outStream_setError(&out, JDWP_ERROR(VM_DEAD));

However, the guard above does not help much if the VM_DEATH event happens
in the middle of a command execution.
There is a lack of synchronization here.

The fix introduces a new lock (vmDeathLock) that prevents the commands and
the VM_DEATH event callback from executing concurrently.
It should work well for any function that is used in the implementation of
the JDWP_COMMAND_SET(VirtualMachine).

Testing:
   Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests

Thanks,
Serguei

On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote:
> The updated webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
>
>
> The changes are:
>   - added a comment recommended by Staffan
>   - removed the ignore_wrong_phase() call from function classSignature()
>
> The classSignature() function is called in 16 places.
> Most of them do not tolerate the NULL in place of returned signature
> and will crash.
> I'm not comfortable to fix all the occurrences now and suggest to
> return to this
> issue after gaining experience with more failure cases that are still
> expected.
> The failure with the classSignature() involved was observed only once
> in the nightly
> and should be extremely rare reproducible.
> I'll file a placeholder bug if necessary.
>
> Thanks,
> Serguei
>
> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for:
>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/
>>
>>
>>
>> Summary:
>>
>>    The failing scenario:
>>      The debugger and the debuggee are well aware a VM shutdown has
>>      been started in the target process.
>>      The debugger at this point is not expected to send any commands
>>      to the JDWP agent.
>>      However, the JDI layer (debugger side) and the jdwp agent
>>      (debuggee side)
>>      are not in sync with the consumer layers.
>>
>>      One reason is because the test debugger does not invoke the JDI
>>      method VirtualMachine.dispose().
>>      Another reason is that the Debugger and the debuggee processes
>>      are uneasy to sync in general.
>>
>>      As a result the following steps are possible:
>>        - The test debugger sends a 'quit' command to the test debuggee
>>        - The debuggee is normally exiting
>>        - The jdwp backend reports (over the jdwp protocol) an
>>          anonymous class unload event
>>        - The JDI InternalEventHandler thread handles the
>>          ClassUnloadEvent event
>>        - The InternalEventHandler wants to uncache the matching
>>          reference type.
>>          If there is more than one class with the same host class
>>          signature, it can't distinguish them,
>>          and so, deletes all references and re-retrieves them again
>>          (see tracing below):
>>            MY_TRACE: JDI:
>>            VirtualMachineImpl.retrieveClassesBySignature:
>>            sig=Ljava/lang/invoke/LambdaForm$DMH;
>>        - The jdwp backend debugLoop_run() gets the command from JDI
>>          and calls the functions
>>          classesForSignature() and classStatus() recursively.
>>        - The classStatus() makes a call to the JVMTI GetClassStatus()
>>          and gets the JVMTI_ERROR_WRONG_PHASE
>>        - As a result the jdwp backend reports the JVMTI error to the
>>          JDI, and so, the test fails
>>
>>      For details, see the analysis in bug report closed as a dup of
>>      the bug 6988950:
>>      https://bugs.openjdk.java.net/browse/JDK-8024865
>>
>>      Some similar cases can be found in the two bug reports (6988950
>>      and 8024865) describing this issue.
>>
>>      The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error
>>      as it is normal at the VM shutdown.
>>      The original jdwp backend implementation had a similar approach
>>      for the raw monitor functions.
>>      They use the ignore_vm_death() to work around the
>>      JVMTI_ERROR_WRONG_PHASE errors.
>>      For reference, please, see the file: src/share/back/util.c
>>
>>
>> Testing:
>>    Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>>
>>
>> Thanks,
>> Serguei
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: