From suenaga at oss.nttdata.com Fri Nov 1 08:56:28 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 1 Nov 2019 17:56:28 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
Message-ID:

(Changed subject to review request)

Hi,

I converted LinuxDebuggerLocal.c to C++ code, and it works fine on submit repo.
(mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)

  http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/

Could you review it?


Thanks,

Yasumasa


On 2019/11/01 8:54, Yasumasa Suenaga wrote:
> Hi David,
>
> On 2019/11/01 7:55, David Holmes wrote:
>> Hi Yasumasa,
>>
>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms.
>
> Ok, thanks for telling it.
>
>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc ?
>>
>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency.
>
> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at decoder_linux.cpp .
> It is similar with my original proposal.
>
>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>
> I agree with David to use C++ demangle way.
> However we need to choice the fix from following:
>
>   A. Convert LinuxDebuggerLocal.c to C++ code
>   B. Add C++ code for libsaproc.so to demangle symbols.
>
> I've discussed with Chris about it in [1].
> Option A might be large change.
>
>
> Thanks,
>
> Yasumasa
>
>
> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html
>
>
>> David
>>
>> On 1/11/2019 12:58 am, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> Here's the failure during configure:
>>>
>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no
>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no
>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no
>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'.
>>>
>>> Chris
>>>
>>>
>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote:
>>>> Hi,
>>>>
>>>> I filed this enhancement to JBS:
>>>>
>>>>   https://bugs.openjdk.java.net/browse/JDK-8233285
>>>>
>>>> Also I pushed the changes to submit repo, but it was failed.
>>>>
>>>>   http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26
>>>>   http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25
>>>>
>>>> According to the email from Mach 5, dependency errors were occurred in jib.
>>>> Can someone share the details?
>>>> I'm not familiar in jib, so I want help.
>>>>
>>>>   mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/10/31 11:23, Chris Plummer wrote:
>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thanks for quick reply!
>>>>>>
>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style.
>>>>>> For example:
>>>>>>
>>>>>>   (*env)->FindClass(env, "java/lang/String")
>>>>>>       to
>>>>>>   env->FindClass("java/lang/String")
>>>>>>
>>>>>> Can it be accepted?
>>>>>>
>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok?
>>>>>>
>>>>>>
>>>>>> I prefer to use cplus_demangle() if we can change configure script.
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/10/31 11:03, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle().
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below:
>>>>>>>>
>>>>>>>>
>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac
>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d
>>>>>>>>
>>>>>>>>
>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols.
>>>>>>>> I think we can demangle in jstack with this patch. If it is accepted, I will file it to JBS and send review request.
>>>>>>>> What do you think?
>>>>>>>>
>>>>>>>>   http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>>>>>
>>>>>>>> We can get the stack as below after applying this patch:
>>>>>>>>
>>>>>>>>
>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac
>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d
>>>>>>>>
>>>>>>>>
>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA.
>>>>>>>> If it is not comfortable, we can use cplus_demangle().
>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>

From chris.plummer at oracle.com Fri Nov 1 22:07:20 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 1 Nov 2019 15:07:20 -0700
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
Message-ID: <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com>

Hi Yasumasa,

I can't comment on the build changes. I don't know how the build works well enough.

LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a new file.

The rest of the changes look fine.

thanks,

Chris

On 11/1/19 1:56 AM, Yasumasa Suenaga wrote:
> (Changed subject to review request)
>
> Hi,
>
> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on
> submit repo.
> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>
> Could you review it?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> On 2019/11/01 7:55, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> New build dependencies cannot be added lightly. This impacts
>>> everyone who maintains build/test farms.
>>
>> Ok, thanks for telling it.
>>
>>> We already use the C++ demangling capabilities in the VM. Is there
>>> some way to export that for use by libsaproc ?
>>>
>>> Otherwise using C++ demangle may still be the better choice given we
>>> already have it as a dependency.
>>
>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at
>> decoder_linux.cpp .
>> It is similar with my original proposal.
>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>
>> I agree with David to use C++ demangle way.
>> However we need to choice the fix from following:
>>
>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>    B. Add C++ code for libsaproc.so to demangle symbols.
>>
>> I've discussed with Chris about it in [1].
>> Option A might be large change.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html
>>
>>
>>> David
>>>
>>> On 1/11/2019 12:58 am, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Here's the failure during configure:
>>>>
>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no
>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no
>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no
>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find
>>>> demangle.h! You might be able to fix this by running 'sudo yum
>>>> install binutils-devel'.
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>> I filed this enhancement to JBS:
>>>>>
>>>>>   https://bugs.openjdk.java.net/browse/JDK-8233285
>>>>>
>>>>> Also I pushed the changes to submit repo, but it was failed.
>>>>>
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25
>>>>>
>>>>> According to the email from Mach 5, dependency errors were
>>>>> occurred in jib.
>>>>> Can someone share the details?
>>>>> I'm not familiar in jib, so I want help.
>>>>>
>>>>>   mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/10/31 11:23, Chris Plummer wrote:
>>>>>> You can change the configure script. I don't know if there's any
>>>>>> concerns with using libiberty.a. That's possibly a legal question
>>>>>> (GNU GPL). You might want to ask that on jdk-dev and/or build-dev.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Thanks for quick reply!
>>>>>>>
>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to
>>>>>>> convert a lot of JNI calls to C++ style.
>>>>>>> For example:
>>>>>>>
>>>>>>>   (*env)->FindClass(env, "java/lang/String")
>>>>>>>       to
>>>>>>>   env->FindClass("java/lang/String")
>>>>>>>
>>>>>>> Can it be accepted?
>>>>>>>
>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to
>>>>>>> link libiberty.a which is provided by binutils. Thus I think we
>>>>>>> need to check libiberty.a in configure script. Is it ok?
>>>>>>>
>>>>>>>
>>>>>>> I prefer to use cplus_demangle() if we can change configure script.
>>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I don't have concerns with adding C++ source to SA, but in
>>>>>>>> order to do so you've put the new native code in its own file
>>>>>>>> rather than in LinuxDebuggerLocal.c. I'd like to see that
>>>>>>>> resolved. So either convert LinuxDebuggerLocal.c to C++, or use
>>>>>>>> cplus_demangle().
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were
>>>>>>>>> mangled as below:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 0x00007ff255a8fa4c
>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread
>>>>>>>>> + 0x6ac
>>>>>>>>> 0x00007ff255a8cc1d
>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread
>>>>>>>>> + 0x33d
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We can demangle them via c++filt, but I think it is more
>>>>>>>>> convenience if jstack can show demangling symbols.
>>>>>>>>> I think we can demangle in jstack with this patch. If it is
>>>>>>>>> accepted, I will file it to JBS and send review request.
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>>>>>>
>>>>>>>>> We can get the stack as below after applying this patch:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac
>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*,
>>>>>>>>> Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds
>>>>>>>>> C++ source to SA.
>>>>>>>>> If it is not comfortable, we can use cplus_demangle().
>>>>>>>>> But this function is provided by libiberty.a, so we need to
>>>>>>>>> link it to libsaproc and need to check libiberty.a in
>>>>>>>>> configure script.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>

From alexey.menkov at oracle.com Fri Nov 1 23:54:10 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 1 Nov 2019 16:54:10 -0700
Subject: RFR: JDK-8231915: two JDI tests interfere with each other
Message-ID: <2bfe41ec-a0e5-73fc-7845-c7ca71dcdd29@oracle.com>

Hi all,

please review a small fix for
https://bugs.openjdk.java.net/browse/JDK-8231915
webrev:
http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_test_interference/webrev/

The fix disables "negative" testing for JdwpListenTest.
Negative testing is useful during development, but can cause interference with JdwpAttachTest (JdwpAttachTest already has negative testing disabled)

--alex

From serguei.spitsyn at oracle.com Sat Nov 2 03:59:03 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 1 Nov 2019 20:59:03 -0700
Subject: RFR: JDK-8231915: two JDI tests interfere with each other
In-Reply-To: <2bfe41ec-a0e5-73fc-7845-c7ca71dcdd29@oracle.com>
References: <2bfe41ec-a0e5-73fc-7845-c7ca71dcdd29@oracle.com>
Message-ID: <252e1fcb-3119-d38c-8ed6-1861228b0421@oracle.com>

Hi Alex,

It looks good.

Thanks,
Serguei


On 11/1/19 16:54, Alex Menkov wrote:
> Hi all,
>
> please review a small fix for
> https://bugs.openjdk.java.net/browse/JDK-8231915
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_test_interference/webrev/
>
> The fix disables "negative" testing for JdwpListenTest.
> Negative testing is useful during development, but can cause
> interference with JdwpAttachTest (JdwpAttachTest already has negative
> testing disabled)
>
> --alex

From serguei.spitsyn at oracle.com Sat Nov 2 04:26:15 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 1 Nov 2019 21:26:15 -0700
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
Message-ID: <74626bae-d29f-30d1-e2a5-48243a587f41@oracle.com>

Hi Yasumasa,

This change looks good.
Even though it seems to be not impacting platforms other than Linux
it is better to make sure it is built Okay on all platforms.

Thanks,
Serguei


On 11/1/19 01:56, Yasumasa Suenaga wrote:
> (Changed subject to review request)
>
> Hi,
>
> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on
> submit repo.
> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>
> Could you review it?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> On 2019/11/01 7:55, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> New build dependencies cannot be added lightly. This impacts
>>> everyone who maintains build/test farms.
>>
>> Ok, thanks for telling it.
>>
>>> We already use the C++ demangling capabilities in the VM. Is there
>>> some way to export that for use by libsaproc ?
>>>
>>> Otherwise using C++ demangle may still be the better choice given we
>>> already have it as a dependency.
>>
>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at
>> decoder_linux.cpp .
>> It is similar with my original proposal.
>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>
>> I agree with David to use C++ demangle way.
>> However we need to choice the fix from following:
>>
>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>    B. Add C++ code for libsaproc.so to demangle symbols.
>>
>> I've discussed with Chris about it in [1].
>> Option A might be large change.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html
>>
>>
>>> David
>>>
>>> On 1/11/2019 12:58 am, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Here's the failure during configure:
>>>>
>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no
>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no
>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no
>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find
>>>> demangle.h! You might be able to fix this by running 'sudo yum
>>>> install binutils-devel'.
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>> I filed this enhancement to JBS:
>>>>>
>>>>>   https://bugs.openjdk.java.net/browse/JDK-8233285
>>>>>
>>>>> Also I pushed the changes to submit repo, but it was failed.
>>>>>
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26
>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25
>>>>>
>>>>> According to the email from Mach 5, dependency errors were
>>>>> occurred in jib.
>>>>> Can someone share the details?
>>>>> I'm not familiar in jib, so I want help.
>>>>>
>>>>>   mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/10/31 11:23, Chris Plummer wrote:
>>>>>> You can change the configure script. I don't know if there's any
>>>>>> concerns with using libiberty.a. That's possibly a legal question
>>>>>> (GNU GPL). You might want to ask that on jdk-dev and/or build-dev.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Thanks for quick reply!
>>>>>>>
>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to
>>>>>>> convert a lot of JNI calls to C++ style.
>>>>>>> For example:
>>>>>>>
>>>>>>>   (*env)->FindClass(env, "java/lang/String")
>>>>>>>       to
>>>>>>>   env->FindClass("java/lang/String")
>>>>>>>
>>>>>>> Can it be accepted?
>>>>>>>
>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to
>>>>>>> link libiberty.a which is provided by binutils. Thus I think we
>>>>>>> need to check libiberty.a in configure script. Is it ok?
>>>>>>>
>>>>>>>
>>>>>>> I prefer to use cplus_demangle() if we can change configure script.
>>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I don't have concerns with adding C++ source to SA, but in
>>>>>>>> order to do so you've put the new native code in its own file
>>>>>>>> rather than in LinuxDebuggerLocal.c. I'd like to see that
>>>>>>>> resolved. So either convert LinuxDebuggerLocal.c to C++, or use
>>>>>>>> cplus_demangle().
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were
>>>>>>>>> mangled as below:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 0x00007ff255a8fa4c
>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread
>>>>>>>>> + 0x6ac
>>>>>>>>> 0x00007ff255a8cc1d
>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread
>>>>>>>>> + 0x33d
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We can demangle them via c++filt, but I think it is more
>>>>>>>>> convenience if jstack can show demangling symbols.
>>>>>>>>> I think we can demangle in jstack with this patch. If it is
>>>>>>>>> accepted, I will file it to JBS and send review request.
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>>>>>>
>>>>>>>>> We can get the stack as below after applying this patch:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac
>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*,
>>>>>>>>> Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds
>>>>>>>>> C++ source to SA.
>>>>>>>>> If it is not comfortable, we can use cplus_demangle().
>>>>>>>>> But this function is provided by libiberty.a, so we need to
>>>>>>>>> link it to libsaproc and need to check libiberty.a in
>>>>>>>>> configure script.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>

From suenaga at oss.nttdata.com Sat Nov 2 12:33:57 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 2 Nov 2019 21:33:57 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com>
Message-ID: <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com>

Hi Chris,

On 2019/11/02 7:07, Chris Plummer wrote:
> Hi Yasumasa,
>
> I can't comment on the build changes. I don't know how the build works well enough.

I verified it on the submit repo (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009).
This change affects Linux only, so I changed the toolchain only when SA is built for Linux.

> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a new file.

I uploaded it:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch

If we could use Git, it would show "rename + diffs"...
I tried to use "hg rename" and "hg move", but the result did not change.

> The rest of the changes look fine.

Thanks!

Yasumasa


> thanks,
>
> Chris
>
> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote:
>> (Changed subject to review request)
>>
>> Hi,
>>
>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on submit repo.
>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>>
>> Could you review it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> On 2019/11/01 7:55, David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms.
>>>
>>> Ok, thanks for telling it.
>>>
>>>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc ?
>>>>
>>>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency.
>>>
>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at decoder_linux.cpp .
>>> It is similar with my original proposal.
>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>
>>> I agree with David to use C++ demangle way.
>>> However we need to choice the fix from following:
>>>
>>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>>    B. Add C++ code for libsaproc.so to demangle symbols.
>>>
>>> I've discussed with Chris about it in [1].
>>> Option A might be large change.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html
>>>
>>>
>>>> David
>>>>
>>>> On 1/11/2019 12:58 am, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Here's the failure during configure:
>>>>>
>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no
>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no
>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no
>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'.
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I filed this enhancement to JBS:
>>>>>>
>>>>>>   https://bugs.openjdk.java.net/browse/JDK-8233285
>>>>>>
>>>>>> Also I pushed the changes to submit repo, but it was failed.
>>>>>>
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25
>>>>>>
>>>>>> According to the email from Mach 5, dependency errors were occurred in jib.
>>>>>> Can someone share the details?
>>>>>> I'm not familiar in jib, so I want help.
>>>>>>
>>>>>>   mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/10/31 11:23, Chris Plummer wrote:
>>>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for quick reply!
>>>>>>>>
>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style.
>>>>>>>> For example:
>>>>>>>>
>>>>>>>>   (*env)->FindClass(env, "java/lang/String")
>>>>>>>>       to
>>>>>>>>   env->FindClass("java/lang/String")
>>>>>>>>
>>>>>>>> Can it be accepted?
>>>>>>>>
>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok?
>>>>>>>>
>>>>>>>>
>>>>>>>> I prefer to use cplus_demangle() if we can change configure script.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle().
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac
>>>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols.
>>>>>>>>>> I think we can demangle in jstack with this patch. If it is accepted, I will file it to JBS and send review request.
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>>>>>>>
>>>>>>>>>> We can get the stack as below after applying this patch:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac
>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA.
>>>>>>>>>> If it is not comfortable, we can use cplus_demangle().
>>>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script.
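
For readers following the thread, the abi::__cxa_demangle() approach described in the quoted proposal can be sketched roughly as below. This is only a minimal standalone illustration, not the code from the webrev: the helper name demangle_symbol is hypothetical, and the sketch assumes a GCC- or Clang-compatible toolchain that provides <cxxabi.h>. The two symbols in the usage note are the ones quoted in the thread.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI; GCC/Clang toolchains)
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not taken from the webrev: returns the demangled form
// of an Itanium C++ ABI symbol, or the input unchanged if demangling fails.
std::string demangle_symbol(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With a null output buffer, __cxa_demangle() allocates the result with
  // malloc(), so the caller is responsible for free()ing it.
  char* demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr) {
    std::free(demangled);  // free(nullptr) is a no-op
    return mangled;
  }
  std::string result(demangled);
  std::free(demangled);
  return result;
}
```

Calling demangle_symbol() on the first mangled frame quoted above (_ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread) should give the JavaCalls::call_helper(...) form shown in the "after" output. Falling back to the original string keeps plain C symbols printable.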
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>
>

From suenaga at oss.nttdata.com Sat Nov 2 12:37:27 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 2 Nov 2019 21:37:27 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <74626bae-d29f-30d1-e2a5-48243a587f41@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <74626bae-d29f-30d1-e2a5-48243a587f41@oracle.com>
Message-ID:

Hi Serguei,

On 2019/11/02 13:26, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
>
> This change looks good.
> Even though it seems to be not impacting platforms other than Linux
> it is better to make sure it is built Okay on all platforms.

Thanks!
At least, this change works fine on Windows, macOS, Solaris, and Linux, which are the platforms in the submit repo.


Yasumasa


> Thanks,
> Serguei
>
>
> On 11/1/19 01:56, Yasumasa Suenaga wrote:
>> (Changed subject to review request)
>>
>> Hi,
>>
>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on submit repo.
>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>>
>> Could you review it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> On 2019/11/01 7:55, David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms.
>>>
>>> Ok, thanks for telling it.
>>>
>>>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc ?
>>>>
>>>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency.
>>>
>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at decoder_linux.cpp .
>>> It is similar with my original proposal.
>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>
>>> I agree with David to use C++ demangle way.
>>> However we need to choice the fix from following:
>>>
>>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>>    B. Add C++ code for libsaproc.so to demangle symbols.
>>>
>>> I've discussed with Chris about it in [1].
>>> Option A might be large change.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html
>>>
>>>
>>>> David
>>>>
>>>> On 1/11/2019 12:58 am, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> Here's the failure during configure:
>>>>>
>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no
>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no
>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no
>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'.
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I filed this enhancement to JBS:
>>>>>>
>>>>>>   https://bugs.openjdk.java.net/browse/JDK-8233285
>>>>>>
>>>>>> Also I pushed the changes to submit repo, but it was failed.
>>>>>>
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26
>>>>>>   http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25
>>>>>>
>>>>>> According to the email from Mach 5, dependency errors were occurred in jib.
>>>>>> Can someone share the details?
>>>>>> I'm not familiar in jib, so I want help.
>>>>>>
>>>>>>   mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/10/31 11:23, Chris Plummer wrote:
>>>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for quick reply!
>>>>>>>>
>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style.
>>>>>>>> For example:
>>>>>>>>
>>>>>>>>   (*env)->FindClass(env, "java/lang/String")
>>>>>>>>       to
>>>>>>>>   env->FindClass("java/lang/String")
>>>>>>>>
>>>>>>>> Can it be accepted?
>>>>>>>>
>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok?
>>>>>>>>
>>>>>>>>
>>>>>>>> I prefer to use cplus_demangle() if we can change configure script.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle().
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac
>>>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols.
>>>>>>>>>> I think we can demangle in jstack with this patch. If it is accepted, I will file it to JBS and send review request.
>>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>> >>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA. >>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> > From daniel.daugherty at oracle.com Sat Nov 2 12:43:10 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Sat, 2 Nov 2019 08:43:10 -0400 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> Message-ID: <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> Since this review contains build changes, I've added build-dev at ... Dan On 11/1/19 4:56 AM, Yasumasa Suenaga wrote: > (Changed subject to review request) > > Hi, > > I converted LinuxDebuggerLocal.c to C++ code, and it works fine on > submit repo. > (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ > > Could you review it? > > > Thanks, > > Yasumasa > > > On 2019/11/01 8:54, Yasumasa Suenaga wrote: >> Hi David, >> >> On 2019/11/01 7:55, David Holmes wrote: >>> Hi Yasumasa, >>> >>> New build dependencies cannot be added lightly. 
This impacts >>> everyone who maintains build/test farms. >> >> Ok, thanks for telling it. >> >>> We already use the C++ demangling capabilities in the VM. Is there >>> some way to export that for use by libsaproc ? >>> >>> Otherwise using C++ demangle may still be the better choice given we >>> already have it as a dependency. >> >> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at >> decoder_linux.cpp . >> It is similar with my original proposal. >> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >> >> I agree with David to use C++ demangle way. >> However we need to choice the fix from following: >> >> ?? A. Convert LinuxDebuggerLocal.c to C++ code >> ?? B. Add C++ code for libsaproc.so to demangle symbols. >> >> I've discussed with Chris about it in [1]. >> Option A might be large change. >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >> >> >>> David >>> >>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>> Hi Yasumasa, >>>> >>>> Here's the failure during configure: >>>> >>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find >>>> demangle.h! You might be able to fix this by running 'sudo yum >>>> install binutils-devel'. >>>> >>>> Chris >>>> >>>> >>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>> Hi, >>>>> >>>>> I filed this enhancement to JBS: >>>>> >>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>> >>>>> Also I pushed the changes to submit repo, but it was failed. >>>>> >>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>> >>>>> According to the email from Mach 5, dependency errors were >>>>> occurred in jib. >>>>> Can someone share the details? 
>>>>> I'm not familiar in jib, so I want help. >>>>> >>>>> ? mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>> You can change the configure script. I don't know if there's any >>>>>> concerns with using libiberty.a. That's possibly a legal question >>>>>> (GNU GPL). You might want to ask that on jdk-dev and/or build-dev. >>>>>> >>>>>> Chris >>>>>> >>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for quick reply! >>>>>>> >>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to >>>>>>> convert a lot of JNI calls to C++ style. >>>>>>> For example: >>>>>>> >>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>> ????? to >>>>>>> ? env->FindClass("java/lang/String") >>>>>>> >>>>>>> Can it be accepted? >>>>>>> >>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to >>>>>>> link libiberty.a which is provided by binutils. Thus I think we >>>>>>> need to check libiberty.a in configure script. Is it ok? >>>>>>> >>>>>>> >>>>>>> I prefer to use cplus_demangle() if we can change configure script. >>>>>>> >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> I don't have concerns with adding C++ source to SA, but in >>>>>>>> order to do so you've put the new native code in its own file >>>>>>>> rather than in LinuxDebuggerLocal.c. I'd like to see that >>>>>>>> resolved. So either convert LinuxDebuggerLocal.c to C++, or use >>>>>>>> cplus_demangle(). 
>>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were >>>>>>>>> mangled as below: >>>>>>>>> >>>>>>>>> >>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>> + 0x6ac >>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>> + 0x33d >>>>>>>>> >>>>>>>>> >>>>>>>>> We can demangle them via c++filt, but I think it is more >>>>>>>>> convenience if jstack can show demangling symbols. >>>>>>>>> I think we can demangle in jstack with this patch. If it is >>>>>>>>> accepted, I will file it to JBS and send review request. >>>>>>>>> What do you think? >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>> >>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>> >>>>>>>>> >>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, >>>>>>>>> Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>> >>>>>>>>> >>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds >>>>>>>>> C++ source to SA. >>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>> But this function is provided by libiberty.a, so we need to >>>>>>>>> link it to libsaproc and need to check libiberty.a in >>>>>>>>> configure script. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> > From magnus.ihse.bursie at oracle.com Mon Nov 4 10:27:44 2019 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Mon, 4 Nov 2019 11:27:44 +0100 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> Message-ID: On 2019-11-02 13:43, Daniel D. Daugherty wrote: > Since this review contains build changes, I've added build-dev at ... Thanks Dan for noticing this and cc:ing us. Yasumasa: build changes look fine. Thanks. /Magnus > > Dan > > > On 11/1/19 4:56 AM, Yasumasa Suenaga wrote: >> (Changed subject to review request) >> >> Hi, >> >> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on >> submit repo. >> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >> >> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >> >> Could you review it? >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>> Hi David, >>> >>> On 2019/11/01 7:55, David Holmes wrote: >>>> Hi Yasumasa, >>>> >>>> New build dependencies cannot be added lightly. This impacts >>>> everyone who maintains build/test farms. >>> >>> Ok, thanks for telling it. >>> >>>> We already use the C++ demangling capabilities in the VM. Is there >>>> some way to export that for use by libsaproc ? >>>> >>>> Otherwise using C++ demangle may still be the better choice given >>>> we already have it as a dependency. >>> >>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at >>> decoder_linux.cpp . >>> It is similar with my original proposal. 
>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>> >>> I agree with David to use C++ demangle way. >>> However we need to choice the fix from following: >>> >>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>> >>> I've discussed with Chris about it in [1]. >>> Option A might be large change. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> [1] >>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>> >>> >>>> David >>>> >>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>> Hi Yasumasa, >>>>> >>>>> Here's the failure during configure: >>>>> >>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find >>>>> demangle.h! You might be able to fix this by running 'sudo yum >>>>> install binutils-devel'. >>>>> >>>>> Chris >>>>> >>>>> >>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>> Hi, >>>>>> >>>>>> I filed this enhancement to JBS: >>>>>> >>>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>> >>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>> >>>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>> >>>>>> According to the email from Mach 5, dependency errors were >>>>>> occurred in jib. >>>>>> Can someone share the details? >>>>>> I'm not familiar in jib, so I want help. >>>>>> >>>>>> ? mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>> You can change the configure script. I don't know if there's any >>>>>>> concerns with using libiberty.a. That's possibly a legal >>>>>>> question (GNU GPL). 
You might want to ask that on jdk-dev and/or >>>>>>> build-dev. >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for quick reply! >>>>>>>> >>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to >>>>>>>> convert a lot of JNI calls to C++ style. >>>>>>>> For example: >>>>>>>> >>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>> ????? to >>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>> >>>>>>>> Can it be accepted? >>>>>>>> >>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to >>>>>>>> link libiberty.a which is provided by binutils. Thus I think we >>>>>>>> need to check libiberty.a in configure script. Is it ok? >>>>>>>> >>>>>>>> >>>>>>>> I prefer to use cplus_demangle() if we can change configure >>>>>>>> script. >>>>>>>> >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> I don't have concerns with adding C++ source to SA, but in >>>>>>>>> order to do so you've put the new native code in its own file >>>>>>>>> rather than in LinuxDebuggerLocal.c. I'd like to see that >>>>>>>>> resolved. So either convert LinuxDebuggerLocal.c to C++, or >>>>>>>>> use cplus_demangle(). 
>>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were >>>>>>>>>> mangled as below: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>>> + 0x6ac >>>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>>> + 0x33d >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We can demangle them via c++filt, but I think it is more >>>>>>>>>> convenience if jstack can show demangling symbols. >>>>>>>>>> I think we can demangle in jstack with this patch. If it is >>>>>>>>>> accepted, I will file it to JBS and send review request. >>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>> >>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, >>>>>>>>>> Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch >>>>>>>>>> adds C++ source to SA. >>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>> But this function is provided by libiberty.a, so we need to >>>>>>>>>> link it to libsaproc and need to check libiberty.a in >>>>>>>>>> configure script. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >> > From suenaga at oss.nttdata.com Mon Nov 4 13:09:28 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Mon, 4 Nov 2019 22:09:28 +0900 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> Message-ID: <297ac028-5e4b-1bff-789f-0a47035fdaeb@oss.nttdata.com> Thanks Magnus and Dan! I will push it. Yasumasa On 2019/11/04 19:27, Magnus Ihse Bursie wrote: > On 2019-11-02 13:43, Daniel D. Daugherty wrote: >> Since this review contains build changes, I've added build-dev at ... > Thanks Dan for noticing this and cc:ing us. > > Yasumasa: build changes look fine. Thanks. > > /Magnus >> >> Dan >> >> >> On 11/1/19 4:56 AM, Yasumasa Suenaga wrote: >>> (Changed subject to review request) >>> >>> Hi, >>> >>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on submit repo. >>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>> >>> ? http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>> >>> Could you review it? >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>> Hi David, >>>> >>>> On 2019/11/01 7:55, David Holmes wrote: >>>>> Hi Yasumasa, >>>>> >>>>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms. >>>> >>>> Ok, thanks for telling it. >>>> >>>>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc ? >>>>> >>>>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency. >>>> >>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at decoder_linux.cpp . 
>>>> It is similar with my original proposal. >>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>> >>>> I agree with David to use C++ demangle way. >>>> However we need to choice the fix from following: >>>> >>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>> >>>> I've discussed with Chris about it in [1]. >>>> Option A might be large change. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>> >>>> >>>>> David >>>>> >>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> Here's the failure during configure: >>>>>> >>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I filed this enhancement to JBS: >>>>>>> >>>>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>> >>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>> >>>>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>> ? http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>> >>>>>>> According to the email from Mach 5, dependency errors were occurred in jib. >>>>>>> Can someone share the details? >>>>>>> I'm not familiar in jib, so I want help. >>>>>>> >>>>>>> ? mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. 
That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev. >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for quick reply! >>>>>>>>> >>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style. >>>>>>>>> For example: >>>>>>>>> >>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>> ????? to >>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>> >>>>>>>>> Can it be accepted? >>>>>>>>> >>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok? >>>>>>>>> >>>>>>>>> >>>>>>>>> I prefer to use cplus_demangle() if we can change configure script. >>>>>>>>> >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle(). >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac >>>>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols. 
>>>>>>>>>>> I think we can demangle in jstack with this patch. If it is accepted, I will file it to JBS and send review request. >>>>>>>>>>> What do you think? >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>> >>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA. >>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>> >> > From daniel.daugherty at oracle.com Mon Nov 4 16:40:22 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 4 Nov 2019 11:40:22 -0500 Subject: ClhsdbCDSCore jtreg test fails on OSX In-Reply-To: References: <15b4b0dc-6ad0-7a3c-643b-b121766ff1db@gmail.com> Message-ID: <85cf1260-fdcb-c0e9-4d05-b411b97de7f3@oracle.com> Moving this thread over to serviceability-dev at ... since this question is about Serviceability Agent tests... Bcc'ing hotspot-dev at ... so folks know that the thread moved... On 11/4/19 9:49 AM, Jaikiran Pai wrote: > On 04/11/19 8:11 PM, Jaikiran Pai wrote: >> ... >> Looking at the testcase itself, I see this >> http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l112 >> >> if (Platform.isOSX()) { >> >> ??? File coresDir = new File("/cores"); >> >> ??? if (!coresDir.isDirectory() || !coresDir.canWrite()) { >> >> ??????? 
throw new Error("cores is not a directory or does not have write
>> permissions");
>>
>>
>> I'm on OSX. So this test expects a directory called "cores" at the root
>> of the filesystem? That looks odd. I don't have any such directory.
> Correction - I do have that directory (my "ls" command that I previously
> used to check had a typo), but that /cores directory is owned by "root"
> and the test is running as a regular user.
>
> -Jaikiran

$ ls -ld /cores
drwxrwxr-t  2 root  admin  64 Nov  4 09:22 /cores/

so the directory on my macOSX machine is writable by group 'admin'
and my login happens to belong to group 'admin'.

Dan

From chris.plummer at oracle.com Mon Nov 4 17:28:08 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 4 Nov 2019 09:28:08 -0800
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com>
 <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com>
Message-ID: <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com>

On 11/2/19 5:33 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> On 2019/11/02 7:07, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> I can't comment on the build changes. I don't know how the build
>> works well enough.
>
> I ensured it on submit repo
> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009).
> This change affects for Linux only. So I changed toolchain if SA would
> be built for Linux only.
>
>
>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a
>> new file.
>
> I uploaded it:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch
>
> If we can use Git, it can show "rename + diffs"...
> I tried to use "hg rename" and "hg move", but the result did not change.
Hi Yasumasa,

I just did an "hg mv" and then modified the file, and webrev did what it
is supposed to do and showed just the diffs, but also indicated that the
file was moved. Which version of webrev are you using? What do "hg diff"
and "hg status" show?

thanks,

Chris
>
>> The rest of the changes look fine.
>
> Thanks!
>
>
> Yasumasa
>
>
>> thanks,
>>
>> Chris
>>
>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote:
>>> (Changed subject to review request)
>>>
>>> Hi,
>>>
>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on
>>> submit repo.
>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>>>
>>> Could you review it?
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> On 2019/11/01 7:55, David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> New build dependencies cannot be added lightly. This impacts
>>>>> everyone who maintains build/test farms.
>>>>
>>>> Ok, thanks for telling it.
>>>>
>>>>> We already use the C++ demangling capabilities in the VM. Is there
>>>>> some way to export that for use by libsaproc ?
>>>>>
>>>>> Otherwise using C++ demangle may still be the better choice given
>>>>> we already have it as a dependency.
>>>>
>>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at
>>>> decoder_linux.cpp .
>>>> It is similar with my original proposal.
>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>
>>>> I agree with David to use C++ demangle way.
>>>> However we need to choice the fix from following:
>>>>
>>>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>>>    B. Add C++ code for libsaproc.so to demangle symbols.
>>>>
>>>> I've discussed with Chris about it in [1].
>>>> Option A might be large change.
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>> >>>> >>>>> David >>>>> >>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> Here's the failure during configure: >>>>>> >>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find >>>>>> demangle.h! You might be able to fix this by running 'sudo yum >>>>>> install binutils-devel'. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I filed this enhancement to JBS: >>>>>>> >>>>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>> >>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>> >>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>> >>>>>>> According to the email from Mach 5, dependency errors were >>>>>>> occurred in jib. >>>>>>> Can someone share the details? >>>>>>> I'm not familiar in jib, so I want help. >>>>>>> >>>>>>> ? mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>> You can change the configure script. I don't know if there's >>>>>>>> any concerns with using libiberty.a. That's possibly a legal >>>>>>>> question (GNU GPL). You might want to ask that on jdk-dev >>>>>>>> and/or build-dev. >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for quick reply! >>>>>>>>> >>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to >>>>>>>>> convert a lot of JNI calls to C++ style. 
>>>>>>>>> For example: >>>>>>>>> >>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>> ????? to >>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>> >>>>>>>>> Can it be accepted? >>>>>>>>> >>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to >>>>>>>>> link libiberty.a which is provided by binutils. Thus I think >>>>>>>>> we need to check libiberty.a in configure script. Is it ok? >>>>>>>>> >>>>>>>>> >>>>>>>>> I prefer to use cplus_demangle() if we can change configure >>>>>>>>> script. >>>>>>>>> >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> I don't have concerns with adding C++ source to SA, but in >>>>>>>>>> order to do so you've put the new native code in its own file >>>>>>>>>> rather than in LinuxDebuggerLocal.c. I'd like to see that >>>>>>>>>> resolved. So either convert LinuxDebuggerLocal.c to C++, or >>>>>>>>>> use cplus_demangle(). >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were >>>>>>>>>>> mangled as below: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>>>> + 0x6ac >>>>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>>>> + 0x33d >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We can demangle them via c++filt, but I think it is more >>>>>>>>>>> convenience if jstack can show demangling symbols. >>>>>>>>>>> I think we can demangle in jstack with this patch. If it is >>>>>>>>>>> accepted, I will file it to JBS and send review request. >>>>>>>>>>> What do you think? 
>>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>> >>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, >>>>>>>>>>> Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch >>>>>>>>>>> adds C++ source to SA. >>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>> But this function is provided by libiberty.a, so we need to >>>>>>>>>>> link it to libsaproc and need to check libiberty.a in >>>>>>>>>>> configure script. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >> >> From leonid.mesnik at oracle.com Mon Nov 4 17:46:59 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Mon, 4 Nov 2019 09:46:59 -0800 Subject: ClhsdbCDSCore jtreg test fails on OSX In-Reply-To: <85cf1260-fdcb-c0e9-4d05-b411b97de7f3@oracle.com> References: <15b4b0dc-6ad0-7a3c-643b-b121766ff1db@gmail.com> <85cf1260-fdcb-c0e9-4d05-b411b97de7f3@oracle.com> Message-ID: Hi The location of core files depends on system configuration. So test tries to find core files using test output and searching core files in current directory. See details here: http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l206 And only if test fails to find core file then it additionally tries to generate error/skip test checking system configuration. The /cores directory usually available for all uses to dump cores like: lmesnik at mymac:~/ws/ks-apps/open/test/lib$ ls -all /cores/ total 61448520 drwxrwxr-t 11 root admin 352 Sep 5 00:24 . drwxr-xr-x 34 root wheel 1088 Oct 4 22:27 .. 
-r-------- 1 lmesnik admin 2670608384 Aug 25 01:09 core.32410
...

If /cores doesn't have write permissions, that is one possible reason
why the test can't find the core file and fails. It fails even without
this check, just with a different exception, in
http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l135

So I suggest you check where the core file is actually dumped, whether it
is dumped at all, and why the test can't find it.

Leonid

> On Nov 4, 2019, at 8:40 AM, Daniel D. Daugherty wrote:
>
> Moving this thread over to serviceability-dev at ... since this question is
> about Serviceability Agent tests... Bcc'ing hotspot-dev at ... so folks know
> that the thread moved...
>
>
> On 11/4/19 9:49 AM, Jaikiran Pai wrote:
>> On 04/11/19 8:11 PM, Jaikiran Pai wrote:
>>> ...
>>> Looking at the testcase itself, I see this
>>> http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l112
>>>
>>> if (Platform.isOSX()) {
>>>
>>>     File coresDir = new File("/cores");
>>>
>>>     if (!coresDir.isDirectory() || !coresDir.canWrite()) {
>>>
>>>         throw new Error("cores is not a directory or does not have write
>>> permissions");
>>>
>>>
>>> I'm on OSX. So this test expects a directory called "cores" at the root
>>> of the filesystem? That looks odd. I don't have any such directory.
>> Correction - I do have that directory (my "ls" command that I previously
>> used to check had a typo), but that /cores directory is owned by "root"
>> and the test is running as a regular user.
>>
>> -Jaikiran
>
> $ ls -ld /cores
> drwxrwxr-t 2 root admin 64 Nov 4 09:22 /cores/
>
> so the directory on my macOSX machine is writable by group 'admin'
> and my login happens to belong to group 'admin'.
>
> Dan

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ioi.lam at oracle.com Mon Nov 4 19:06:08 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 4 Nov 2019 11:06:08 -0800 Subject: ClhsdbCDSCore jtreg test fails on OSX In-Reply-To: References: <15b4b0dc-6ad0-7a3c-643b-b121766ff1db@gmail.com> <85cf1260-fdcb-c0e9-4d05-b411b97de7f3@oracle.com> Message-ID: <3d56a98f-9235-c591-eedf-9c9cac1256af@oracle.com> Jaikiran, My /core dir is writable by root and admin users. I am running Mojave. Is your user mac ID in the admin group? Also, do you have any issues with test/hotspot/jtreg/serviceability/sa/TestJmapCore.java that also tests the use of core files? Leonid, TestJmapCore.java and ClhsdbCDSCore.java seem to have duplicated code in finding core files. Also, there's some partial logic for looking up core files under test/hotspot/jtreg/compiler/ciReplay/CiReplayBase.java. Maybe these should be consolidated into the test library? Thanks - Ioi On 11/4/19 9:46 AM, Leonid Mesnik wrote: > Hi > > The location of core files depends on system configuration. So test > tries to find core files using test output and searching core files in > current directory. See details here: > http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l206 > > And only if test fails to find core file then it additionally tries to > generate error/skip test checking system configuration. > > The /cores directory usually available for all uses to dump cores like: > lmesnik at mymac:~/ws/ks-apps/open/test/lib$ ls -all /cores/ > total 61448520 > drwxrwxr-t ?11 root ? ? admin ? ? ? ? 352 Sep ?5 00:24 . > drwxr-xr-x ?34 root ? ? wheel ? ? ? ?1088 Oct ?4 22:27 .. > -r-------- ? 1 lmesnik ?admin ?2670608384 Aug 25 01:09 core.32410 > ... > > If /cores doesn't have write permissions that it is one of possible > reasons why test can't find core file and fails. 
It fails even without > this check but just with different exception in > http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l135 > > So I suggest you to check where core file is dumped actually, if it > dumped and why test can't find it. > > Leonid > >> On Nov 4, 2019, at 8:40 AM, Daniel D. Daugherty >> > wrote: >> >> Moving this thread over to serviceability-dev at ... since this question is >> about Serviceability Agent tests... Bcc'ing hotspot-dev at ... so folks know >> that the thread moved... >> >> >> On 11/4/19 9:49 AM, Jaikiran Pai wrote: >>> On 04/11/19 8:11 PM, Jaikiran Pai wrote: >>>> ... >>>> Looking at the testcase itself, I see this >>>> http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l112 >>>> >>>> if (Platform.isOSX()) { >>>> >>>> ??? File coresDir = new File("/cores"); >>>> >>>> ??? if (!coresDir.isDirectory() || !coresDir.canWrite()) { >>>> >>>> ??????? throw new Error("cores is not a directory or does not have >>>> write >>>> permissions"); >>>> >>>> >>>> I'm on OSX. So this test expects a directory called "cores" at the root >>>> of the filesystem? That looks odd. I don't have any such directory. >>> Correction - I do have that directory (my "ls" command that I previously >>> used to check had a typo), but that /cores directory is owned by "root" >>> and the test is running as a regular user. >>> >>> -Jaikiran >> >> $ ls -ld /cores >> drwxrwxr-t? 2 root? admin? 64 Nov? 4 09:22 /cores/ >> >> so the directory on my macOSX machine is writable by group 'admin' >> and my login happens to belong to group 'admin'. >> >> Dan > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From leonid.mesnik at oracle.com Mon Nov 4 20:45:48 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Mon, 4 Nov 2019 12:45:48 -0800 Subject: ClhsdbCDSCore jtreg test fails on OSX In-Reply-To: <3d56a98f-9235-c591-eedf-9c9cac1256af@oracle.com> References: <15b4b0dc-6ad0-7a3c-643b-b121766ff1db@gmail.com> <85cf1260-fdcb-c0e9-4d05-b411b97de7f3@oracle.com> <3d56a98f-9235-c591-eedf-9c9cac1256af@oracle.com> Message-ID: <641CC3AA-2038-4719-B072-23402EEC1370@oracle.com> Good idea, filed RFE https://bugs.openjdk.java.net/browse/JDK-8233533 Leonid > On Nov 4, 2019, at 11:06 AM, Ioi Lam wrote: > > Jaikiran, > > My /core dir is writable by root and admin users. I am running Mojave. Is your user mac ID in the admin group? > > Also, do you have any issues with test/hotspot/jtreg/serviceability/sa/TestJmapCore.java that also tests the use of core files? > > Leonid, > > TestJmapCore.java and ClhsdbCDSCore.java seem to have duplicated code in finding core files. Also, there's some partial logic for looking up core files under test/hotspot/jtreg/compiler/ciReplay/CiReplayBase.java. Maybe these should be consolidated into the test library? > > Thanks > - Ioi > > On 11/4/19 9:46 AM, Leonid Mesnik wrote: >> Hi >> >> The location of core files depends on system configuration. So test tries to find core files using test output and searching core files in current directory. See details here: >> http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l206 >> >> And only if test fails to find core file then it additionally tries to generate error/skip test checking system configuration. >> >> The /cores directory usually available for all uses to dump cores like: >> lmesnik at mymac:~/ws/ks-apps/open/test/lib$ ls -all /cores/ >> total 61448520 >> drwxrwxr-t 11 root admin 352 Sep 5 00:24 . >> drwxr-xr-x 34 root wheel 1088 Oct 4 22:27 .. >> -r-------- 1 lmesnik admin 2670608384 Aug 25 01:09 core.32410 >> ... 
>> >> If /cores doesn't have write permissions that it is one of possible reasons why test can't find core file and fails. It fails even without this check but just with different exception in >> http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l135 >> >> So I suggest you to check where core file is dumped actually, if it dumped and why test can't find it. >> >> Leonid >> >>> On Nov 4, 2019, at 8:40 AM, Daniel D. Daugherty > wrote: >>> >>> Moving this thread over to serviceability-dev at ... since this question is >>> about Serviceability Agent tests... Bcc'ing hotspot-dev at ... so folks know >>> that the thread moved... >>> >>> >>> On 11/4/19 9:49 AM, Jaikiran Pai wrote: >>>> On 04/11/19 8:11 PM, Jaikiran Pai wrote: >>>>> ... >>>>> Looking at the testcase itself, I see this >>>>> http://hg.openjdk.java.net/jdk/jdk/file/6f98d0173a72/test/hotspot/jtreg/serviceability/sa/ClhsdbCDSCore.java#l112 >>>>> >>>>> if (Platform.isOSX()) { >>>>> >>>>> File coresDir = new File("/cores"); >>>>> >>>>> if (!coresDir.isDirectory() || !coresDir.canWrite()) { >>>>> >>>>> throw new Error("cores is not a directory or does not have write >>>>> permissions"); >>>>> >>>>> >>>>> I'm on OSX. So this test expects a directory called "cores" at the root >>>>> of the filesystem? That looks odd. I don't have any such directory. >>>> Correction - I do have that directory (my "ls" command that I previously >>>> used to check had a typo), but that /cores directory is owned by "root" >>>> and the test is running as a regular user. >>>> >>>> -Jaikiran >>> >>> $ ls -ld /cores >>> drwxrwxr-t 2 root admin 64 Nov 4 09:22 /cores/ >>> >>> so the directory on my macOSX machine is writable by group 'admin' >>> and my login happens to belong to group 'admin'. >>> >>> Dan >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From david.holmes at oracle.com Mon Nov 4 23:10:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 5 Nov 2019 09:10:28 +1000
Subject: Incomplete "fake exception" stacktraces
Message-ID: <39ceefd3-a8c1-3a46-84b5-f7bff5693d6a@oracle.com>

I'm investigating some JVM TI scenario test failures following a change I made in hotspot. The log shows:

The following fake exception stacktrace is for failure analysis.
nsk.share.Fake_Exception_for_RULE_Creation: (jvmti_tools.cpp:683) error
    at nsk_lvcomplain(nsk_tools.cpp:172)
# ERROR: jvmti_tools.cpp, 683: error
#   jvmti error: code=52, name=JVMTI_ERROR_INTERRUPT

and that is it. This stacktrace is completely useless as it doesn't show from where nsk_lvcomplain is called!

Does anyone know how this is supposed to work and whether there is some way for me to get a real stacktrace?

Thanks,
David

From chris.plummer at oracle.com Tue Nov 5 00:58:45 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 4 Nov 2019 16:58:45 -0800
Subject: Incomplete "fake exception" stacktraces
In-Reply-To: <39ceefd3-a8c1-3a46-84b5-f7bff5693d6a@oracle.com>
References: <39ceefd3-a8c1-3a46-84b5-f7bff5693d6a@oracle.com>
Message-ID:

Hi David,

The "fake exception" was never meant to give you a stack trace, but only indicate which line the failure happened at. Unfortunately you are hitting:

void exitOnError(jvmtiError error) {
    if (!NSK_JVMTI_VERIFY(error)) {
        exit(error);
    }
}

So this isn't really the point of failure, just a detection of it after, and buried in a C call. An example of the "fake exception" doing better would be https://bugs.openjdk.java.net/browse/JDK-8224555:

# ERROR: tc02t001.cpp, 126: line == lines[enterEventsCount] || line == (lines[enterEventsCount] + 1)
# verified assertion is FALSE

        if (!NSK_VERIFY(line == lines[enterEventsCount] ||
                line == (lines[enterEventsCount] + 1))) {

I haven't looked at NSK_JVMTI_VERIFY yet, but possibly it could do better, and maybe exitOnError would play a role in that (would need to be macroized) so we know who is calling exitOnError().

Chris

On 11/4/19 3:10 PM, David Holmes wrote:
> I'm investigating some JVM TI scenario test failures following a
> change I made in hotspot. The log shows:
>
> The following fake exception stacktrace is for failure analysis.
> nsk.share.Fake_Exception_for_RULE_Creation: (jvmti_tools.cpp:683) error
>     at nsk_lvcomplain(nsk_tools.cpp:172)
> # ERROR: jvmti_tools.cpp, 683: error
> #   jvmti error: code=52, name=JVMTI_ERROR_INTERRUPT
>
> and that is it. This stacktrace is completely useless as it doesn't
> show from where nsk_lvcomplain is called!
>
> Does anyone know how this is supposed to work and whether there is
> some way for me to get a real stacktrace?
>
> Thanks,
> David

From david.holmes at oracle.com Tue Nov 5 01:14:41 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 5 Nov 2019 11:14:41 +1000
Subject: Incomplete "fake exception" stacktraces
In-Reply-To:
References: <39ceefd3-a8c1-3a46-84b5-f7bff5693d6a@oracle.com>
Message-ID:

Hi Chris,

Thanks for the explanation. I agree that having exitOnError be a macro, so that NSK_JVMTI_VERIFY works in that case, would be better.

I suspect I'm hitting an issue in the nsk test infrastructure, but I have no visibility into what. :(

David

On 5/11/2019 10:58 am, Chris Plummer wrote:
> Hi David,
>
> The "fake exception" was never meant to give you a stack trace, but only
> indicate which line the failure happened at. Unfortunately you are hitting:
>
> void exitOnError(jvmtiError error) {
>     if (!NSK_JVMTI_VERIFY(error)) {
>         exit(error);
>     }
> }
>
> So this isn't really the point of failure, just a detection of it after,
> and buried in a C call. An example of the "fake exception" doing better
> would be https://bugs.openjdk.java.net/browse/JDK-8224555:
>
> # ERROR: tc02t001.cpp, 126: line == lines[enterEventsCount] || line ==
> (lines[enterEventsCount] + 1)
> # verified assertion is FALSE
>
>         if (!NSK_VERIFY(line == lines[enterEventsCount] ||
>                 line == (lines[enterEventsCount] + 1))) {
>
> I haven't looked at NSK_JVMTI_VERIFY yet, but possibly it could do
> better, and maybe exitOnError would play a role in that (would need to
> be macroized) so we know who is calling exitOnError().
>
> Chris
>
> On 11/4/19 3:10 PM, David Holmes wrote:
>> I'm investigating some JVM TI scenario test failures following a
>> change I made in hotspot. The log shows:
>>
>> The following fake exception stacktrace is for failure analysis.
>> nsk.share.Fake_Exception_for_RULE_Creation: (jvmti_tools.cpp:683) error
>>     at nsk_lvcomplain(nsk_tools.cpp:172)
>> # ERROR: jvmti_tools.cpp, 683: error
>> #   jvmti error: code=52, name=JVMTI_ERROR_INTERRUPT
>>
>> and that is it. This stacktrace is completely useless as it doesn't
>> show from where nsk_lvcomplain is called!
>>
>> Does anyone know how this is supposed to work and whether there is
>> some way for me to get a real stacktrace?
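[Editorial sketch] The "macroized" idea discussed in this thread can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: `report_if_error`, `last_report` and `CHECK_OR_REPORT` are hypothetical names invented for illustration and are not part of the nsk test library, and the real NSK_JVMTI_VERIFY machinery is not reproduced. The point it shows is that `__FILE__` and `__LINE__` expand where a macro is used, so the report names the caller rather than a shared helper.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the NSK reporting machinery; the real
// NSK_JVMTI_VERIFY and nsk_lvcomplain are not reproduced here.
static std::string last_report;

// Records where the check was written; returns true when there is no error.
static bool report_if_error(int error, const char* file, int line) {
    if (error == 0) {
        return true;
    }
    last_report = std::string(file) + ":" + std::to_string(line) +
                  ": error code=" + std::to_string(error);
    std::fprintf(stderr, "# ERROR: %s\n", last_report.c_str());
    return false;
}

// Because this is a macro, __FILE__ and __LINE__ expand at the call site,
// so the report names the caller rather than a shared helper function.
// A real exitOnError replacement would call exit(error) on failure; this
// sketch only returns false so that it can be exercised safely.
#define CHECK_OR_REPORT(error) report_if_error((error), __FILE__, __LINE__)
```

With a plain function such as the quoted exitOnError(), the location is always the line inside the helper, which is why the log above points at jvmti_tools.cpp:683 regardless of which test called it.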
>>
>> Thanks,
>> David
>
>

From suenaga at oss.nttdata.com Tue Nov 5 01:19:14 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 5 Nov 2019 10:19:14 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com>
Message-ID: <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com>

Hi Chris,

On 2019/11/05 2:28, Chris Plummer wrote:
> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> On 2019/11/02 7:07, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> I can't comment on the build changes. I don't know how the build works well enough.
>>
>> I verified it on the submit repo (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009).
>> This change affects Linux only, so I changed the toolchain only where SA is built for Linux.
>>
>>
>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a new file.
>>
>> I uploaded it:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch
>>
>> If we can use Git, it can show "rename + diffs"...
>> I tried to use "hg rename" and "hg move", but the result did not change.
> Hi Yasumasa,
>
> I just did an "hg mv" and then modified the file, and webrev did what it is supposed to do and showed just the diffs, but also indicated that the file was moved. Which version of webrev are you using?

I'm using webrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on Ubuntu 18.04 LTS.

> What do "hg diff" and "hg status" show?
For example, rename Makefile to Makefile.orig:

```
$ hg mv Makefile Makefile.orig
$ hg status
A Makefile.orig
R Makefile
$ hg diff
diff -r c41d1303a87c Makefile
--- a/Makefile  Mon Nov 04 13:02:40 2019 -0800
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,64 +0,0 @@
-#
-# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All rights reserved.
-# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
       :
     (snip)
```

It seems to be a problem in hg rather than in webrev.


Thanks,

Yasumasa


> thanks,
>
> Chris
>>
>>> The rest of the changes look fine.
>>
>> Thanks!
>>
>>
>> Yasumasa
>>
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote:
>>>> (Changed subject to review request)
>>>>
>>>> Hi,
>>>>
>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on the submit repo.
>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>>>>
>>>> Could you review it?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> On 2019/11/01 7:55, David Holmes wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms.
>>>>>
>>>>> OK, thanks for telling me.
>>>>>
>>>>>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc?
>>>>>>
>>>>>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency.
>>>>>
>>>>> I found that abi::__cxa_demangle() is used in ElfDecoder::demangle() in decoder_linux.cpp.
>>>>> It is similar to my original proposal.
>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/
>>>>>
>>>>> I agree with David that we should use the C++ demangle approach.
>>>>> However, we need to choose the fix from the following:
>>>>>
>>>>>    A. Convert LinuxDebuggerLocal.c to C++ code
>>>>>    B.
Add C++ code for libsaproc.so to demangle symbols. >>>>> >>>>> I've discussed with Chris about it in [1]. >>>>> Option A might be large change. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>>> >>>>> >>>>>> David >>>>>> >>>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> Here's the failure during configure: >>>>>>> >>>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'. >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> >>>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I filed this enhancement to JBS: >>>>>>>> >>>>>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>>> >>>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>>> >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>>> >>>>>>>> According to the email from Mach 5, dependency errors were occurred in jib. >>>>>>>> Can someone share the details? >>>>>>>> I'm not familiar in jib, so I want help. >>>>>>>> >>>>>>>> ? mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev. 
>>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Thanks for quick reply! >>>>>>>>>> >>>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style. >>>>>>>>>> For example: >>>>>>>>>> >>>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>>> ????? to >>>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>>> >>>>>>>>>> Can it be accepted? >>>>>>>>>> >>>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I prefer to use cplus_demangle() if we can change configure script. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>> >>>>>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle(). >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac >>>>>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols. >>>>>>>>>>>> I think we can demangle in jstack with this patch. 
If it is accepted, I will file it to JBS and send review request. >>>>>>>>>>>> What do you think? >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>>> >>>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA. >>>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>> >>> > > From chris.plummer at oracle.com Tue Nov 5 04:47:03 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 4 Nov 2019 20:47:03 -0800 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> Message-ID: <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> On 11/4/19 5:19 PM, Yasumasa Suenaga wrote: > Hi Chris, > > On 2019/11/05 2:28, Chris Plummer wrote: >> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote: >>> Hi Chris, >>> >>> On 2019/11/02 7:07, Chris Plummer wrote: >>>> Hi? 
Yasumasa, >>>> >>>> I can't comment on the build changes. I don't know how the build >>>> works well enough. >>> >>> I ensured it on submit repo >>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009). >>> This change affects for Linux only. So I changed toolchain if SA >>> would be built for Linux only. >>> >>> >>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a >>>> new file. >>> >>> I uploaded it: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch >>> >>> >>> If we can use Git, it can show "rename + diffs"... >>> I tried to use "hg rename" and "hg move", but the result did not >>> change. >> Hi Yasumasa, >> >> I just did an "hg mv" and then modified the file, and webrev did what >> it is suppose to do and showed just the diffs, but also indicated >> that the file was moved. Which version of webrev are you using? > > I'm using wevrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on > Ubuntu 18.04 LTS. > >> What do "hg diff" and "hg status" show? > > For example, rename Makefile to Makefile.orig: > > ``` > $ hg mv Makefile Makefile.orig > $ hg status > A Makefile.orig > R Makefile This part looks correct. > $ hg diff > diff -r c41d1303a87c Makefile > --- a/Makefile? Mon Nov 04 13:02:40 2019 -0800 > +++ /dev/null?? Thu Jan 01 00:00:00 1970 +0000 > @@ -1,64 +0,0 @@ > -# > -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All rights > reserved. > -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > ?????? : > ???? (snip) > ``` This part is odd. Not sure why it says "diff -r". 
Mine looks like: $ hg status A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java $ hg diff diff --git a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java @@ -40,7 +40,7 @@ ..and the actual diff follows the above, which is just a one line edit. Do you have an override of "diff" in your .hgrc? BTW, my Mercurial is 3.6. Chris > > It seems to be a problem in hg instead of webrev. > > > Thanks, > > Yasumasa > > >> thanks, >> >> Chris >>> >>>> The rest of the changes look fine. >>> >>> Thanks! >>> >>> >>> Yasumasa >>> >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote: >>>>> (Changed subject to review request) >>>>> >>>>> Hi, >>>>> >>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on >>>>> submit repo. >>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>>>> >>>>> Could you review it? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>>>> Hi David, >>>>>> >>>>>> On 2019/11/01 7:55, David Holmes wrote: >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> New build dependencies cannot be added lightly. This impacts >>>>>>> everyone who maintains build/test farms. >>>>>> >>>>>> Ok, thanks for telling it. >>>>>> >>>>>>> We already use the C++ demangling capabilities in the VM. Is >>>>>>> there some way to export that for use by libsaproc ? >>>>>>> >>>>>>> Otherwise using C++ demangle may still be the better choice >>>>>>> given we already have it as a dependency. 
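[Editorial sketch] For readers unfamiliar with the C++ demangling route discussed in this thread, the following is a minimal sketch of calling abi::__cxa_demangle() on the mangled symbols quoted earlier. It assumes a GCC or Clang toolchain that provides `<cxxabi.h>`; the helper name `demangle_or_original` is hypothetical and this is not the code from the webrev under review.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI, GCC/Clang)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`, or the
// input unchanged when it cannot be demangled.
std::string demangle_or_original(const char* mangled) {
    int status = 0;
    // With null output buffer and length, the result is malloc()ed and
    // must be released with free() by the caller.
    char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr) {
        return mangled;
    }
    std::string out(result);
    std::free(result);
    return out;
}
```

Passing the first mangled frame from the original proposal, `_ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread`, should yield the `JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)` form shown in the thread. Because this function ships with the C++ runtime, it needs no extra build dependency, unlike cplus_demangle() from libiberty.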
>>>>>> >>>>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() >>>>>> at decoder_linux.cpp . >>>>>> It is similar with my original proposal. >>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>> >>>>>> I agree with David to use C++ demangle way. >>>>>> However we need to choice the fix from following: >>>>>> >>>>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>>>> >>>>>> I've discussed with Chris about it in [1]. >>>>>> Option A might be large change. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> [1] >>>>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>>>> >>>>>> >>>>>>> David >>>>>>> >>>>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> Here's the failure during configure: >>>>>>>> >>>>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find >>>>>>>> demangle.h! You might be able to fix this by running 'sudo yum >>>>>>>> install binutils-devel'. >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> >>>>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I filed this enhancement to JBS: >>>>>>>>> >>>>>>>>> ? https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>>>> >>>>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>>>> >>>>>>>>> According to the email from Mach 5, dependency errors were >>>>>>>>> occurred in jib. >>>>>>>>> Can someone share the details? >>>>>>>>> I'm not familiar in jib, so I want help. 
>>>>>>>>> >>>>>>>>> mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>>>> You can change the configure script. I don't know if there's >>>>>>>>>> any concerns with using libiberty.a. That's possibly a legal >>>>>>>>>> question (GNU GPL). You might want to ask that on jdk-dev >>>>>>>>>> and/or build-dev. >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> Thanks for quick reply! >>>>>>>>>>> >>>>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to >>>>>>>>>>> convert a lot of JNI calls to C++ style. >>>>>>>>>>> For example: >>>>>>>>>>> >>>>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>>>> ????? to >>>>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>>>> >>>>>>>>>>> Can it be accepted? >>>>>>>>>>> >>>>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to >>>>>>>>>>> link libiberty.a which is provided by binutils. Thus I think >>>>>>>>>>> we need to check libiberty.a in configure script. Is it ok? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I prefer to use cplus_demangle() if we can change configure >>>>>>>>>>> script. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> I don't have concerns with adding C++ source to SA, but in >>>>>>>>>>>> order to do so you've put the new native code in its own >>>>>>>>>>>> file rather than in LinuxDebuggerLocal.c. I'd like to see >>>>>>>>>>>> that resolved. So either convert LinuxDebuggerLocal.c to >>>>>>>>>>>> C++, or use cplus_demangle(). 
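[Editorial sketch] The JNI call-style change mentioned above, `(*env)->FindClass(env, ...)` in C versus `env->FindClass(...)` in C++, follows from how the environment type is declared in each language. The sketch below imitates that arrangement with mock types so that it compiles without a JDK. All names here (`MockEnv`, `MockFunctionTable`, `NameLength`, `call_c_style`, `call_cpp_style`) are hypothetical stand-ins and not the real jni.h declarations.

```cpp
#include <cstddef>
#include <cstring>

struct MockEnv;  // hypothetical stand-in for JNIEnv; NOT the real jni.h

// Table of function pointers, in the spirit of the JNI function table.
struct MockFunctionTable {
    std::size_t (*NameLength)(MockEnv* env, const char* name);
};

// C++ view: a struct wrapping the table pointer, with a member function
// that forwards to the table and passes `this` implicitly.
struct MockEnv {
    const MockFunctionTable* functions;
    std::size_t NameLength(const char* name) {
        return functions->NameLength(this, name);
    }
};

// C view: the environment type is just a pointer to the table.
typedef const MockFunctionTable* MockEnvC;

static std::size_t name_length_impl(MockEnv*, const char* name) {
    return std::strlen(name);
}

static const MockFunctionTable kTable = { name_length_impl };

// C style: dereference to reach the table, pass the environment explicitly.
std::size_t call_c_style(MockEnv* env, const char* name) {
    MockEnvC* cenv = &env->functions;
    return (*cenv)->NameLength(env, name);
}

// C++ style: member call, environment passed implicitly.
std::size_t call_cpp_style(MockEnv* env, const char* name) {
    return env->NameLength(name);
}
```

Both calls reach the same table entry, which is why converting a C source file to C++ is largely a mechanical rewrite of every such call site, and also why it touches many lines.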
>>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were >>>>>>>>>>>>> mangled as below: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>>>>>> + 0x6ac >>>>>>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>>>>>> + 0x33d >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> We can demangle them via c++filt, but I think it is more >>>>>>>>>>>>> convenience if jstack can show demangling symbols. >>>>>>>>>>>>> I think we can demangle in jstack with this patch. If it >>>>>>>>>>>>> is accepted, I will file it to JBS and send review request. >>>>>>>>>>>>> What do you think? >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>>>> >>>>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, >>>>>>>>>>>>> Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + >>>>>>>>>>>>> 0x33d >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch >>>>>>>>>>>>> adds C++ source to SA. >>>>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>>>> But this function is provided by libiberty.a, so we need >>>>>>>>>>>>> to link it to libsaproc and need to check libiberty.a in >>>>>>>>>>>>> configure script. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>> >>>> >> >> From chris.plummer at oracle.com Tue Nov 5 04:50:03 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 4 Nov 2019 20:50:03 -0800 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> Message-ID: On 11/4/19 8:47 PM, Chris Plummer wrote: > On 11/4/19 5:19 PM, Yasumasa Suenaga wrote: >> Hi Chris, >> >> On 2019/11/05 2:28, Chris Plummer wrote: >>> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote: >>>> Hi Chris, >>>> >>>> On 2019/11/02 7:07, Chris Plummer wrote: >>>>> Hi? Yasumasa, >>>>> >>>>> I can't comment on the build changes. I don't know how the build >>>>> works well enough. >>>> >>>> I ensured it on submit repo >>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009). >>>> This change affects for Linux only. So I changed toolchain if SA >>>> would be built for Linux only. >>>> >>>> >>>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as >>>>> a new file. >>>> >>>> I uploaded it: >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch >>>> >>>> >>>> If we can use Git, it can show "rename + diffs"... >>>> I tried to use "hg rename" and "hg move", but the result did not >>>> change. 
>>> Hi Yasumasa,
>>>
>>> I just did an "hg mv" and then modified the file, and webrev did
>>> what it is supposed to do and showed just the diffs, but also
>>> indicated that the file was moved. Which version of webrev are you
>>> using?
>>
>> I'm using webrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on
>> Ubuntu 18.04 LTS.
>>
>>> What do "hg diff" and "hg status" show?
>>
>> For example, rename Makefile to Makefile.orig:
>>
>> ```
>> $ hg mv Makefile Makefile.orig
>> $ hg status
>> A Makefile.orig
>> R Makefile
> This part looks correct.
>> $ hg diff
>> diff -r c41d1303a87c Makefile
>> --- a/Makefile  Mon Nov 04 13:02:40 2019 -0800
>> +++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
>> @@ -1,64 +0,0 @@
>> -#
>> -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All rights reserved.
>> -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>        :
>>      (snip)
>> ```
> This part is odd. Not sure why it says "diff -r". Mine looks like:
>
> $ hg status
> A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
> R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
>
> $ hg diff
> diff --git a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
> rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
> rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
> --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
> +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
> @@ -40,7 +40,7 @@
>
>
> ..and the actual diff follows the above, which is just a one line
> edit. Do you have an override of "diff" in your .hgrc?
I should have mentioned what's in my .hgrc, because I just noticed something:

diff =
[diff]
git=1
nodates=1

Don't ask me why I have this. I cloned someone's .hgrc when openjdk first moved to Mercurial, and have never touched this part of it.
At the very least the "git=1" would explain why my diff output says "diff --git ...". Chris > > BTW, my Mercurial is 3.6. > > Chris > > >> >> It seems to be a problem in hg instead of webrev. >> >> >> Thanks, >> >> Yasumasa >> >> >>> thanks, >>> >>> Chris >>>> >>>>> The rest of the changes look fine. >>>> >>>> Thanks! >>>> >>>> >>>> Yasumasa >>>> >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote: >>>>>> (Changed subject to review request) >>>>>> >>>>>> Hi, >>>>>> >>>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine >>>>>> on submit repo. >>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>>>>> >>>>>> Could you review it? >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> On 2019/11/01 7:55, David Holmes wrote: >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> New build dependencies cannot be added lightly. This impacts >>>>>>>> everyone who maintains build/test farms. >>>>>>> >>>>>>> Ok, thanks for telling it. >>>>>>> >>>>>>>> We already use the C++ demangling capabilities in the VM. Is >>>>>>>> there some way to export that for use by libsaproc ? >>>>>>>> >>>>>>>> Otherwise using C++ demangle may still be the better choice >>>>>>>> given we already have it as a dependency. >>>>>>> >>>>>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() >>>>>>> at decoder_linux.cpp . >>>>>>> It is similar with my original proposal. >>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>> >>>>>>> I agree with David to use C++ demangle way. >>>>>>> However we need to choice the fix from following: >>>>>>> >>>>>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>>>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>>>>> >>>>>>> I've discussed with Chris about it in [1]. 
>>>>>>> Option A might be large change. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> [1] >>>>>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>>>>> >>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> Here's the failure during configure: >>>>>>>>> >>>>>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find >>>>>>>>> demangle.h! You might be able to fix this by running 'sudo yum >>>>>>>>> install binutils-devel'. >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> >>>>>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I filed this enhancement to JBS: >>>>>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>>>>> >>>>>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>>>>> >>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>>>>> >>>>>>>>>> According to the email from Mach 5, dependency errors were >>>>>>>>>> occurred in jib. >>>>>>>>>> Can someone share the details? >>>>>>>>>> I'm not familiar in jib, so I want help. >>>>>>>>>> >>>>>>>>>> mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>>>>> You can change the configure script. I don't know if there's >>>>>>>>>>> any concerns with using libiberty.a. That's possibly a legal >>>>>>>>>>> question (GNU GPL). You might want to ask that on jdk-dev >>>>>>>>>>> and/or build-dev. 
>>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for quick reply! >>>>>>>>>>>> >>>>>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to >>>>>>>>>>>> convert a lot of JNI calls to C++ style. >>>>>>>>>>>> For example: >>>>>>>>>>>> >>>>>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>>>>> ????? to >>>>>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>>>>> >>>>>>>>>>>> Can it be accepted? >>>>>>>>>>>> >>>>>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need >>>>>>>>>>>> to link libiberty.a which is provided by binutils. Thus I >>>>>>>>>>>> think we need to check libiberty.a in configure script. Is >>>>>>>>>>>> it ok? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I prefer to use cplus_demangle() if we can change configure >>>>>>>>>>>> script. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yasumasa >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> I don't have concerns with adding C++ source to SA, but in >>>>>>>>>>>>> order to do so you've put the new native code in its own >>>>>>>>>>>>> file rather than in LinuxDebuggerLocal.c. I'd like to see >>>>>>>>>>>>> that resolved. So either convert LinuxDebuggerLocal.c to >>>>>>>>>>>>> C++, or use cplus_demangle(). 
>>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were >>>>>>>>>>>>>> mangled as below: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>>>>>>> + 0x6ac >>>>>>>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>>>>>>> + 0x33d >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> We can demangle them via c++filt, but I think it is more >>>>>>>>>>>>>> convenience if jstack can show demangling symbols. >>>>>>>>>>>>>> I think we can demangle in jstack with this patch. If it >>>>>>>>>>>>>> is accepted, I will file it to JBS and send review request. >>>>>>>>>>>>>> What do you think? >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, >>>>>>>>>>>>>> Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + >>>>>>>>>>>>>> 0x33d >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch >>>>>>>>>>>>>> adds C++ source to SA. >>>>>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>>>>> But this function is provided by libiberty.a, so we need >>>>>>>>>>>>>> to link it to libsaproc and need to check libiberty.a in >>>>>>>>>>>>>> configure script. 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>

From suenaga at oss.nttdata.com Tue Nov 5 05:26:09 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 5 Nov 2019 14:26:09 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com>
Message-ID: <523a1023-7054-4c69-ffe4-1b945dddb49b@oss.nttdata.com>

Hi Chris,

I do not have a [diff] section in either ~/.hgrc or /.hg/hgrc.
In particular, I have not edited .hg/hgrc (except defpath).

This seems to be the normal behavior of hg, at least for OpenJDK.
For example, [1] moves BaseFileManager (and makes some changes to it), but the changeset removes the old file and adds a new one.


Yasumasa


[1] http://hg.openjdk.java.net/jdk10/master/rev/8f8e54a1fa20


On 2019/11/05 13:50, Chris Plummer wrote:
> On 11/4/19 8:47 PM, Chris Plummer wrote:
>> On 11/4/19 5:19 PM, Yasumasa Suenaga wrote:
>>> Hi Chris,
>>>
>>> On 2019/11/05 2:28, Chris Plummer wrote:
>>>> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 2019/11/02 7:07, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I can't comment on the build changes. I don't know how the build works well enough.
>>>>>
>>>>> I ensured it on submit repo (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009).
>>>>> This change affects for Linux only. So I changed toolchain if SA would be built for Linux only.
>>>>>
>>>>>
>>>>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a new file.
>>>>> >>>>> I uploaded it: >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch >>>>> >>>>> If we can use Git, it can show "rename + diffs"... >>>>> I tried to use "hg rename" and "hg move", but the result did not change. >>>> Hi Yasumasa, >>>> >>>> I just did an "hg mv" and then modified the file, and webrev did what it is suppose to do and showed just the diffs, but also indicated that the file was moved. Which version of webrev are you using? >>> >>> I'm using wevrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on Ubuntu 18.04 LTS. >>> >>>> What do "hg diff" and "hg status" show? >>> >>> For example, rename Makefile to Makefile.orig: >>> >>> ``` >>> $ hg mv Makefile Makefile.orig >>> $ hg status >>> A Makefile.orig >>> R Makefile >> This part looks correct. >>> $ hg diff >>> diff -r c41d1303a87c Makefile >>> --- a/Makefile? Mon Nov 04 13:02:40 2019 -0800 >>> +++ /dev/null?? Thu Jan 01 00:00:00 1970 +0000 >>> @@ -1,64 +0,0 @@ >>> -# >>> -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All rights reserved. >>> -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>> ?????? : >>> ???? (snip) >>> ``` >> This part is odd. Not sure why it says "diff -r". Mine looks like: >> >> $ hg status >> A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >> R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >> >> $ hg diff >> diff --git a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >> rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >> rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >> --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >> +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >> @@ -40,7 +40,7 @@ >> >> >> ..and the actual diff follows the above, which is just a one line edit. Do you have an override of "diff" in your .hgrc? 
> I should have mention what's in my .hgrc. because I just noticed something: > > diff = > [diff] > git=1 > nodates=1 > > Don't ask me why I have this. I cloned someones .hgrc when openjdk first moved to Mercurial, and have never touched this part of it. At the very least the "git=1" would explain why my diff output says "diff --git ...". > > Chris >> >> BTW, my Mercurial is 3.6. >> >> Chris >> >> >>> >>> It seems to be a problem in hg instead of webrev. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>>> The rest of the changes look fine. >>>>> >>>>> Thanks! >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote: >>>>>>> (Changed subject to review request) >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on submit repo. >>>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>>>>>> >>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>>>>>> >>>>>>> Could you review it? >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 2019/11/01 7:55, David Holmes wrote: >>>>>>>>> Hi Yasumasa, >>>>>>>>> >>>>>>>>> New build dependencies cannot be added lightly. This impacts everyone who maintains build/test farms. >>>>>>>> >>>>>>>> Ok, thanks for telling it. >>>>>>>> >>>>>>>>> We already use the C++ demangling capabilities in the VM. Is there some way to export that for use by libsaproc ? >>>>>>>>> >>>>>>>>> Otherwise using C++ demangle may still be the better choice given we already have it as a dependency. >>>>>>>> >>>>>>>> I found abi::__cxa_demangle() is used in ElfDecoder::demangle() at decoder_linux.cpp . >>>>>>>> It is similar with my original proposal. 
>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>> >>>>>>>> I agree with David to use C++ demangle way. >>>>>>>> However we need to choice the fix from following: >>>>>>>> >>>>>>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>>>>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>>>>>> >>>>>>>> I've discussed with Chris about it in [1]. >>>>>>>> Option A might be large change. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>>>>>> >>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> Here's the failure during configure: >>>>>>>>>> >>>>>>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h usability... no >>>>>>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not find demangle.h! You might be able to fix this by running 'sudo yum install binutils-devel'. >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I filed this enhancement to JBS: >>>>>>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>>>>>> >>>>>>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>>>>>> >>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>>>>>> >>>>>>>>>>> According to the email from Mach 5, dependency errors were occurred in jib. >>>>>>>>>>> Can someone share the details? >>>>>>>>>>> I'm not familiar in jib, so I want help. 
>>>>>>>>>>> >>>>>>>>>>> mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>>>>>> You can change the configure script. I don't know if there's any concerns with using libiberty.a. That's possibly a legal question (GNU GPL). You might want to ask that on jdk-dev and/or build-dev. >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for quick reply! >>>>>>>>>>>>> >>>>>>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have to convert a lot of JNI calls to C++ style. >>>>>>>>>>>>> For example: >>>>>>>>>>>>> >>>>>>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>>>>>> ????? to >>>>>>>>>>>>> ? env->FindClass("java/lang/String") >>>>>>>>>>>>> >>>>>>>>>>>>> Can it be accepted? >>>>>>>>>>>>> >>>>>>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we need to link libiberty.a which is provided by binutils. Thus I think we need to check libiberty.a in configure script. Is it ok? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I prefer to use cplus_demangle() if we can change configure script. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I don't have concerns with adding C++ source to SA, but in order to do so you've put the new native code in its own file rather than in LinuxDebuggerLocal.c. I'd like to see that resolved. So either convert LinuxDebuggerLocal.c to C++, or use cplus_demangle(). 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they were mangled as below: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 0x00007ff255a8fa4c _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread + 0x6ac >>>>>>>>>>>>>>> 0x00007ff255a8cc1d _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread + 0x33d >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We can demangle them via c++filt, but I think it is more convenience if jstack can show demangling symbols. >>>>>>>>>>>>>>> I think we can demangle in jstack with this patch. If it is accepted, I will file it to JBS and send review request. >>>>>>>>>>>>>>> What do you think? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac >>>>>>>>>>>>>>> 0x00007ff1aba1dc1d JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this patch adds C++ source to SA. >>>>>>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). >>>>>>>>>>>>>>> But this function is provided by libiberty.a, so we need to link it to libsaproc and need to check libiberty.a in configure script. 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>
>

From chris.plummer at oracle.com Tue Nov 5 07:06:42 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 4 Nov 2019 23:06:42 -0800
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <523a1023-7054-4c69-ffe4-1b945dddb49b@oss.nttdata.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> <523a1023-7054-4c69-ffe4-1b945dddb49b@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

How did you even find that changeset? It's not clear to me that it was actually a move + modification. It could have been an rm + add. I don't think the changeset URL gives you enough information. But here's one I did that had 4 moves. They do show up in the changeset as rm + add:

https://hg.openjdk.java.net/jdk/jdk/rev/a64caa5269cf

You can't tell from the changeset that files were moved, and 1 of them modified, but you can from the webrev:

http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/

Notice 3 of the moved files show up as new files with no diff, but they also say "0 lines changed", so that is why they had no diff. One file was both moved and had a change, so it shows up with a diff and "1 line changed".

It looks like when you move a file Mercurial actually copies it and deletes the original. See "hg log -v -C" output for example.
Notice the "copies" section when you use -C:

$ hg log -v -C test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java
changeset:   55952:a64caa5269cf
user:        cjplummer
date:        Fri Aug 09 11:27:08 2019 -0700
files:       test/hotspot/jtreg/TEST.groups test/hotspot/jtreg/resourcehogs/TEST.properties test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java
copies:      test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java (test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java (test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java (test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java (test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java)
description:
8227645: Some tests in serviceability/sa run with fixed -Xmx values and risk running out of memory
Summary: move tests to seprate directory
Reviewed-by: dtitov, jcbeyler, ctornqvi, sspitsyn

So maybe since mercurial is just making a copy of the file when you move it, newer versions don't even bother trying to make the local edits appear to be a "move + mod" anymore.
Chris On 11/4/19 9:26 PM, Yasumasa Suenaga wrote: > Hi Chris, > > I do not have [diff] section both ~/.hgrc and /.hg/hgrc. > In particular I've not edited .hg/hgrc (except defpath). > > It seems to be nature at least hg of OpenJDK. > For example, [1] moves (and make some changes) BaseFileManager, but > the changeset removes old file and adds new one. > > > Yasumasa > > > [1] http://hg.openjdk.java.net/jdk10/master/rev/8f8e54a1fa20 > > > On 2019/11/05 13:50, Chris Plummer wrote: >> On 11/4/19 8:47 PM, Chris Plummer wrote: >>> On 11/4/19 5:19 PM, Yasumasa Suenaga wrote: >>>> Hi Chris, >>>> >>>> On 2019/11/05 2:28, Chris Plummer wrote: >>>>> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 2019/11/02 7:07, Chris Plummer wrote: >>>>>>> Hi? Yasumasa, >>>>>>> >>>>>>> I can't comment on the build changes. I don't know how the build >>>>>>> works well enough. >>>>>> >>>>>> I ensured it on submit repo >>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009). >>>>>> This change affects for Linux only. So I changed toolchain if SA >>>>>> would be built for Linux only. >>>>>> >>>>>> >>>>>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not >>>>>>> as a new file. >>>>>> >>>>>> I uploaded it: >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch >>>>>> >>>>>> >>>>>> If we can use Git, it can show "rename + diffs"... >>>>>> I tried to use "hg rename" and "hg move", but the result did not >>>>>> change. >>>>> Hi Yasumasa, >>>>> >>>>> I just did an "hg mv" and then modified the file, and webrev did >>>>> what it is suppose to do and showed just the diffs, but also >>>>> indicated that the file was moved. Which version of webrev are you >>>>> using? >>>> >>>> I'm using wevrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on >>>> Ubuntu 18.04 LTS. >>>> >>>>> What do "hg diff" and "hg status" show? 
>>>> >>>> For example, rename Makefile to Makefile.orig: >>>> >>>> ``` >>>> $ hg mv Makefile Makefile.orig >>>> $ hg status >>>> A Makefile.orig >>>> R Makefile >>> This part looks correct. >>>> $ hg diff >>>> diff -r c41d1303a87c Makefile >>>> --- a/Makefile? Mon Nov 04 13:02:40 2019 -0800 >>>> +++ /dev/null?? Thu Jan 01 00:00:00 1970 +0000 >>>> @@ -1,64 +0,0 @@ >>>> -# >>>> -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All >>>> rights reserved. >>>> -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>>> ?????? : >>>> ???? (snip) >>>> ``` >>> This part is odd. Not sure why it says "diff -r". Mine looks like: >>> >>> $ hg status >>> A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>> R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>> >>> $ hg diff >>> diff --git >>> a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>> b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>> rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>> rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>> --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>> +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>> @@ -40,7 +40,7 @@ >>> >>> >>> ..and the actual diff follows the above, which is just a one line >>> edit. Do you have an override of "diff" in your .hgrc? >> I should have mention what's in my .hgrc. because I just noticed >> something: >> >> diff = >> [diff] >> git=1 >> nodates=1 >> >> Don't ask me why I have this. I cloned someones .hgrc when openjdk >> first moved to Mercurial, and have never touched this part of it. At >> the very least the "git=1" would explain why my diff output says >> "diff --git ...". >> >> Chris >>> >>> BTW, my Mercurial is 3.6. >>> >>> Chris >>> >>> >>>> >>>> It seems to be a problem in hg instead of webrev. 
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>> >>>>>>> The rest of the changes look fine. >>>>>> >>>>>> Thanks! >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote: >>>>>>>> (Changed subject to review request) >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine >>>>>>>> on submit repo. >>>>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>>>>>>> >>>>>>>> Could you review it? >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> On 2019/11/01 7:55, David Holmes wrote: >>>>>>>>>> Hi Yasumasa, >>>>>>>>>> >>>>>>>>>> New build dependencies cannot be added lightly. This impacts >>>>>>>>>> everyone who maintains build/test farms. >>>>>>>>> >>>>>>>>> Ok, thanks for telling it. >>>>>>>>> >>>>>>>>>> We already use the C++ demangling capabilities in the VM. Is >>>>>>>>>> there some way to export that for use by libsaproc ? >>>>>>>>>> >>>>>>>>>> Otherwise using C++ demangle may still be the better choice >>>>>>>>>> given we already have it as a dependency. >>>>>>>>> >>>>>>>>> I found abi::__cxa_demangle() is used in >>>>>>>>> ElfDecoder::demangle() at decoder_linux.cpp . >>>>>>>>> It is similar with my original proposal. >>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>> >>>>>>>>> I agree with David to use C++ demangle way. >>>>>>>>> However we need to choice the fix from following: >>>>>>>>> >>>>>>>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>>>>>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>>>>>>> >>>>>>>>> I've discussed with Chris about it in [1]. >>>>>>>>> Option A might be large change. 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa

From suenaga at oss.nttdata.com Tue Nov 5 07:47:39 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 5 Nov 2019 16:47:39 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> <523a1023-7054-4c69-ffe4-1b945dddb49b@oss.nttdata.com>
Message-ID:

Hi Chris,

It works well when I added "git=1" to ~/.hgrc.
"git=1" seems to be the most important setting for webrev.

I added it to all my machines :)


Yasumasa


On 2019/11/05 16:06, Chris Plummer wrote:
> Hi Yasumasa,
>
> How did you even find that changeset? It's not clear to me that it was actually a move + modification. It could have been an rm + add. I don't think the changeset URL gives you enough information. But here's one I did that had 4 moves. They do show up in the changeset as rm + add:
>
> https://hg.openjdk.java.net/jdk/jdk/rev/a64caa5269cf
>
> You can't tell from the changeset that files were moved, and 1 of them modified, but you can from the webrev:
>
> http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/
>
> Notice 3 of the moved files show up as new files with no diff, but they also say "0 lines changed", so that is why they had no diff. One file was both moved and had a change, so it shows up with a diff and "1 line changed".
> It looks like when you move a file mercurial actually copies it and deletes the original. See "hg log -v -C" output for example. Notice the "copies" section when you use -C:
>
> $ hg log -v -C test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java
> changeset:   55952:a64caa5269cf
> user:        cjplummer
> date:        Fri Aug 09 11:27:08 2019 -0700
> files:       test/hotspot/jtreg/TEST.groups test/hotspot/jtreg/resourcehogs/TEST.properties test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java
> copies:      test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java (test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java (test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java (test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java) test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java (test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java)
> description:
> 8227645: Some tests in serviceability/sa run with fixed -Xmx values and risk running out of memory
> Summary: move tests to seprate directory
> Reviewed-by: dtitov, jcbeyler, ctornqvi, sspitsyn
>
> So maybe since mercurial is just making a
> copy of the file when you move it, newer versions don't even bother trying to make the local edits appear to be a "move + mod" anymore.
>
> Chris
>
> On 11/4/19 9:26 PM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> I do not have a [diff] section in either ~/.hgrc or /.hg/hgrc.
>> In particular, I've not edited .hg/hgrc (except defpath).
>>
>> This seems to be the normal behavior, at least for hg in OpenJDK.
>> For example, [1] moves (and makes some changes to) BaseFileManager, but the changeset removes the old file and adds a new one.
>>
>>
>> Yasumasa
>>
>>
>> [1] http://hg.openjdk.java.net/jdk10/master/rev/8f8e54a1fa20
>>
>>
>> On 2019/11/05 13:50, Chris Plummer wrote:
>>> On 11/4/19 8:47 PM, Chris Plummer wrote:
>>>> On 11/4/19 5:19 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 2019/11/05 2:28, Chris Plummer wrote:
>>>>>> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> On 2019/11/02 7:07, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I can't comment on the build changes. I don't know how the build works well enough.
>>>>>>>
>>>>>>> I verified it on the submit repo (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009).
>>>>>>> This change affects Linux only, so I changed the toolchain only when SA is built for Linux.
>>>>>>>
>>>>>>>
>>>>>>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not as a new file.
>>>>>>>
>>>>>>> I uploaded it:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch
>>>>>>>
>>>>>>> If we could use Git, it could show "rename + diffs"...
>>>>>>> I tried to use "hg rename" and "hg move", but the result did not change.
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> I just did an "hg mv" and then modified the file, and webrev did what it is supposed to do and showed just the diffs, but also indicated that the file was moved. Which version of webrev are you using?
>>>>>
>>>>> I'm using webrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 on Ubuntu 18.04 LTS.
>>>>>
>>>>>> What do "hg diff" and "hg status" show?
>>>>>
>>>>> For example, rename Makefile to Makefile.orig:
>>>>>
>>>>> ```
>>>>> $ hg mv Makefile Makefile.orig
>>>>> $ hg status
>>>>> A Makefile.orig
>>>>> R Makefile
>>>> This part looks correct.
>>>>> $ hg diff
>>>>> diff -r c41d1303a87c Makefile
>>>>> --- a/Makefile  Mon Nov 04 13:02:40 2019 -0800
>>>>> +++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
>>>>> @@ -1,64 +0,0 @@
>>>>> -#
>>>>> -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All rights reserved.
>>>>> -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>        :
>>>>>      (snip)
>>>>> ```
>>>> This part is odd. Not sure why it says "diff -r". Mine looks like:
>>>>
>>>> $ hg status
>>>> A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
>>>> R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
>>>>
>>>> $ hg diff
>>>> diff --git a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
>>>> rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
>>>> rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
>>>> --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java
>>>> +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java
>>>> @@ -40,7 +40,7 @@
>>>>
>>>>
>>>> ...and the actual diff follows the above, which is just a one-line edit. Do you have an override of "diff" in your .hgrc?
>>> I should have mentioned what's in my .hgrc, because I just noticed something:
>>>
>>> diff =
>>> [diff]
>>> git=1
>>> nodates=1
>>>
>>> Don't ask me why I have this. I cloned someone's .hgrc when openjdk first moved to Mercurial, and have never touched this part of it. At the very least the "git=1" would explain why my diff output says "diff --git ...".
>>>
>>> Chris
>>>>
>>>> BTW, my Mercurial is 3.6.
>>>>
>>>> Chris
>>>>
>>>>
>>>>>
>>>>> It seems to be a problem in hg instead of webrev.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>>> The rest of the changes look fine.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >>> >>> > > From chris.plummer at oracle.com Tue Nov 5 19:45:08 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 5 Nov 2019 11:45:08 -0800 Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed In-Reply-To: References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <45d18362-ef97-3c76-17fb-3e3ab7f2745d@oracle.com> <90fef6a4-e607-068a-79bd-7b3ebfcc5139@oss.nttdata.com> <1e3b0d70-b9bd-d826-1aac-493821daf7f2@oracle.com> <17935396-c24b-a7d7-164b-af48a8190092@oss.nttdata.com> <90d7668e-72a4-c207-139a-055eb0d36f95@oracle.com> <523a1023-7054-4c69-ffe4-1b945dddb49b@oss.nttdata.com> Message-ID: <2995f984-d89b-ec9f-aad2-a716b0b8e320@oracle.com> That's good to know. thanks, Chris On 11/4/19 11:47 PM, Yasumasa Suenaga wrote: > Hi Chris, > > It works well when I added "git=1" to ~/.hgrc . > "git=1" seems to be the most important for webrev. > > I added it to all my machine :) > > > Yasumasa > > > On 2019/11/05 16:06, Chris Plummer wrote: >> Hi? Yasumasa, >> >> How did you even find that changeset? It's not clear to me that it >> was actually a move + modification. I could have been an rm + add. I >> don't think the changeset URL gives you enough information. But >> here's one I did that had a 4 moves. 
They do show up in the changeset >> rm + add: >> >> https://hg.openjdk.java.net/jdk/jdk/rev/a64caa5269cf >> >> You can't tell from the chagneset that files were moved, and 1 of >> them modified, but you can from the webrev: >> >> >> http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/ >> >> Notice 3 of the moved files show up as new files with no diff, but >> they also says "0 lines changed" so that is why they had no diff. One >> file was both moved and had a change, so it shows up with a diff and >> "1 line changed". It looks like when you move a file mercurial >> actually copies it and deletes the original. See "hg log -v -C" >> output for example. Notice the "copies" section when you use -C: >> >> $ hg log -v -C >> test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >> changeset:?? 55952:a64caa5269cf >> user:??????? cjplummer >> date:??????? Fri Aug 09 11:27:08 2019 -0700 >> files:?????? test/hotspot/jtreg/TEST.groups >> test/hotspot/jtreg/resourcehogs/TEST.properties >> test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >> test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java >> test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java >> test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java >> test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >> test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java >> test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java >> test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java >> copies: >> test/hotspot/jtreg/resourcehogs/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java >> (test/hotspot/jtreg/serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java) >> test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeArray.java >> 
(test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeArray.java) >> test/hotspot/jtreg/resourcehogs/serviceability/sa/LingeredAppWithLargeStringArray.java >> (test/hotspot/jtreg/serviceability/sa/LingeredAppWithLargeStringArray.java) >> test/hotspot/jtreg/resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java >> (test/hotspot/jtreg/serviceability/sa/TestHeapDumpForLargeArray.java) >> description: >> 8227645: Some tests in serviceability/sa run with fixed -Xmx values >> and risk running out of memory >> Summary: move tests to seprate directory >> Reviewed-by: dtitov, jcbeyler, ctornqvi, sspitsyn >> >> So maybe since mercurial is just making a copy of the file when you >> move it, newer versions don't even bother trying to make the local >> edits appear to be a "move + mod" anymore. >> >> Chris >> >> On 11/4/19 9:26 PM, Yasumasa Suenaga wrote: >>> Hi Chris, >>> >>> I do not have [diff] section both ~/.hgrc and /.hg/hgrc. >>> In particular I've not edited .hg/hgrc (except defpath). >>> >>> It seems to be nature at least hg of OpenJDK. >>> For example, [1] moves (and make some changes) BaseFileManager, but >>> the changeset removes old file and adds new one. >>> >>> >>> Yasumasa >>> >>> >>> [1] http://hg.openjdk.java.net/jdk10/master/rev/8f8e54a1fa20 >>> >>> >>> On 2019/11/05 13:50, Chris Plummer wrote: >>>> On 11/4/19 8:47 PM, Chris Plummer wrote: >>>>> On 11/4/19 5:19 PM, Yasumasa Suenaga wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 2019/11/05 2:28, Chris Plummer wrote: >>>>>>> On 11/2/19 5:33 AM, Yasumasa Suenaga wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> On 2019/11/02 7:07, Chris Plummer wrote: >>>>>>>>> Hi? Yasumasa, >>>>>>>>> >>>>>>>>> I can't comment on the build changes. I don't know how the >>>>>>>>> build works well enough. >>>>>>>> >>>>>>>> I ensured it on submit repo >>>>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009). >>>>>>>> This change affects for Linux only. So I changed toolchain if >>>>>>>> SA would be built for Linux only. 
>>>>>>>> >>>>>>>> >>>>>>>>> LinuxDebuggerLocal.cpp should show up as a rename + diffs, not >>>>>>>>> as a new file. >>>>>>>> >>>>>>>> I uploaded it: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/LinuxDebuggerLocal.patch >>>>>>>> >>>>>>>> >>>>>>>> If we can use Git, it can show "rename + diffs"... >>>>>>>> I tried to use "hg rename" and "hg move", but the result did >>>>>>>> not change. >>>>>>> Hi Yasumasa, >>>>>>> >>>>>>> I just did an "hg mv" and then modified the file, and webrev did >>>>>>> what it is suppose to do and showed just the diffs, but also >>>>>>> indicated that the file was moved. Which version of webrev are >>>>>>> you using? >>>>>> >>>>>> I'm using wevrev changeset: 24:8cd091802cd0 with Mercurial 4.5.3 >>>>>> on Ubuntu 18.04 LTS. >>>>>> >>>>>>> What do "hg diff" and "hg status" show? >>>>>> >>>>>> For example, rename Makefile to Makefile.orig: >>>>>> >>>>>> ``` >>>>>> $ hg mv Makefile Makefile.orig >>>>>> $ hg status >>>>>> A Makefile.orig >>>>>> R Makefile >>>>> This part looks correct. >>>>>> $ hg diff >>>>>> diff -r c41d1303a87c Makefile >>>>>> --- a/Makefile? Mon Nov 04 13:02:40 2019 -0800 >>>>>> +++ /dev/null?? Thu Jan 01 00:00:00 1970 +0000 >>>>>> @@ -1,64 +0,0 @@ >>>>>> -# >>>>>> -# Copyright (c) 2012, 2015, Oracle and/or its affiliates. All >>>>>> rights reserved. >>>>>> -# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>>>>> ?????? : >>>>>> ???? (snip) >>>>>> ``` >>>>> This part is odd. Not sure why it says "diff -r". 
Mine looks like: >>>>> >>>>> $ hg status >>>>> A test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>>>> R test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>>>> >>>>> $ hg diff >>>>> diff --git >>>>> a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>>>> b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>>>> rename from test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>>>> rename to test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>>>> --- a/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats.java >>>>> +++ b/test/hotspot/jtreg/serviceability/sa/CDSJMapClstats2.java >>>>> @@ -40,7 +40,7 @@ >>>>> >>>>> >>>>> ..and the actual diff follows the above, which is just a one line >>>>> edit. Do you have an override of "diff" in your .hgrc? >>>> I should have mention what's in my .hgrc. because I just noticed >>>> something: >>>> >>>> diff = >>>> [diff] >>>> git=1 >>>> nodates=1 >>>> >>>> Don't ask me why I have this. I cloned someones .hgrc when openjdk >>>> first moved to Mercurial, and have never touched this part of it. >>>> At the very least the "git=1" would explain why my diff output says >>>> "diff --git ...". >>>> >>>> Chris >>>>> >>>>> BTW, my Mercurial is 3.6. >>>>> >>>>> Chris >>>>> >>>>> >>>>>> >>>>>> It seems to be a problem in hg instead of webrev. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>>> >>>>>>>>> The rest of the changes look fine. >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> >>>>>>>> Yasumasa >>>>>>>> >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 11/1/19 1:56 AM, Yasumasa Suenaga wrote: >>>>>>>>>> (Changed subject to review request) >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I converted LinuxDebuggerLocal.c to C++ code, and it works >>>>>>>>>> fine on submit repo. 
>>>>>>>>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009) >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/ >>>>>>>>>> >>>>>>>>>> Could you review it? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Yasumasa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 2019/11/01 8:54, Yasumasa Suenaga wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> On 2019/11/01 7:55, David Holmes wrote: >>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>> >>>>>>>>>>>> New build dependencies cannot be added lightly. This >>>>>>>>>>>> impacts everyone who maintains build/test farms. >>>>>>>>>>> >>>>>>>>>>> Ok, thanks for telling it. >>>>>>>>>>> >>>>>>>>>>>> We already use the C++ demangling capabilities in the VM. >>>>>>>>>>>> Is there some way to export that for use by libsaproc ? >>>>>>>>>>>> >>>>>>>>>>>> Otherwise using C++ demangle may still be the better choice >>>>>>>>>>>> given we already have it as a dependency. >>>>>>>>>>> >>>>>>>>>>> I found abi::__cxa_demangle() is used in >>>>>>>>>>> ElfDecoder::demangle() at decoder_linux.cpp . >>>>>>>>>>> It is similar with my original proposal. >>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>> >>>>>>>>>>> I agree with David to use C++ demangle way. >>>>>>>>>>> However we need to choice the fix from following: >>>>>>>>>>> >>>>>>>>>>> ?? A. Convert LinuxDebuggerLocal.c to C++ code >>>>>>>>>>> ?? B. Add C++ code for libsaproc.so to demangle symbols. >>>>>>>>>>> >>>>>>>>>>> I've discussed with Chris about it in [1]. >>>>>>>>>>> Option A might be large change. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Yasumasa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> [1] >>>>>>>>>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-October/029716.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> On 1/11/2019 12:58 am, Chris Plummer wrote: >>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>> >>>>>>>>>>>>> Here's the failure during configure: >>>>>>>>>>>>> >>>>>>>>>>>>> [2019-10-31T06:07:45,131Z] checking demangle.h >>>>>>>>>>>>> usability... no >>>>>>>>>>>>> [2019-10-31T06:07:45,150Z] checking demangle.h presence... no >>>>>>>>>>>>> [2019-10-31T06:07:45,150Z] checking for demangle.h... no >>>>>>>>>>>>> [2019-10-31T06:07:45,151Z] configure: error: Could not >>>>>>>>>>>>> find demangle.h! You might be able to fix this by running >>>>>>>>>>>>> 'sudo yum install binutils-devel'. >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 10/31/19 1:08 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I filed this enhancement to JBS: >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8233285 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also I pushed the changes to submit repo, but it was failed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/bfbc49233c26 >>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/430e4f65ef25 >>>>>>>>>>>>>> >>>>>>>>>>>>>> According to the email from Mach 5, dependency errors >>>>>>>>>>>>>> were occurred in jib. >>>>>>>>>>>>>> Can someone share the details? >>>>>>>>>>>>>> I'm not familiar in jib, so I want help. >>>>>>>>>>>>>> >>>>>>>>>>>>>> mach5-one-ysuenaga-JDK-8233285-20191031-0606-6301426 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2019/10/31 11:23, Chris Plummer wrote: >>>>>>>>>>>>>>> You can change the configure script. 
I don't know if >>>>>>>>>>>>>>> there's any concerns with using libiberty.a. That's >>>>>>>>>>>>>>> possibly a legal question (GNU GPL). You might want to >>>>>>>>>>>>>>> ask that on jdk-dev and/or build-dev. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 10/30/19 7:14 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for quick reply! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If we convert LinuxDebuggerLocal.c to C++ code, we have >>>>>>>>>>>>>>>> to convert a lot of JNI calls to C++ style. >>>>>>>>>>>>>>>> For example: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ? (*env)->FindClass(env, "java/lang/String") >>>>>>>>>>>>>>>> ????? to >>>>>>>>>>>>>>>> env->FindClass("java/lang/String") >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Can it be accepted? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> OTOH I said in my email, to use cplus_demangle(), we >>>>>>>>>>>>>>>> need to link libiberty.a which is provided by binutils. >>>>>>>>>>>>>>>> Thus I think we need to check libiberty.a in configure >>>>>>>>>>>>>>>> script. Is it ok? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I prefer to use cplus_demangle() if we can change >>>>>>>>>>>>>>>> configure script. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2019/10/31 11:03, Chris Plummer wrote: >>>>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I don't have concerns with adding C++ source to SA, >>>>>>>>>>>>>>>>> but in order to do so you've put the new native code >>>>>>>>>>>>>>>>> in its own file rather than in LinuxDebuggerLocal.c. >>>>>>>>>>>>>>>>> I'd like to see that resolved. So either convert >>>>>>>>>>>>>>>>> LinuxDebuggerLocal.c to C++, or use cplus_demangle(). 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 10/30/19 6:54 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I saw C++ frames in `jhsdb jstack --mixed`, and they >>>>>>>>>>>>>>>>>> were mangled as below: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 0x00007ff255a8fa4c >>>>>>>>>>>>>>>>>> _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP6Thread >>>>>>>>>>>>>>>>>> + 0x6ac >>>>>>>>>>>>>>>>>> 0x00007ff255a8cc1d >>>>>>>>>>>>>>>>>> _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP6Thread >>>>>>>>>>>>>>>>>> + 0x33d >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> We can demangle them via c++filt, but I think it is >>>>>>>>>>>>>>>>>> more convenience if jstack can show demangling symbols. >>>>>>>>>>>>>>>>>> I think we can demangle in jstack with this patch. If >>>>>>>>>>>>>>>>>> it is accepted, I will file it to JBS and send review >>>>>>>>>>>>>>>>>> request. >>>>>>>>>>>>>>>>>> What do you think? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/sa-demangle/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> We can get the stack as below after applying this patch: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 0x00007ff1aba20a4c JavaCalls::call_helper(JavaValue*, >>>>>>>>>>>>>>>>>> methodHandle const&, JavaCallArguments*, Thread*) + >>>>>>>>>>>>>>>>>> 0x6ac >>>>>>>>>>>>>>>>>> 0x00007ff1aba1dc1d >>>>>>>>>>>>>>>>>> JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, >>>>>>>>>>>>>>>>>> Symbol*, JavaCallArguments*, Thread*) + 0x33d >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I use abi::__cxa_demangle() for demangling, so this >>>>>>>>>>>>>>>>>> patch adds C++ source to SA. >>>>>>>>>>>>>>>>>> If it is not comfortable, we can use cplus_demangle(). 
From david.holmes at oracle.com Wed Nov 6 13:13:57 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 6 Nov 2019 23:13:57 +1000
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com>
Message-ID: <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>

On 4/11/2019 8:27 pm, Magnus Ihse Bursie wrote:
> On 2019-11-02 13:43, Daniel D. Daugherty wrote:
>> Since this review contains build changes, I've added build-dev at ...
> Thanks Dan for noticing this and cc:ing us.
>
> Yasumasa: build changes look fine. Thanks.

This change broke all cross-compilation.

David

> /Magnus
>>
>> Dan
>>
>> On 11/1/19 4:56 AM, Yasumasa Suenaga wrote:
>>> (Changed subject to review request)
>>>
>>> Hi,
>>>
>>> I converted LinuxDebuggerLocal.c to C++ code, and it works fine on
>>> submit repo.
>>> (mach5-one-ysuenaga-JDK-8233285-1-20191101-0746-6336009)
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233285/webrev.00/
>>>
>>> Could you review it?
>>>
>>> Thanks,
>>>
>>> Yasumasa
From erik.joelsson at oracle.com Wed Nov 6 13:33:17 2019
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Wed, 6 Nov 2019 05:33:17 -0800
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
Message-ID: <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>

I looked closer at it now and the build change is not good. Any toolchain definition with BUILD in the name, like TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools that are run during the build. I believe the fix is to just remove the "BUILD_".

/Erik

On 2019-11-06 05:13, David Holmes wrote:
> On 4/11/2019 8:27 pm, Magnus Ihse Bursie wrote:
>> On 2019-11-02 13:43, Daniel D. Daugherty wrote:
>>> Since this review contains build changes, I've added build-dev at ...
>> Thanks Dan for noticing this and cc:ing us.
>>
>> Yasumasa: build changes look fine. Thanks.
>
> This change broke all cross-compilation.
>
> David
From suenaga at oss.nttdata.com Wed Nov 6 13:50:34 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 6 Nov 2019 22:50:34 +0900
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com> <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
Message-ID: <832479a6-dbb5-00ed-5ef6-18a1b4b71151@oss.nttdata.com>

Hi,

Thanks for telling it me.
I removed "BUILD_" from Makefile as Erik said, then it works fine on my Linux box:

```
diff -r a3b046720c3b make/lib/Lib-jdk.hotspot.agent.gmk
--- a/make/lib/Lib-jdk.hotspot.agent.gmk Wed Nov 06 21:49:30 2019 +0900
+++ b/make/lib/Lib-jdk.hotspot.agent.gmk Wed Nov 06 22:48:00 2019 +0900
@@ -55,7 +55,7 @@

 SA_TOOLCHAIN := $(TOOLCHAIN_DEFAULT)
 ifeq ($(call isTargetOs, linux), true)
-  SA_TOOLCHAIN := TOOLCHAIN_BUILD_LINK_CXX
+  SA_TOOLCHAIN := TOOLCHAIN_LINK_CXX
 endif

 ################################################################################
```

Could you file it to JBS? I don't know details.
If you assign it to me, I will send review request.

Thanks,

Yasumasa

On 2019/11/06 22:33, Erik Joelsson wrote:
> I looked closer at it now and the build change is not good. Any toolchain definition with BUILD in the name, like TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools that are run during the build. I believe the fix is to just remove the "BUILD_".
>
> /Erik
From boris.ulasevich at bell-sw.com Wed Nov 6 14:31:13 2019
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 6 Nov 2019 17:31:13 +0300
Subject: RFR: 8233600: cross-builds fails after JDK-8233285
In-Reply-To: <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com> <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com> <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com> <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com> <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com> <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
Message-ID:

Hi,

Indeed, the fix is quite evident. I checked it works for arm32/aarch
cross-compilation builds.

http://bugs.openjdk.java.net/browse/JDK-8233600
http://cr.openjdk.java.net/~bulasevich/8233600/webrev.00

regards,
Boris

On 06.11.2019 16:33, Erik Joelsson wrote:
> I looked closer at it now and the build change is not good. Any
> toolchain definition with BUILD in the name, like
> TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools
> that are run during the build. I believe the fix is to just remove the
> "BUILD_".
>
> /Erik
From shade at redhat.com Wed Nov 6 16:06:20 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 6 Nov 2019 17:06:20 +0100
Subject: RFR: 8233600: cross-builds fails after JDK-8233285
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com>
 <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
 <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
Message-ID: <7af7015a-c962-7643-592a-e441e67498bc@redhat.com>

On 11/6/19 3:31 PM, Boris Ulasevich wrote:
> http://bugs.openjdk.java.net/browse/JDK-8233600
> http://cr.openjdk.java.net/~bulasevich/8233600/webrev.00

This looks good to me. Fixes aarch64 cross-compilation for me.

--
Thanks,
-Aleksey

From erik.joelsson at oracle.com Wed Nov 6 16:18:00 2019
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Wed, 6 Nov 2019 08:18:00 -0800
Subject: RFR: 8233600: cross-builds fails after JDK-8233285
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com>
 <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
 <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
Message-ID:

Looks good! Verified the same patch with all our available cross compile builds.

/Erik

On 2019-11-06 06:31, Boris Ulasevich wrote:
> Hi,
>
> Indeed, the fix is quite evident. I checked it works for arm32/aarch
> cross-compilation builds.
>
> http://bugs.openjdk.java.net/browse/JDK-8233600
> http://cr.openjdk.java.net/~bulasevich/8233600/webrev.00
>
> regards,
> Boris
>
> On 06.11.2019 16:33, Erik Joelsson wrote:
>> I looked closer at it now and the build change is not good. Any
>> toolchain definition with BUILD in the name, like
>> TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools
>> that are run during the build. I believe the fix is to just remove
>> the "BUILD_".
>>
>> /Erik
>>
>> On 2019-11-06 05:13, David Holmes wrote:
>>> On 4/11/2019 8:27 pm, Magnus Ihse Bursie wrote:
>>>> On 2019-11-02 13:43, Daniel D. Daugherty wrote:
>>>>> Since this review contains build changes, I've added build-dev at ...
>>>> Thanks Dan for noticing this and cc:ing us.
>>>>
>>>> Yasumasa: build changes look fine. Thanks.
>>>
>>> This change broke all cross-compilation.
>>>
>>> David
>>>
>>>> /Magnus
>>>>>
>>>>> Dan
From boris.ulasevich at bell-sw.com Wed Nov 6 16:27:57 2019
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 6 Nov 2019 19:27:57 +0300
Subject: RFR: 8233600: cross-builds fails after JDK-8233285
In-Reply-To:
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com>
 <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
 <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
Message-ID: <6a7f785c-d9cb-0658-b4b6-09f126f21303@bell-sw.com>

Thank you!

On 06.11.2019 19:18, Erik Joelsson wrote:
> Looks good! Verified the same patch with all our available cross compile
> builds.
>
> /Erik
>
> On 2019-11-06 06:31, Boris Ulasevich wrote:
>> Hi,
>>
>> Indeed, the fix is quite evident. I checked it works for arm32/aarch
>> cross-compilation builds.
>>
>> http://bugs.openjdk.java.net/browse/JDK-8233600
>> http://cr.openjdk.java.net/~bulasevich/8233600/webrev.00
>>
>> regards,
>> Boris
>>
>> On 06.11.2019 16:33, Erik Joelsson wrote:
>>> I looked closer at it now and the build change is not good. Any
>>> toolchain definition with BUILD in the name, like
>>> TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools
>>> that are run during the build. I believe the fix is to just remove
>>> the "BUILD_".
>>>
>>> /Erik
>>>
>>> On 2019-11-06 05:13, David Holmes wrote:
>>>> On 4/11/2019 8:27 pm, Magnus Ihse Bursie wrote:
>>>>> On 2019-11-02 13:43, Daniel D. Daugherty wrote:
>>>>>> Since this review contains build changes, I've added build-dev at ...
>>>>> Thanks Dan for noticing this and cc:ing us.
>>>>>
>>>>> Yasumasa: build changes look fine. Thanks.
>>>>
>>>> This change broke all cross-compilation.
>>>>
>>>> David
>>>>
>>>>> /Magnus
>>>>>>
>>>>>> Dan
From david.holmes at oracle.com Thu Nov 7 00:04:24 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 7 Nov 2019 10:04:24 +1000
Subject: RFR: 8233285: Demangling C++ symbols in jhsdb jstack --mixed
In-Reply-To: <832479a6-dbb5-00ed-5ef6-18a1b4b71151@oss.nttdata.com>
References: <14a3ed2c-ccd8-83bb-d4f1-d3a319fa2b8a@oracle.com>
 <81e1e086-6b54-fe6c-f78b-9b2b543b9dcb@oracle.com>
 <86dc525b-7f4a-2cbd-5893-37fb8ebe4c59@oss.nttdata.com>
 <644a2d61-ac4b-0106-76fa-8e63d8412a89@oracle.com>
 <1fd10248-13e9-16ed-6730-e7d530bbd6af@oracle.com>
 <2e06b688-d3bb-9cb7-e221-62c3462a308f@oracle.com>
 <832479a6-dbb5-00ed-5ef6-18a1b4b71151@oss.nttdata.com>
Message-ID: <2b994a58-56d5-5dfd-05f8-3c4800863632@oracle.com>

Just for the record this was fixed by "8233600: cross-builds fails after JDK-8233285"

David

On 6/11/2019 11:50 pm, Yasumasa Suenaga wrote:
> Hi,
>
> Thanks for telling it me.
> I removed "BUILD_" from Makefile as Erik said, then it works fine on my
> Linux box:
>
> ```
> diff -r a3b046720c3b make/lib/Lib-jdk.hotspot.agent.gmk
> --- a/make/lib/Lib-jdk.hotspot.agent.gmk        Wed Nov 06 21:49:30 2019 +0900
> +++ b/make/lib/Lib-jdk.hotspot.agent.gmk        Wed Nov 06 22:48:00 2019 +0900
> @@ -55,7 +55,7 @@
>
>  SA_TOOLCHAIN := $(TOOLCHAIN_DEFAULT)
>  ifeq ($(call isTargetOs, linux), true)
> -  SA_TOOLCHAIN := TOOLCHAIN_BUILD_LINK_CXX
> +  SA_TOOLCHAIN := TOOLCHAIN_LINK_CXX
>  endif
>
>  ################################################################################
> ```
>
> Could you file it to JBS? I don't know details.
> If you assign it to me, I will send review request.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/06 22:33, Erik Joelsson wrote:
>> I looked closer at it now and the build change is not good. Any
>> toolchain definition with BUILD in the name, like
>> TOOLCHAIN_BUILD_LINK_CXX, is only meant to be used for building tools
>> that are run during the build. I believe the fix is to just remove the
>> "BUILD_".
>>
>> /Erik
>>
>> On 2019-11-06 05:13, David Holmes wrote:
>>> On 4/11/2019 8:27 pm, Magnus Ihse Bursie wrote:
>>>> On 2019-11-02 13:43, Daniel D. Daugherty wrote:
>>>>> Since this review contains build changes, I've added build-dev at ...
>>>> Thanks Dan for noticing this and cc:ing us.
>>>>
>>>> Yasumasa: build changes look fine. Thanks.
>>>
>>> This change broke all cross-compilation.
>>>
>>> David
>>>
>>>> /Magnus
>>>>>
>>>>> Dan
From david.holmes at oracle.com Thu Nov 7 10:45:01 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 7 Nov 2019 20:45:01 +1000
Subject: RFR: 8233784: ProblemList failing JVMTI scenario tests
Message-ID: <9cb34006-e88f-c0a9-776d-fb7c6791a4dd@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8233784

Patch below.

Getting the fix tested will take a little while and we are getting
numerous failures in our CI testing. I ran all the scenario tests
multiple times to try and find all that fail due to this problem. It may
not be exhaustive, so if needed I'll add more later.

Thanks,
David
-----

diff -r bb2a436e616c test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -206,4 +206,14 @@

 vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn001/forceEarlyReturn001.java 7199837 generic-all

-#############################################################################
+vmTestbase/nsk/jvmti/scenarios/allocation/AP01/ap01t001/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t002/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/allocation/AP12/ap12t001/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/capability/CM02/cm02t001/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t002/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t003/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t005/TestDescription.java 8233549 generic-all
+vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java 8233549 generic-all
+
+#############################################################################

From goetz.lindenmaier at sap.com Thu Nov 7 11:39:52 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 7 Nov 2019 11:39:52 +0000
Subject: RFR: 8233784: ProblemList failing JVMTI scenario tests
In-Reply-To: <9cb34006-e88f-c0a9-776d-fb7c6791a4dd@oracle.com>
References: <9cb34006-e88f-c0a9-776d-fb7c6791a4dd@oracle.com>
Message-ID:

Hi David,

we also see these failures in our CI. It is intermittent and on all platforms.
It happens since Nov 4.

We saw the following ones you already listed:
vmTestbase/nsk/jvmti/scenarios/allocation/AP12/ap12t001/TestDescription.java
vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t005/TestDescription.java

Further we saw this one, could you please add it, too?
vmTestbase/nsk/jvmti/scenarios/events/EM07/em07t002/TestDescription.java
It failed once with the same kind of message.

Besides that, the change looks good.

Further we saw these failing, but I'm not sure it's the same issue:
vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/TestDescription.java
vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t002/TestDescription.java
with this message:

CASE #3:
Allocating objects...
Start heap iteration thread and field modification loop
thread3 started.
- ap04t002.cpp, 346: Calling IterateOverObjectsReachableFromObject...
- ap04t002.cpp, 352: IterateOverObjectsReachableFromObject finished.
- ap04t002.cpp, 354: Iterations count: 247089
- ap04t002.cpp, 355: Modifications count: 14
- ap04t002.cpp, 358: Errors detected: 0
Wait for completion thread to finish
thread3 finished.
Cleaning tags and references to objects...
The following fake exception stacktrace is for failure analysis.
nsk.share.Fake_Exception_for_RULE_Creation: (jvmti_tools.cpp:683) error
    at nsk_lvcomplain(nsk_tools.cpp:172)
# ERROR: jvmti_tools.cpp, 683: error
# jvmti error: code=52, name=JVMTI_ERROR_INTERRUPT
CASE #3 finished.

CASE #4:
Allocating objects...
----------System.err:(18/4306)----------
java.lang.AssertionError: .../jvm_14/bin/java, -Xmx768m, -Djava.awt.headless=true, ... -agentlib:ap04t002=-waittime=5 -verbose, nsk.jvmti.scenarios.allocation.AP04.ap04t002] exit code is 52
    at ExecDriver.main(ExecDriver.java:137)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
    at java.base/java.lang.Thread.run(Thread.java:833)

> -----Original Message-----
> From: hotspot-runtime-dev On Behalf Of David Holmes
> Sent: Thursday, November 7, 2019 11:45
> To: serviceability-dev ; hotspot-runtime-dev at openjdk.java.net
> Subject: RFR: 8233784: ProblemList failing JVMTI scenario tests
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233784
>
> Patch below.
>
> Getting the fix tested will take a little while and we are getting
> numerous failures in our CI testing. I ran all the scenario tests
> multiple times to try and find all that fail due to this problem. It may
> not be exhaustive, so if needed I'll add more later.
>
> Thanks,
> David
> -----
>
> diff -r bb2a436e616c test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -206,4 +206,14 @@
>
>  vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn001/forceEarlyReturn001.java 7199837 generic-all
>
> -#############################################################################
> +vmTestbase/nsk/jvmti/scenarios/allocation/AP01/ap01t001/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t002/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/allocation/AP12/ap12t001/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/capability/CM02/cm02t001/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t002/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t003/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t005/TestDescription.java 8233549 generic-all
> +vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java 8233549 generic-all
> +
> +#############################################################################

From david.holmes at oracle.com Thu Nov 7 12:23:45 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 7 Nov 2019 22:23:45 +1000
Subject: RFR: 8233784: ProblemList failing JVMTI scenario tests
In-Reply-To:
References: <9cb34006-e88f-c0a9-776d-fb7c6791a4dd@oracle.com>
Message-ID: <17f834a6-eb52-f69f-c827-d5856280955e@oracle.com>

Hi Goetz,

Thanks for looking at this.

On 7/11/2019 9:39 pm, Lindenmaier, Goetz wrote:
> Hi David,
>
> we also see these failures in our CI. It is intermittent and on all platforms.
> It happens since Nov 4.
>
> We saw the following ones you already listed:
> vmTestbase/nsk/jvmti/scenarios/allocation/AP12/ap12t001/TestDescription.java
> vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t005/TestDescription.java
>
> Further we saw this one, could you please add it, too?
> vmTestbase/nsk/jvmti/scenarios/events/EM07/em07t002/TestDescription.java
> It failed once with the same kind of message.

My latest test run just saw that one fail too.

> Besides that, the change looks good.
>
> Further we saw these failing, but I'm not sure it's the same issue:
> vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/TestDescription.java
> vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t002/TestDescription.java
> with this message:

Yes this is the same issue. I've ensured both are in the list.

I now have 11 on the list, which leaves 151 that have not yet been seen
to fail. Any test that uses the AgentThread functionality and uses a
RawMonitorWait is potentially affected. Unfortunately there seems to be
no easy way to actually determine all the tests affected due to the use
of utility functions.

I'll push what I have now so that we can stem the failures.
Unfortunately I'm out of the office tomorrow morning.

Thanks,
David
-----

From goetz.lindenmaier at sap.com  Thu Nov  7 13:17:32 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 7 Nov 2019 13:17:32 +0000
Subject: RFR: 8233784: ProblemList failing JVMTI scenario tests
In-Reply-To: <17f834a6-eb52-f69f-c827-d5856280955e@oracle.com>
References: <9cb34006-e88f-c0a9-776d-fb7c6791a4dd@oracle.com>
 <17f834a6-eb52-f69f-c827-d5856280955e@oracle.com>
Message-ID:

Hi David,

thanks for adding our failures.

Best regards,
  Goetz.

From chris.plummer at oracle.com  Thu Nov  7 22:01:50 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 7 Nov 2019 14:01:50 -0800
Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
Message-ID:

Hi,

Please review the following fix for JDK-8231635:

https://bugs.openjdk.java.net/browse/JDK-8231635
http://cr.openjdk.java.net/~cjplummer/8231635/webrev.00/

I've tried to explain below to the best of my ability what is going
on, but keep in mind that I basically had no background in this area
before looking into this CR, so this is all new to me. Please feel
free to chime in with corrections to my explanation, or any additional
insight that might help to further understanding of this code.

When doing a thread stack dump, SA has to figure out the SP for the
current frame when it may not in fact be stored anywhere. So it goes
through a series of guesses, starting with the current value of SP.
See AMD64CurrentFrameGuess.run():

    Address sp  = context.getRegisterAsAddress(AMD64ThreadContext.RSP);

There are a number of checks done to see if this is the SP for the
actual current frame, one of the checks being (and kind of a last
resort) to follow the frame links and see if they eventually lead to
the first entry frame:

            while (frame != null) {
              if (frame.isEntryFrame() && frame.entryFrameIsFirst()) {
                 ...
                 return true;
              }
              frame = frame.sender(map);
            }

If this fails, there is an outer loop to try the next address:

        for (long offset = 0;
             offset < regionInBytesToSearch;
             offset += vm.getAddressSize()) {

Note that offset is added to the initial SP value that was fetched
from RSP. This approach is fraught with danger, because SP could be
incorrect, and you can easily follow a bad frame link to an invalid
address. So the body of this loop is in a try block that catches all
Exceptions, and simply retries with the next offset if one is caught.
Exceptions could be ones like UnalignedAddressException or
UnmappedAddressException.

The bug in question turns up with the following harmless looking line:

              frame = frame.sender(map);

This is fine if you know that "frame" is valid, but what if it is not
(which is very commonly the case)? The frame values (SP, FP, and PC)
in the returned frame could be just about anything, including being
the same as the previous frame. This is what will happen if the SP
stored in "frame" is the same as the SP that was used to initialize
"frame" in the first place. This can certainly happen when SP is not
valid to start with, and is indeed what caused this bug. The end
result is the inner while loop gets stuck in an infinite loop
traversing the same frame. So the fix is to add a check for this to
make sure to break out of the while loop if this happens.
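
For readers following along, here is a minimal, self-contained sketch of
the kind of guard described above. It is not the code from the webrev:
the class name, the method name, and the use of plain long values in
place of SA's Address and Frame types are all assumptions made for
illustration. The senderOf function stands in for frame.sender(map). The
idea shown is simply that the walk gives up as soon as a sender's SP is
not strictly above the SP it came from, so a self-referencing or cyclic
chain of bogus frames ends the walk instead of spinning.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical illustration only; none of these names exist in the SA sources.
final class MonotonicFrameWalk {
    // Stand-in for a null Address / "no sender frame".
    static final long NULL_SP = 0L;

    /**
     * Walks sender links from startSp and reports whether entrySp is reached.
     * senderOf maps a frame's SP to its sender's SP (an assumption replacing
     * frame.sender(map)). The walk stops when the sender's SP is less than or
     * equal to the current SP, compared as unsigned addresses.
     */
    static boolean reachesEntryFrame(long startSp, long entrySp, LongUnaryOperator senderOf) {
        long sp = startSp;
        while (sp != NULL_SP) {
            if (sp == entrySp) {
                return true;
            }
            long senderSp = senderOf.applyAsLong(sp);
            // A valid sender lives at a strictly higher address. Anything else
            // (same frame, an earlier frame, or null) means the chain is bogus.
            if (Long.compareUnsigned(senderSp, sp) <= 0) {
                return false;
            }
            sp = senderSp;
        }
        return false;
    }
}
```

Because each step must move to a strictly higher address, a walk over a
garbage frame chain is abandoned rather than repeated, and the caller
can go on to try the next candidate offset.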
Initially I did this with an Address.equal() call, and that seemed to
fix the problem, but then I realized it would be possible to traverse
through one or more sender frames and eventually end up returning to a
previously visited frame, which would still be an infinite loop. So I
decided on checking for Address.lessThanOrEqual() instead, since the
sender frame's SP should always be greater than the current frame's
(referred to as oldFrame) SP. As long as we always move in one direction
(towards a higher frame address), you can't have an infinite loop in
this code.

I also applied this fix to x86. Although the x86 change is not tested,
it is built (all platform support is always built with SA). The x86 and
amd64 versions are identical except for x86/amd64 references, so I
thought it best to go ahead and do the update to x86. I did not touch
ppc, but would be willing to update it if someone passes along a fix
that is tested.

One final bit of clarification. The bug synopsis mentions getting
stuck in BasicTypeDataBase.findDynamicTypeForAddress(). This turns out
to not actually be the case, but every stack trace I initially looked
at when I filed this CR was showing the thread being in this frame and
at the same line number. This appears to be the next available safepoint
where the thread can be suspended for stack dumping. When debugging
this some more and adding a lot of println() calls in a lot of
different locations, I started to see different frames in the
stacktrace, presumably because the println() calls were adding
additional safepoints.

thanks,

Chris

From ralf.schmelter at sap.com  Fri Nov  8 11:56:16 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 8 Nov 2019 11:56:16 +0000
Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap
Message-ID:

This change forwards the output from the HeapDumper.dump() command to an optional output stream supplied by the caller.
Until now the diagnostic command and the "dumpheap" command of the attach framework partly recreated the output by hand, but were missing some information.

Old output:
Heap dump file created

New output:
Dumping heap to test.hprof ...
Heap dump file created [9719330384 bytes in 27,759 secs]

In addition to providing this improved information, the change saves code too.

Bugreport: https://bugs.openjdk.java.net/browse/JDK-8233790
Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/

Best regards,
Ralf

From thomas.stuefe at gmail.com  Fri Nov  8 15:06:54 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 8 Nov 2019 16:06:54 +0100
Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap
In-Reply-To:
References:
Message-ID:

Hi Ralf,

this makes sense. Some small remarks:

---
heapDumper.hpp

   // dumps the heap to the specified file, returns 0 if success.
-  int dump(const char* path);
+  int dump(const char* path, outputStream* out = NULL);

Can you please add a comment about the new parameter? E.g. "optional
outputStream to which progress- and error messages will be written".

---
heapDumper.cpp

- in HeapDumper::dump(): while you are at it, could you please initialize
my_path to NULL at function start?

---

You can actually get rid of HeapDumper::error_as_C_string() altogether now.
The last remaining caller is jmm_DumpHeap0(). You could use a stringStream
to catch the output of HeapDumper::dump() and use that string to build the
error message for the IOException in case of an error. I leave that up to
you.

---

Did you test that no jtreg tests fall over the changed output? e.g.
test/jdk/sun/tools/jmap/BasicJMapTest.java, or whatever is under
hotspot/jtreg/serviceability/jcmd ?

Thanks, Thomas

From ralf.schmelter at sap.com  Fri Nov  8 15:44:53 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 8 Nov 2019 15:44:53 +0000
Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap
In-Reply-To:
References:
Message-ID:

Hi Thomas,

thanks for the review.

> Can you please add a comment about the new parameter? E.g. "optional outputStream to which progress- and error messages will be written".

Will do.

> - in HeapDumper::dump(): while you are on it could you please initialize my_path to NULL at function start?

Do you mean the HeapDumper::dump_heap() method? That seems unrelated to my change. And I would prefer to get a warning about a potentially uninitialized use.

> You can actually get rid of HeapDumper::error_as_C_string() altogether now.

I thought about it, but the message is more than just the error text (you would get the "Dumping heap to .." part) and spans multiple lines. That does not seem fitting for an exception message.

> Did you test that no jtreg tests fall over the changed output? e.g. test/jdk/sun/tools/jmap/BasicJMapTest.java, or whatever is under hotspot/jtreg/serviceability/jcmd

I've checked them. They work because they either just check that "Heap dump file created" is contained in the output, which is still the case, or don't check the output at all (the dcmd tests).
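
As a small illustration of why a containment check is unaffected by the
richer output, here is a minimal sketch in Java. The class and method
names are hypothetical and this is not the actual jtreg test code; the
two sample strings are the old and new outputs quoted in the original
review request.

```java
// Hypothetical illustration only; not taken from the jtreg tests.
final class HeapDumpOutputCheck {
    // The marker text that appears in both the old and the new output.
    static final String MARKER = "Heap dump file created";

    /** Returns true when the tool output contains the success marker anywhere. */
    static boolean reportsSuccess(String toolOutput) {
        return toolOutput.contains(MARKER);
    }
}
```

A check of this shape accepts both forms of the output, because the new
text only adds a leading progress line and a trailing size and time
suffix around the same marker.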

Best regards,
Ralf

From thomas.stuefe at gmail.com  Fri Nov  8 15:50:14 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 8 Nov 2019 16:50:14 +0100
Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap
In-Reply-To:
References:
Message-ID:

Hi Ralf,

On Fri, Nov 8, 2019 at 4:45 PM Schmelter, Ralf wrote:

> Hi Thomas,
>
> thanks for the review.
>
> > Can you please add a comment about the new parameter? E.g. "optional
> outputStream to which progress- and error messages will be written".
>
> Will do.
>
> > - in HeapDumper::dump(): while you are on it could you please initialize
> my_path to NULL at function start?
>
> Do you mean the HeapDumper::dump_heap() method? That seems unrelated to my
> change. And I would prefer to get a warning about a potentially
> uninitialized use.
>

Yes, it's unrelated, but since you are in the area... But okay, I leave it
up to you.

> > You can actually get rid of HeapDumper::error_as_C_string() altogether now.
>
> I thought about it, but the message is more than just the error text (you
> would get the "Dumping heap to .." part) and is multiple lines. This
> seems not fitting for an exception message.
>

Okay.

> > Did you test that no jtreg tests fall over the changed output? e.g.
> test/jdk/sun/tools/jmap/BasicJMapTest.java, or whatever is under
> hotspot/jtreg/serviceability/jcmd
>
> I've checked them. They work because they just check that "Heap dump file
> created" is contained in the output, which is still the case or don't check
> the output at all (the dcmd tests).
>

Okay.

> Best regards,
> Ralf
>

Okay. If you only add the comment I do not need to see a new webrev.

Cheers, Thomas

From chris.plummer at oracle.com  Fri Nov  8 21:27:12 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 8 Nov 2019 13:27:12 -0800
Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
In-Reply-To:
References:
Message-ID: <5ac21baa-e125-e0d1-7283-6359140ae6f4@oracle.com>

Ping!

From daniil.x.titov at oracle.com  Fri Nov  8 22:58:53 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 08 Nov 2019 14:58:53 -0800
Subject: RFR: 8233868: Unproblem list sun/tools/jstat/jstatClassloadOutput1.sh
Message-ID:

Please review the changeset below, which removes the test sun/tools/jstat/jstatClassloadOutput1.sh from test/jdk/ProblemList.txt.

diff -r f92ef5d182b5 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt  Fri Nov 08 11:41:17 2019 -0500
+++ b/test/jdk/ProblemList.txt  Fri Nov 08 22:37:11 2019 +0000
@@ -861,7 +861,6 @@

 # svc_tools

-sun/tools/jstat/jstatClassloadOutput1.sh                        8173942 generic-all
 sun/tools/jhsdb/BasicLauncherTest.java                          8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le
 sun/tools/jhsdb/HeapDumpTest.java                               8193639 solaris-all
 sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java              8230731,8001227 windows-all

This test was added to ProblemList.txt in [1], but the issue is no longer reproducible in JDK 14 and the test runs fine in Mach5.

Mach5 tests for tier1, tier2, and tier3 passed successfully.

[1] https://bugs.openjdk.java.net/browse/JDK-8173942

Thanks.
Daniil From alexey.menkov at oracle.com Fri Nov 8 23:22:53 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 8 Nov 2019 15:22:53 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" Message-ID: Hi all, Please review the fix for https://bugs.openjdk.java.net/browse/JDK-8215196 webrev: http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ Currently PopFrame is disabled with JVMCI by [1], so for testing I reverted the changes from [1]. [1] https://bugs.openjdk.java.net/browse/JDK-8218025 --alex From serguei.spitsyn at oracle.com Sat Nov 9 00:00:31 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 8 Nov 2019 16:00:31 -0800 Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress() In-Reply-To: References: Message-ID: <42ae6371-5450-9973-eead-b2fa277b13b5@oracle.com> Hi Chris, This seems to be a good fix to have in any case. This check and bail-out is the right thing to do and should not break anything. I understand this also fixes the test failures. I only had some experience a long time ago with the support of pstack and the DTrace jstack action implementation, which also does this kind of SP recovery because ebp can be used by the JIT compiler as a general-purpose register. There is no such problem on sparc. Thanks, Serguei On 11/7/19 14:01, Chris Plummer wrote: > Hi, > > Please review the following fix for JDK-8231635: > > https://bugs.openjdk.java.net/browse/JDK-8231635 > http://cr.openjdk.java.net/~cjplummer/8231635/webrev.00/ > > I've tried to explain below to the best of my ability what is going > on, but keep in mind that I basically had no background in this area > before looking into this CR, so this is all new to me.
Please feel > free to chime in with corrections to my explanation, or any additional > insight that might help to further understanding of this code. > > When doing a thread stack dump, SA has to figure out the SP for the > current frame when it may not in fact be stored anywhere. So it goes > through a series of guesses, starting with the current value of SP. > See AMD64CurrentFrameGuess.run(): > > Address sp = context.getRegisterAsAddress(AMD64ThreadContext.RSP); > > There are a number of checks done to see if this is the SP for the > actual current frame, one of the checks being (and kind of a last > resort) to follow the frame links and see if they eventually lead to > the first entry frame: > > while (frame != null) { > if (frame.isEntryFrame() && frame.entryFrameIsFirst()) { > ... > return true; > } > frame = frame.sender(map); > } > > If this fails, there is an outer loop to try the next address: > > for (long offset = 0; > offset < regionInBytesToSearch; > offset += vm.getAddressSize()) { > > Note that offset is added to the initial SP value that was fetched > from RSP. This approach is fraught with danger, because SP could be > incorrect, and you can easily follow a bad frame link to an invalid > address. So the body of this loop is in a try block that catches all > Exceptions, and simply retries with the next offset if one is caught. > Exceptions could be ones like UnalignedAddressException or > UnmappedAddressException. > > The bug in question turns up with the following harmless-looking line: > > frame = frame.sender(map); > > This is fine if you know that "frame" is valid, but what if it is not > (which is very commonly the case)? The frame values (SP, FP, and PC) > in the returned frame could be just about anything, including being > the same as the previous frame.
This is what will happen if the SP > stored in "frame" is the same as the SP that was used to initialize > "frame" in the first place. This can certainly happen when SP is not > valid to start with, and is indeed what caused this bug. The end > result is the inner while loop gets stuck in an infinite loop > traversing the same frame. So the fix is to add a check for this to > make sure to break out of the while loop if this happens. Initially I > did this with an Address.equal() call, and that seemed to fix the > problem, but then I realized it would be possible to traverse through > one or more sender frames and eventually end up returning to a > previously visited frame, thus still an infinite loop. So I decided on > checking for Address.lessThanOrEqual() instead since the sender frame's > SP should always be greater than the current frame's (referred to as > oldFrame) SP. As long as we always move in one direction (towards a > higher frame address), you can't have an infinite loop in this code. > > I applied this fix to x86. Although not tested, it is built (all > platform support is always built with SA). The x86 and amd64 versions > are identical except for x86/amd64 references, so I thought it best to > go ahead and do the update to x86. I did not touch ppc, but would be > willing to update if someone passes along a fix that is tested. > > One final bit of clarification. The bug synopsis mentions getting > stuck in BasicTypeDataBase.findDynamicTypeForAddress(). This turns out > to not actually be the case, but every stack trace I initially looked at > when I filed this CR was showing the thread being in this frame and at > the same line number. This appears to be the next available safepoint > where the thread can be suspended for stack dumping.
When debugging > this some more and adding a lot of println() calls in a lot of > different locations, I started to see different frames in the > stacktrace, presumably because the println() calls were adding > additional safepoints. > > thanks, > > Chris > From serguei.spitsyn at oracle.com Sat Nov 9 00:25:39 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 8 Nov 2019 16:25:39 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Sat Nov 9 00:39:19 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 8 Nov 2019 16:39:19 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: References: Message-ID: On 11/08/2019 15:22, Alex Menkov wrote: > Hi all, > > Please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8215196 > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ > > Currently PopFrame is disabled with JVMCI by [1], so for testing I > reverted the changes from [1]. Just to be clear - I temporarily reverted [1] for test runs. --alex > > [1] https://bugs.openjdk.java.net/browse/JDK-8218025 > > --alex From alexey.menkov at oracle.com Sat Nov 9 00:40:04 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 8 Nov 2019 16:40:04 -0800 Subject: RFR: 8233868: Unproblem list sun/tools/jstat/jstatClassloadOutput1.sh In-Reply-To: References: Message-ID: LGTM --alex On 11/08/2019 14:58, Daniil Titov wrote: > Please review the changeset below that removes test sun/tools/jstat/jstatClassloadOutput1.sh from test/jdk/ProblemList.txt.
> > > diff -r f92ef5d182b5 test/jdk/ProblemList.txt > --- a/test/jdk/ProblemList.txt Fri Nov 08 11:41:17 2019 -0500 > +++ b/test/jdk/ProblemList.txt Fri Nov 08 22:37:11 2019 +0000 > @@ -861,7 +861,6 @@ > > # svc_tools > > -sun/tools/jstat/jstatClassloadOutput1.sh 8173942 generic-all > sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le > sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all > sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all > > > > This test was added to ProblemList.txt in [1] but this issue is no longer reproducible in JDK 14 and the test runs fine in Mach5. > > Mach5 tests for tier1, tier2, and tier3 successfully passed. > > [1] https://bugs.openjdk.java.net/browse/JDK-8173942 > > Thanks. > Daniil > > From chris.plummer at oracle.com Sat Nov 9 00:42:59 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 8 Nov 2019 16:42:59 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: Message-ID: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> Hi Ralf, Also looks good to me. Serguei, the removed code is consolidated into HeapDumper::dump(). thanks, Chris On 11/8/19 4:25 PM, serguei.spitsyn at oracle.com wrote: > Hi Ralf, > > The fix looks okay in general. > > > A couple of questions. > > dumper.dump() returns an int value. > The returned value is no longer used in attachListener.cpp and > diagnosticCommand.cpp. > Is it still used somewhere else, or can we replace it with void? > > Could you explain a little bit why the following fragments were removed? > Is it because this information is not that useful or there is some > other motivation?
> > attachListener.cpp: > - int res = dumper.dump(op->arg(0)); > - if (res == 0) { > - out->print_cr("Heap dump file created"); > - } else { > - // heap dump failed > - ResourceMark rm; > - char* error = dumper.error_as_C_string(); > - if (error == NULL) { > - out->print_cr("Dump failed - reason unknown"); > - } else { > - out->print_cr("%s", error); > - } > - } > > diagnosticCommand.cpp: > - if (res == 0) { > - output()->print_cr("Heap dump file created"); > - } else { > - // heap dump failed > - ResourceMark rm; > - char* error = dumper.error_as_C_string(); > - if (error == NULL) { > - output()->print_cr("Dump failed - reason unknown"); > - } else { > - output()->print_cr("%s", error); > - } > - } > Thanks, > Serguei > > > On 11/8/19 03:56, Schmelter, Ralf wrote: >> This change forwards the output from the HeapDumper.dump() command to an optional output stream supplied by the caller. >> Until now the diagnostic command and the "dumpheap" command of the attach framework partly recreated the output by hand, but were missing some information. >> >> Old output: >> Heap dump file created >> >> New output: >> Dumping heap to test.hprof ... >> Heap dump file created [9719330384 bytes in 27,759 secs] >> >> In addition to getting this improved information, it saves code too. >> >> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8233790 >> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >> >> Best regards, >> Ralf > From chris.plummer at oracle.com Sat Nov 9 00:45:14 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 8 Nov 2019 16:45:14 -0800 Subject: RFR: 8233868: Unproblem list sun/tools/jstat/jstatClassloadOutput1.sh In-Reply-To: References: Message-ID: <3fa1f235-d98b-e78a-3401-6f440d6ffc99@oracle.com> +1 On 11/8/19 4:40 PM, Alex Menkov wrote: > LGTM > > --alex > > On 11/08/2019 14:58, Daniil Titov wrote: >> Please review the changeset below that removes test >> sun/tools/jstat/jstatClassloadOutput1.sh from test/jdk/ProblemList.txt.
>> >> >> diff -r f92ef5d182b5 test/jdk/ProblemList.txt >> --- a/test/jdk/ProblemList.txt Fri Nov 08 11:41:17 2019 -0500 >> +++ b/test/jdk/ProblemList.txt Fri Nov 08 22:37:11 2019 +0000 >> @@ -861,7 +861,6 @@ >> # svc_tools >> -sun/tools/jstat/jstatClassloadOutput1.sh 8173942 generic-all >> sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 >> solaris-all,linux-ppc64,linux-ppc64le >> sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all >> sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 >> windows-all >> >> >> >> This test was added to ProblemList.txt in [1] but this issue is no >> longer reproducible in JDK 14 and the test runs fine in Mach5. >> >> Mach5 tests for tier1, tier2, and tier3 successfully passed. >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8173942 >> >> Thanks. >> Daniil >> >> From chris.plummer at oracle.com Sat Nov 9 00:55:31 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 8 Nov 2019 16:55:31 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: References: Message-ID: Hi Alex, Comments below: On 11/8/19 4:39 PM, Alex Menkov wrote: > > > On 11/08/2019 15:22, Alex Menkov wrote: >> Hi all, >> >> Please review the fix for >> https://bugs.openjdk.java.net/browse/JDK-8215196 >> webrev: >> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ I don't really see a resolution in the JDK-8215196 comments as to what is actually broken. Are we sure we want to fix this in the test, and not require different behavior by the compiler (and also clarify the spec)? >> >> Currently PopFrame is disabled with JVMCI by [1], so for testing I >> reverted the changes from [1]. > > Just to be clear - I temporarily reverted [1] for test runs. > The description for JDK-8218025 says that the intention is to only disable these capabilities for JDK12.
Is there a CR to re-enable them? thanks, Chris > --alex > >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >> >> --alex From serguei.spitsyn at oracle.com Sat Nov 9 01:02:14 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 8 Nov 2019 17:02:14 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> References: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> Message-ID: <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> Hi Chris, I'm a little bit confused. Ralf's change in HeapDumper::dump() is just a replacement of 'tty' occurrences with 'out', so the change has not moved the deleted code in these files into HeapDumper::dump(). Probably you wanted to say that the pre-existing error messages printed in HeapDumper::dump() are enough. This would explain why the code is deleted. Just wanted a bit of clarification from Ralf to make sure that is the case. Thanks, Serguei On 11/8/19 16:42, Chris Plummer wrote: > Hi Ralf, > > Also looks good to me. > > Serguei, the removed code is consolidated into HeapDumper::dump(). > > thanks, > > Chris > > On 11/8/19 4:25 PM, serguei.spitsyn at oracle.com wrote: >> Hi Ralf, >> >> The fix looks okay in general. >> >> >> A couple of questions. >> >> dumper.dump() returns an int value. >> The returned value is no longer used in attachListener.cpp and >> diagnosticCommand.cpp. >> Is it still used somewhere else, or can we replace it with void? >> >> Could you explain a little bit why the following fragments were removed? >> Is it because this information is not that useful or there is some >> other motivation?
>> >> attachListener.cpp: >> - int res = dumper.dump(op->arg(0)); >> - if (res == 0) { >> - out->print_cr("Heap dump file created"); >> - } else { >> - // heap dump failed >> - ResourceMark rm; >> - char* error = dumper.error_as_C_string(); >> - if (error == NULL) { >> - out->print_cr("Dump failed - reason unknown"); >> - } else { >> - out->print_cr("%s", error); >> - } >> - } >> >> diagnisticCommand.cpp >> - if (res == 0) { >> - output()->print_cr("Heap dump file created"); >> - } else { >> - // heap dump failed >> - ResourceMark rm; >> - char* error = dumper.error_as_C_string(); >> - if (error == NULL) { >> - output()->print_cr("Dump failed - reason unknown"); >> - } else { >> - output()->print_cr("%s", error); >> - } >> - } >> Thanks, >> Serguei >> >> >> On 11/8/19 03:56, Schmelter, Ralf wrote: >>> This change forwards the output from the HeapDumper.dump() command >>> to an optional output stream supplied by the caller. >>> Until now the diagnositic command and the "dumpheap" command of the >>> attach framework partly recreated the output by hand, but missing >>> some information. >>> >>> Old output: >>> Heap dump file created >>> >>> New output: >>> Dumping heap to test.hprof ... >>> Heap dump file created [9719330384 bytes in 27,759 secs] >>> >>> In addition to getting this improved information, it saves code too. 
>>> >>> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8233790 >>> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >>> Best regards, >>> Ralf >> > From alexey.menkov at oracle.com Sat Nov 9 01:57:36 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 8 Nov 2019 17:57:36 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: References: Message-ID: On 11/08/2019 16:55, Chris Plummer wrote: > Hi Alex, > > Comments below: > > On 11/8/19 4:39 PM, Alex Menkov wrote: >> >> >> On 11/08/2019 15:22, Alex Menkov wrote: >>> Hi all, >>> >>> Please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ > I don't really see a resolution in the JDK-8215196 comments as to what > is actually broken. Are we sure we want to fix this in the test, and not > require different behavior by the compiler (and also clarify the spec)? In the test, the activeMeth method changes its argument values and then doesn't use them later. I think dropping useless code is a good compiler optimization, and I'd prefer not to restrict the compiler from doing it. >>> >>> Currently PopFrame is disabled with JVMCI by [1], so for testing I >>> reverted the changes from [1]. >> >> Just to be clear - I temporarily reverted [1] for test runs. >> > The description for JDK-8218025 says that the intention is to only > disable these capabilities for JDK12. Is there a CR to re-enable them? https://bugs.openjdk.java.net/browse/JDK-8218885 Unfortunately the problems that caused the capabilities to be disabled are still unresolved, and it looks like they won't be resolved in 14, so for now it's targeted to tbd.
--alex > > thanks, > > Chris >> --alex >> >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>> >>> --alex > From chris.plummer at oracle.com Sat Nov 9 02:14:03 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 8 Nov 2019 18:14:03 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> References: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> Message-ID: He also added the outputStream argument to HeapDumper::dump(). Both of the sections of code below already called dump(), and now they do so with the added outputStream argument. HeapDumper::dump() has been modified to print on the outputStream rather than to the tty. Chris On 11/8/19 5:02 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > I'm a little bit confused. > The Ralf's change in the HeapDumper::dump() is just replacement of > 'tty' occurrences with 'out', > so the change has not moved the deleted code in these files into the > HeapDumper::dump(). > > Probably, you wanted to say that the pre-existed error messages > printed in the HeapDumper::dump() are enough. > This would explain why the code is deleted. > Just wanted a bit of clarification from Ralf to make sure it is the case. > > Thanks, > Serguei > > > On 11/8/19 16:42, Chris Plummer wrote: >> Hi Ralf, >> >> Also looks good to me. >> >> Serguei, the removed code is consolidated into HeapDumper::dump(). >> >> thanks, >> >> Chris >> >> On 11/8/19 4:25 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Ralf, >>> >>> The fix looks Okay in general. >>> >>> >>> A couple of questions. >>> >>> The dumper.dump() returns int value. >>> Returned value is not used anymore in the attachListener.cpp and >>> diagnisticCommand.cpp. >>> Is it still used somewhere else or we can replace it with void? >>> >>> Could you explain a little bit why the following fragments were >>> removed? 
>>> Is it because this information is not that useful or there is some >>> other motivation? >>> >>> attachListener.cpp: >>> - int res = dumper.dump(op->arg(0)); >>> - if (res == 0) { >>> - out->print_cr("Heap dump file created"); >>> - } else { >>> - // heap dump failed >>> - ResourceMark rm; >>> - char* error = dumper.error_as_C_string(); >>> - if (error == NULL) { >>> - out->print_cr("Dump failed - reason unknown"); >>> - } else { >>> - out->print_cr("%s", error); >>> - } >>> - } >>> >>> diagnisticCommand.cpp >>> - if (res == 0) { >>> - output()->print_cr("Heap dump file created"); >>> - } else { >>> - // heap dump failed >>> - ResourceMark rm; >>> - char* error = dumper.error_as_C_string(); >>> - if (error == NULL) { >>> - output()->print_cr("Dump failed - reason unknown"); >>> - } else { >>> - output()->print_cr("%s", error); >>> - } >>> - } >>> Thanks, >>> Serguei >>> >>> >>> On 11/8/19 03:56, Schmelter, Ralf wrote: >>>> This change forwards the output from the HeapDumper.dump() command >>>> to an optional output stream supplied by the caller. >>>> Until now the diagnositic command and the "dumpheap" command of the >>>> attach framework partly recreated the output by hand, but missing >>>> some information. >>>> >>>> Old output: >>>> Heap dump file created >>>> >>>> New output: >>>> Dumping heap to test.hprof ... >>>> Heap dump file created [9719330384 bytes in 27,759 secs] >>>> >>>> In addition to getting this improved information, it saves code too. 
>>>> >>>> Bugreport:https://bugs.openjdk.java.net/browse/JDK-8233790 >>>> Webrev:http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >>>> >>>> Best regards, >>>> Ralf >>> >> > From serguei.spitsyn at oracle.com Sat Nov 9 02:31:45 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 8 Nov 2019 18:31:45 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> Message-ID: <68968918-d7d6-f511-6704-f07d0b4accc6@oracle.com> Okay, thanks. Agreed, this aspect was clear to me. The deleted fragments are about printing the summarizing conclusion about the dumping which is not present in the HeapDumper::dump(). I expected it to be moved into the HeapDumper::dump() but it was not. So, I wonder if there was such an intention. Thanks, Serguei On 11/8/19 18:14, Chris Plummer wrote: > He also added the outputStream argument to HeapDumper::dump(). Both of > the sections of code below already called dump(), and now they do so > with the added outputStream argument. HeapDumper::dump() has been > modified to print on the outputStream rather than to the tty. > > Chris > > On 11/8/19 5:02 PM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> I'm a little bit confused. >> The Ralf's change in the HeapDumper::dump() is just replacement of >> 'tty' occurrences with 'out', >> so the change has not moved the deleted code in these files into the >> HeapDumper::dump(). >> >> Probably, you wanted to say that the pre-existed error messages >> printed in the HeapDumper::dump() are enough. >> This would explain why the code is deleted. >> Just wanted a bit of clarification from Ralf to make sure it is the >> case. >> >> Thanks, >> Serguei >> >> >> On 11/8/19 16:42, Chris Plummer wrote: >>> Hi Ralf, >>> >>> Also looks good to me. >>> >>> Serguei, the removed code is consolidated into HeapDumper::dump(). 
>>> >>> thanks, >>> >>> Chris >>> >>> On 11/8/19 4:25 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Ralf, >>>> >>>> The fix looks Okay in general. >>>> >>>> >>>> A couple of questions. >>>> >>>> The dumper.dump() returns int value. >>>> Returned value is not used anymore in the attachListener.cpp and >>>> diagnisticCommand.cpp. >>>> Is it still used somewhere else or we can replace it with void? >>>> >>>> Could you explain a little bit why the following fragments were >>>> removed? >>>> Is it because this information is not that useful or there is some >>>> other motivation? >>>> >>>> attachListener.cpp: >>>> - int res = dumper.dump(op->arg(0)); >>>> - if (res == 0) { >>>> - out->print_cr("Heap dump file created"); >>>> - } else { >>>> - // heap dump failed >>>> - ResourceMark rm; >>>> - char* error = dumper.error_as_C_string(); >>>> - if (error == NULL) { >>>> - out->print_cr("Dump failed - reason unknown"); >>>> - } else { >>>> - out->print_cr("%s", error); >>>> - } >>>> - } >>>> >>>> diagnisticCommand.cpp >>>> - if (res == 0) { >>>> - output()->print_cr("Heap dump file created"); >>>> - } else { >>>> - // heap dump failed >>>> - ResourceMark rm; >>>> - char* error = dumper.error_as_C_string(); >>>> - if (error == NULL) { >>>> - output()->print_cr("Dump failed - reason unknown"); >>>> - } else { >>>> - output()->print_cr("%s", error); >>>> - } >>>> - } >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/8/19 03:56, Schmelter, Ralf wrote: >>>>> This change forwards the output from the HeapDumper.dump() command >>>>> to an optional output stream supplied by the caller. >>>>> Until now the diagnositic command and the "dumpheap" command of >>>>> the attach framework partly recreated the output by hand, but >>>>> missing some information. >>>>> >>>>> Old output: >>>>> Heap dump file created >>>>> >>>>> New output: >>>>> Dumping heap to test.hprof ... 
>>>>> Heap dump file created [9719330384 bytes in 27,759 secs] >>>>> >>>>> In addition to getting this improved information, it saves code too. >>>>> >>>>> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8233790 >>>>> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >>>>> >>>>> Best regards, >>>>> Ralf >>>> >>> >> > From chris.plummer at oracle.com Sat Nov 9 06:33:45 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 8 Nov 2019 22:33:45 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: <68968918-d7d6-f511-6704-f07d0b4accc6@oracle.com> References: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> <68968918-d7d6-f511-6704-f07d0b4accc6@oracle.com> Message-ID: Hi Serguei, I've inlined below the corresponding code in HeapDumper::dump() that covers the deleted output functionality from attachListener.cpp. Note it's not exactly the same, but I think in the end it includes at least the same info in all cases, and in some cases more info: - int res = dumper.dump(op->arg(0)); - if (res == 0) { - out->print_cr("Heap dump file created"); 2009 out->print_cr("Heap dump file created [" JULONG_FORMAT " bytes in %3.3f secs]", 2010 writer.bytes_written(), timer()->seconds()); - } else { - // heap dump failed - ResourceMark rm; - char* error = dumper.error_as_C_string(); - if (error == NULL) { - out->print_cr("Dump failed - reason unknown"); 1986 out->print_cr("Unable to create %s: %s", path, 1987 (error() != NULL) ? error() : "reason unknown"); - } else { - out->print_cr("%s", error); - } 2012 out->print_cr("Dump file is incomplete: %s", writer.error()); - } + dumper.dump(op->arg(0), out); Chris On 11/8/19 6:31 PM, serguei.spitsyn at oracle.com wrote: > Okay, thanks. > Agreed, this aspect was clear to me.
>>>>>> >>>>>> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8233790 >>>>>> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >>>>>> >>>>>> Best regards, >>>>>> Ralf >>>>> >>>> >>> >> > From serguei.spitsyn at oracle.com Sat Nov 9 08:39:59 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sat, 9 Nov 2019 00:39:59 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: <4a9851a3-163e-751f-53ae-1e42d2cbdc6d@oracle.com> <29b1f326-421c-71f2-efd8-a34fd136d659@oracle.com> <68968918-d7d6-f511-6704-f07d0b4accc6@oracle.com> Message-ID: <61b96d2e-ff46-07b4-f0cd-5fca67649367@oracle.com> Okay. Thanks, Chris Serguei On 11/8/19 22:33, Chris Plummer wrote: > Hi Serguei, > > I've inlined below the corresponding code in HeapDumper::dump() that > covers the deleted output functionality from attachListener.cpp. Note > it's not exactly the same, but I think in the end it includes at least > the same info in all cases, and in some cases more info: > > - int res = dumper.dump(op->arg(0)); > - if (res == 0) { > - out->print_cr("Heap dump file created"); > 2009 out->print_cr("Heap dump file created [" JULONG_FORMAT " > bytes in %3.3f secs]", > 2010 writer.bytes_written(), timer()->seconds()); > > - } else { > - // heap dump failed > - ResourceMark rm; > - char* error = dumper.error_as_C_string(); > - if (error == NULL) { > - out->print_cr("Dump failed - reason unknown"); > 1986 out->print_cr("Unable to create %s: %s", path, > 1987 (error() != NULL) ? error() : "reason unknown"); > > - } else { > - out->print_cr("%s", error); > - } > 2012 out->print_cr("Dump file is incomplete: %s", writer.error()); > > - } > + dumper.dump(op->arg(0), out); > > Chris > > On 11/8/19 6:31 PM, serguei.spitsyn at oracle.com wrote: >> Okay, thanks. >> Agreed, this aspect was clear to me.
>> The deleted fragments are about printing the summarizing conclusion >> about the dumping which is not present in the HeapDumper::dump(). >> I expected it to be moved into the HeapDumper::dump() but it was not. >> So, I wonder if there was such an intention. >> >> Thanks, >> Serguei >> >> >> On 11/8/19 18:14, Chris Plummer wrote: >>> He also added the outputStream argument to HeapDumper::dump(). Both >>> of the sections of code below already called dump(), and now they do >>> so with the added outputStream argument. HeapDumper::dump() has been >>> modified to print on the outputStream rather than to the tty. >>> >>> Chris >>> >>> On 11/8/19 5:02 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> I'm a little bit confused. >>>> The Ralf's change in the HeapDumper::dump() is just replacement of >>>> 'tty' occurrences with 'out', >>>> so the change has not moved the deleted code in these files into >>>> the HeapDumper::dump(). >>>> >>>> Probably, you wanted to say that the pre-existed error messages >>>> printed in the HeapDumper::dump() are enough. >>>> This would explain why the code is deleted. >>>> Just wanted a bit of clarification from Ralf to make sure it is the >>>> case. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/8/19 16:42, Chris Plummer wrote: >>>>> Hi Ralf, >>>>> >>>>> Also looks good to me. >>>>> >>>>> Serguei, the removed code is consolidated into HeapDumper::dump(). >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 11/8/19 4:25 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Ralf, >>>>>> >>>>>> The fix looks Okay in general. >>>>>> >>>>>> >>>>>> A couple of questions. >>>>>> >>>>>> The dumper.dump() returns int value. >>>>>> Returned value is not used anymore in the attachListener.cpp and >>>>>> diagnisticCommand.cpp. >>>>>> Is it still used somewhere else or we can replace it with void? >>>>>> >>>>>> Could you explain a little bit why the following fragments were >>>>>> removed? 
>>>>>> Is it because this information is not that useful or there is >>>>>> some other motivation? >>>>>> >>>>>> attachListener.cpp: >>>>>> - int res = dumper.dump(op->arg(0)); >>>>>> - if (res == 0) { >>>>>> - out->print_cr("Heap dump file created"); >>>>>> - } else { >>>>>> - // heap dump failed >>>>>> - ResourceMark rm; >>>>>> - char* error = dumper.error_as_C_string(); >>>>>> - if (error == NULL) { >>>>>> - out->print_cr("Dump failed - reason unknown"); >>>>>> - } else { >>>>>> - out->print_cr("%s", error); >>>>>> - } >>>>>> - } >>>>>> >>>>>> diagnisticCommand.cpp >>>>>> - if (res == 0) { >>>>>> - output()->print_cr("Heap dump file created"); >>>>>> - } else { >>>>>> - // heap dump failed >>>>>> - ResourceMark rm; >>>>>> - char* error = dumper.error_as_C_string(); >>>>>> - if (error == NULL) { >>>>>> - output()->print_cr("Dump failed - reason unknown"); >>>>>> - } else { >>>>>> - output()->print_cr("%s", error); >>>>>> - } >>>>>> - } >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 11/8/19 03:56, Schmelter, Ralf wrote: >>>>>>> This change forwards the output from the HeapDumper.dump() >>>>>>> command to an optional output stream supplied by the caller. >>>>>>> Until now the diagnositic command and the "dumpheap" command of >>>>>>> the attach framework partly recreated the output by hand, but >>>>>>> missing some information. >>>>>>> >>>>>>> Old output: >>>>>>> Heap dump file created >>>>>>> >>>>>>> New output: >>>>>>> Dumping heap to test.hprof ... >>>>>>> Heap dump file created [9719330384 bytes in 27,759 secs] >>>>>>> >>>>>>> In addition to getting this improved information, it saves code >>>>>>> too. 
>>>>>>> >>>>>>> Bugreport:https://bugs.openjdk.java.net/browse/JDK-8233790 >>>>>>> Webrev:http://cr.openjdk.java.net/~rschmelter/webrevs/8233790/webrev.0/ >>>>>>> >>>>>>> Best regards, >>>>>>> Ralf >>>>>> >>>>> >>>> >>> >> > > From ralf.schmelter at sap.com Mon Nov 11 08:48:34 2019 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Mon, 11 Nov 2019 08:48:34 +0000 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: Message-ID: Hi Serguei, Thanks for the review. The only open question seems to be: > The dumper.dump() returns int value. > Returned value is not used anymore in the attachListener.cpp and diagnisticCommand.cpp. > Is it still used somewhere else or we can replace it with void? The jmm_DumpHeap0 method still uses the return code (and the error string) to throw an exception in case of an error. Best regards, Ralf From serguei.spitsyn at oracle.com Mon Nov 11 10:33:40 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 11 Nov 2019 02:33:40 -0800 Subject: RFR (XS) 8233790: Forward output from heap dumper to jcmd/jmap In-Reply-To: References: Message-ID: <36e15875-98ac-559d-3d15-e7c5723d46c0@oracle.com> Hi Ralf, Okay, thanks! Serguei On 11/11/19 00:48, Schmelter, Ralf wrote: > Hi Serguei, > > Thanks for the review. The only open question seems to be: > >> The dumper.dump() returns int value. >> Returned value is not used anymore in the attachListener.cpp and diagnisticCommand.cpp. >> Is it still used somewhere else or we can replace it with void? > The jmm_DumpHeap0 method still uses the return code (and the error string) to throw an > exception in case of an error. 
>
> Best regards,
> Ralf

From serguei.spitsyn at oracle.com Mon Nov 11 11:06:11 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 11 Nov 2019 03:06:11 -0800
Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values"
In-Reply-To:
References:
Message-ID: <8305de0b-e3b1-32c9-558f-a2813a675563@oracle.com>

Hi Chris,


On 11/8/19 16:55, Chris Plummer wrote:
> Hi Alex,
>
> Comments below:
>
> On 11/8/19 4:39 PM, Alex Menkov wrote:
>>
>>
>> On 11/08/2019 15:22, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
> I don't really see a resolution in the JDK-8215196 comments as to what
> is actually broken. Are we sure we want to fix this in the test, and
> not require different behavior by the compiler (and also clarify the
> spec)?

This is one more case for the topic "Should optimizations be observable
for JVMTI agents?" which was recently discussed on the
serviceability-dev mailing list.
The question is about the following statement of the JVMTI PopFrame spec:
  "Note however, that any changes to the arguments, which occurred in
   the called method, remain; when execution continues, the first
   instruction to execute will be the invoke."

One point is that we could consider this statement as a possible side
effect which has to be preserved by the JIT compiler.
So, the optimization that deletes such code which "looks useless" has
to be disabled.

Another point is that it is hard to understand why such a side effect
can be really useful.
Maybe it was specified like this just because it does not make sense
to preserve original argument values at the call side.
We could consider relaxing the JVMTI PopFrame spec by changing it to
something like:
  "Note however, that the original argument values are not preserved
   and can be changed by the called method;"

Let's wait for other opinions.

Thanks,
Serguei

>>>
>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I
>>> reverted [1] changes.
>>
>> Just to be clear - I temporarily reverted [1] for test runs.
>>
> The description for JDK-8218025 says that the intention is to only
> disable these capabilities for JDK12. Is there a CR to re-enable them?
>
> thanks,
>
> Chris
>> --alex
>>
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>
>>> --alex
>

From serguei.spitsyn at oracle.com Mon Nov 11 11:15:42 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 11 Nov 2019 03:15:42 -0800
Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values"
In-Reply-To: <8305de0b-e3b1-32c9-558f-a2813a675563@oracle.com>
References: <8305de0b-e3b1-32c9-558f-a2813a675563@oracle.com>
Message-ID: <2c41e50c-d0e2-bc49-743a-55c22e5fe628@oracle.com>

On 11/11/19 03:06, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
>
> On 11/8/19 16:55, Chris Plummer wrote:
>> Hi Alex,
>>
>> Comments below:
>>
>> On 11/8/19 4:39 PM, Alex Menkov wrote:
>>>
>>>
>>> On 11/08/2019 15:22, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>> I don't really see a resolution in the JDK-8215196 comments as to
>> what is actually broken. Are we sure we want to fix this in the test,
>> and not require different behavior by the compiler (and also clarify
>> the spec)?
>
> This is one more case for the topic "Should optimizations be observable
> for JVMTI agents?"
> which was recently discussed on the serviceability-dev mailing list.
> The question is about the following statement of the JVMTI PopFrame spec:
>   "Note however, that any changes to the arguments, which occurred in
>    the called method, remain; when execution continues, the first
>    instruction to execute will be the invoke."
>
> One point is that we could consider this statement as a possible side
> effect which has to be preserved by the JIT compiler.
> So, the optimization that deletes such code which "looks useless" has
> to be disabled.
>
> Another point is that it is hard to understand why such a side effect
> can be really useful.
> Maybe it was specified like this just because it does not make sense
> to preserve original argument values at the call side.
> We could consider relaxing the JVMTI PopFrame spec by changing it to
> something like:
>   "Note however, that the original argument values are not preserved
>    and can be changed by the called method;"

Forgot to list one more option, which is:
  Consider it Okay for the compiler to eliminate useless code,
  so the argument values can be reinitialized by the PopFrame.
  Then this problem becomes just a test bug.

Thanks,
Serguei

>
> Let's wait for other opinions.
>
> Thanks,
> Serguei
>
>>>>
>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I
>>>> reverted [1] changes.
>>>
>>> Just to be clear - I temporarily reverted [1] for test runs.
>>>
>> The description for JDK-8218025 says that the intention is to only
>> disable these capabilities for JDK12. Is there a CR to re-enable them?
>>
>> thanks,
>>
>> Chris
>>> --alex
>>>
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>
>>>> --alex
>>
>

From serguei.spitsyn at oracle.com Mon Nov 11 11:17:33 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 11 Nov 2019 03:17:33 -0800
Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values"
In-Reply-To:
References:
Message-ID: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>

An HTML attachment was scrubbed...
URL:

From richard.reingruber at sap.com Mon Nov 11 15:29:36 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 11 Nov 2019 15:29:36 +0000
Subject: RFC 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement
Message-ID:

Hi,

I have created https://bugs.openjdk.java.net/browse/JDK-8233915

In short, a set of live objects L is not found using JVMTI
FollowReferences() if L is only reachable from a scalar replaced object
in a frame of a C2 compiled method. If L happens to be a growing leak,
then a dynamically loaded JVMTI agent (note: can_tag_objects is an
always capability) for heap diagnostics won't discover L as live and it
won't be able to find root references that lead to L.

I'd like to suggest the implementation for the proposed enhancement
JDK-8227745 as a bug fix.

RFE:       https://bugs.openjdk.java.net/browse/JDK-8227745
Webrev(*): http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.1/

Please comment on the suggestion. Do you see other solutions that allow
an agent to discover the chain of references to L?

I'd like to work on the complexity as well. One significant
simplification would be possible if scalar replaced objects could be
reallocated at safepoints (i.e. allow the VM thread to call
Deoptimization::realloc_objects()). The GC interface does not seem to
allow this.

Thanks, Richard.
(*) Not yet accepted, because deemed too complex for the performance gain.
    Note that I was able to reduce webrev.1 in size compared to webrev.0

From david.holmes at oracle.com Tue Nov 12 04:52:42 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 12 Nov 2019 14:52:42 +1000
Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
Message-ID:

webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
bug: https://bugs.openjdk.java.net/browse/JDK-8233549

In JDK-8229516 I moved the interrupted state of a thread from the
osThread in the VM to the java.lang.Thread instance. In doing that I
overlooked a critical aspect, which is that to access the field of a
Java object the JavaThread must not be in a safepoint-safe state** -
otherwise the oop, and anything referenced from it, could be
relocated by the GC whilst the JavaThread is accessing it. This
manifested in a number of tests using JVM TI Agent threads and JVM TI
RawMonitors because the JavaThreads were marked _thread_blocked and
hence safepoint-safe, and we read a non-zero value for the interrupted
field even though we had never been interrupted.

This problem existed in all the code that checks for interruption when
"waiting":

- Parker::park (the code underpinning
  java.util.concurrent.LockSupport.park())

To fix this code I simply deleted a late check of the interrupted
field. The check was not needed because if an interrupt has occurred
then we will find the ParkEvent in a signalled state.

- ObjectMonitor::wait

Here the late check of the interrupted state is essential as we reset
the ParkEvent after an earlier check of the interrupted state. But the
fix was simply achieved by moving the check slightly earlier, before we
use ThreadBlockInVM to become _thread_blocked.

- RawMonitor::wait

This fix was much more involved. The RawMonitor code directly
transitions the JavaThread from _thread_in_native to _thread_blocked.
This is safe from a safepoint perspective because they are equivalent
safepoint-safe states. To allow access to the interrupted field I have
to transition from native to _thread_in_vm, and that has to be done by
proper thread-state transitions to ensure correct access to the oop
and its fields. Having done that I can then use ThreadBlockInVM for
the transitions to blocked. However, as the old code noted, it can't
use proper thread-state transitions, as this will lead to deadlocks
with the VMThread, which can also use RawMonitors when executing various
event callbacks. To deal with that we have to note that the real
constraint is that the JavaThread cannot block at a safepoint whilst
it holds the RawMonitor. Hence the fix was to push all the interrupt
checking code and the thread-state transitions to the lowest level of
RawMonitorWait, around the final park() call, after we have enqueued
the waiter and released the monitor. That avoids any deadlock
possibility.

I also added checks to is_interrupted/interrupted to ensure they are
only called by a thread in a suitable state. This should only be the
VMThread (as a consequence of the Thread.stop implementation occurring
at a safepoint and issuing a JavaThread::interrupt() call to unblock
the target); or a JavaThread that is not _thread_in_native or
_thread_blocked.

Testing: (still finalizing)
 - tiers 1 - 6 (Oracle platforms)
 - Local Linux testing
   - vmTestbase/nsk/monitoring/
   - vmTestbase/nsk/jdwp
   - vmTestbase/nsk/jdb/
   - vmTestbase/nsk/jdi/
   - vmTestbase/nsk/jvmti/
   - serviceability/jvmti/
   - serviceability/jdwp
   - JDK: java/lang/management
          com/sun/management

** Note that this applies to all accesses we make via code in
javaClasses.*. For this particular code I thought about adding a guard
in JavaThread::threadObj() but it turns out when we generate a crash
report we access the Thread's name() field and that can happen when in
any state, so we'd always trigger a secondary assertion failure during
error reporting if we did that.
Note that accessing name() can still easily lead to secondary assertion
failures, as I discovered when trying to debug this and print the thread
name out - I would see an is_instance assertion fail checking that the
Thread name() is an instance of java.lang.String!

Thanks,
David
-----

From daniel.daugherty at oracle.com Tue Nov 12 17:18:12 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 12 Nov 2019 12:18:12 -0500
Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
In-Reply-To:
References:
Message-ID: <05b4ec18-1a93-7d3d-fb17-1ce2f5c27e11@oracle.com>

On 11/11/19 11:52 PM, David Holmes wrote:
> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/

src/hotspot/os/posix/os_posix.cpp
    L2078:   // Can't access interrupt state now we are _thread_blocked. If we've been
    L2079:   // interrupted since we checked above then _counter will be > 0.
        nit - grammar. Please consider:
             // Can't access interrupt state now that we are _thread_blocked. If we've
             // been interrupted since we checked above then _counter will be > 0.

src/hotspot/os/solaris/os_solaris.cpp
    L4924:   // Can't access interrupt state now we are _thread_blocked. If we've been
    L4925:   // interrupted since we checked above then _counter will be > 0.
        nit - grammar. Please consider:
             // Can't access interrupt state now that we are _thread_blocked. If we've
             // been interrupted since we checked above then _counter will be > 0.

src/hotspot/share/classfile/javaClasses.cpp
    No comments.

src/hotspot/share/prims/jvmtiEnv.cpp
    Hmmm... did the "non-JavaThread can't be interrupted" check also get
    pushed down?
    Update: Similar check is now in JvmtiRawMonitor::raw_wait().

src/hotspot/share/prims/jvmtiRawMonitor.cpp
    L239:     ThreadInVMfromNative tivm(jt);
    L240:     if (jt->is_interrupted(true)) {
    L241:         ret = M_INTERRUPTED;
    L242:     } else {
    L243:       ThreadBlockInVM tbivm(jt);
    L244:       jt->set_suspend_equivalent();
    L245:       if (millis <= 0) {
    L246:         self->_ParkEvent->park();
    L247:       } else {
    L248:         self->_ParkEvent->park(millis);
    L249:       }
    L250:     }
    L251:     // Return to VM before post-check of interrupt state
    L252:     if (jt->is_interrupted(true)) {
        The comment on L251 is better between L249 and L250 since that
        is where 'tbivm' gets destroyed and you transition back.

        You could have this comment before L252:

               // Must be in VM to safely access interrupt state:

        if you think you really need a comment there.

src/hotspot/share/prims/jvmtiRawMonitor.hpp
    No comments.

src/hotspot/share/runtime/objectMonitor.cpp
    You've moved the is_interrupted() check from after ThreadBlockInVM
    to before it. ThreadBlockInVM can block for a safepoint which widens
    the window for an interrupt to come in after the check on L1272
    and before the thread parks on L1286 or L1288.

    Can this result in an unexpected park() where before we would have
    taken the "Intentionally empty" code path on L1283?

    What I'm worried about is whether we've opened a window where we
    do Object.wait(0) and that wait() is supposed to be interrupted.
    However, we lose that interrupt because it arrives in the now wider
    window between L1272 and L1286 and we never return from the wait(0).

    It is possible that I'm not remembering something about how interrupt()
    interacts with park().

test/hotspot/jtreg/ProblemList.txt
    Thanks for remembering to update the ProblemList.

The only part I'm worried about is ObjectMonitor::wait(). If my worry is
baseless, then thumbs up.

I have a couple of nits above. If you choose to fix those, then I don't
need to see another webrev.
Dan > bug: https://bugs.openjdk.java.net/browse/JDK-8233549 > > In JDK-8229516 I moved the interrupted state of a thread from the > osThread in the VM to the java.lang.Thread instance. In doing that I > overlooked a critical aspect, which is that to access the field of a > Java object the JavaThread must not be in a safepoint-safe state** - > otherwise the oop, and anything referenced there from could be > relocated by the GC whilst the JavaThread is accessing it. This > manifested in a number of tests using JVM TI Agent threads and JVM TI > RawMonitors because the JavaThread's were marked _thread_blocked and > hence safepoint-safe, and we read a non-zero value for the interrupted > field even though we had never been interrupted. > > This problem existed in all the code that checks for interruption when > "waiting": > > - Parker::park (the code underpinning > java.util.concurrent.LockSupport.park()) > > To fix this code I simply deleted a late check of the interrupted > field. The check was not needed because if an interrupt has occurred > then we will find the ParkEvent in a signalled state. > > - ObjectMonitor::wait > > Here the late check of the interrupted state is essential as we reset > the ParkEvent after an earlier check of the interrupted state. But the > fix was simply achieved by moving the check slightly earlier before we > use ThreadBlockInVm to become _thread_blocked. > > - RawMonitor::wait > > This fix was much more involved. The RawMonitor code directly > transitions the JavaThread from _thread_in_Native to _thread_blocked. > This is safe from a safepoint perspective because they are equivalent > safepoint-safe states. To allow access to the interrupted field I have > to transition from native to _thread_in_vm, and that has to be done by > proper thread-state transitions to ensure correct access to the oop > and its fields. Having done that I can then use ThreadBlockInVM for > the transitions to blocked. 
However, as the old code noted it can't > use proper thread-state transitions as this will lead to deadlocks > with the VMThread that can also use RawMonitors when executing various > event callbacks. To deal with that we have to note that the real > constraint is that the JavaThread cannot block at a safepoint whilst > it holds the RawMonitor. Hence the fix was push all the interrupt > checking code and the thread-state transitions to the lowest level of > RawMonitorWait, around the final park() call, after we have enqueued > the waiter and released the monitor. That avoids any deadlock > possibility. > > I also added checks to is_interrupted/interrupted to ensure they are > only called by a thread in a suitable state. This should only be the > VMThread (as a consequence of the Thread.stop implementation occurring > at a safepoint and issuing a JavaThread::interrupt() call to unblock > the target); or a JavaThread that is not _thread_in_native or > _thread_blocked. > > Testing: (still finalizing) > ?- tiers 1 - 6 (Oracle platforms) > ?- Local Linux testing > ? - vmTestbase/nsk/monitoring/ > ? - vmTestbase/nsk/jdwp > ? - vmTestbase/nsk/jdb/ > ? - vmTestbase/nsk/jdi/ > ? - vmTestbase/nsk/jvmti/ > ? - serviceability/jvmti/ > ? - serviceability/jdwp > ? - JDK: java/lang/management > ???????? com/sun/management > > ** Note that this applies to all accesses we make via code in > javaClasses.*. For this particular code I thought about adding a guard > in JavaThread::threadObj() but it turns out when we generate a crash > report we access the Thread's name() field and that can happen when in > any state, so we'd always trigger a secondary assertion failure during > error reporting if we did that. Note that accessing name() can still > easily lead to secondary assertions failures as I discovered when > trying to debug this and print the thread name out - I would see an > is_instance assertion fail checking that the Thread name() is an > instance of java.lang.String! 
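The Parker::park reasoning quoted above (an interrupt leaves the parker in a signalled state, so a late re-check of the interrupted field is unnecessary) can be observed from Java code. The following is a minimal sketch of the Java-level behaviour only, not of the HotSpot Parker/ParkEvent implementation; the class and method names are made up for illustration.

```java
import java.util.concurrent.locks.LockSupport;

// Illustrative only: demonstrates Java-level semantics of interrupt + park.
// InterruptThenParkSketch is a hypothetical name, not part of the JDK or the webrev.
final class InterruptThenParkSketch {

    // Interrupts the current thread and then parks.
    // Returns the time spent inside park(), in nanoseconds.
    static long parkAfterInterruptNanos() {
        Thread.currentThread().interrupt();   // the interrupt arrives before the thread parks
        long start = System.nanoTime();
        LockSupport.park();                   // returns promptly because the thread is interrupted
        long elapsed = System.nanoTime() - start;
        Thread.interrupted();                 // clear the status so the caller is unaffected
        return elapsed;
    }

    public static void main(String[] args) {
        long nanos = parkAfterInterruptNanos();
        System.out.println("park() returned after " + nanos + " ns");
    }
}
```

If park() were able to lose an interrupt that arrived just before it, the call above would block indefinitely; this is the window that the review discussion is concerned with.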
>
> Thanks,
> David
> -----

From chris.plummer at oracle.com Tue Nov 12 19:06:09 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 12 Nov 2019 11:06:09 -0800
Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
In-Reply-To: <42ae6371-5450-9973-eead-b2fa277b13b5@oracle.com>
References: <42ae6371-5450-9973-eead-b2fa277b13b5@oracle.com>
Message-ID:

Thanks Serguei!

Can I get one more review please?

thanks,

Chris

On 11/8/19 4:00 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> This seems to be a good fix to have in any case.
> This check and bail out is the right thing to do and should not break
> anything.
> I understand, this also fixes the test failures.
>
> I only had some experience a long time ago with the support of pstack
> and DTrace jstack action implementation which also does such SP
> recovering because the ebp can be used by JIT compiler as a general
> purpose register. There is no such problem on sparc.
>
> Thanks,
> Serguei
>
>
> On 11/7/19 14:01, Chris Plummer wrote:
>> Hi,
>>
>> Please review the following fix for JDK-8231635:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8231635
>> http://cr.openjdk.java.net/~cjplummer/8231635/webrev.00/
>>
>> I've tried to explain below to the best of my ability what is going
>> on, but keep in mind that I basically had no background in this area
>> before looking into this CR, so this is all new to me. Please feel
>> free to chime in with corrections to my explanation, or any
>> additional insight that might help to further understanding of this
>> code.
>>
>> When doing a thread stack dump, SA has to figure out the SP for the
>> current frame when it may not in fact be stored anywhere. So it goes
>> through a series of guesses, starting with the current value of SP.
>> See AMD64CurrentFrameGuess.run():
>>
>>     Address sp  = context.getRegisterAsAddress(AMD64ThreadContext.RSP);
>>
>> There are a number of checks done to see if this is the SP for the
>> actual current frame, one of the checks being (and kind of a last
>> resort) to follow the frame links and see if they eventually lead to
>> the first entry frame:
>>
>>             while (frame != null) {
>>               if (frame.isEntryFrame() && frame.entryFrameIsFirst()) {
>>                  ...
>>                  return true;
>>               }
>>               frame = frame.sender(map);
>>             }
>>
>> If this fails, there is an outer loop to try the next address:
>>
>>         for (long offset = 0;
>>              offset < regionInBytesToSearch;
>>              offset += vm.getAddressSize()) {
>>
>> Note that offset is added to the initial SP value that was fetched
>> from RSP. This approach is fraught with danger, because SP could be
>> incorrect, and you can easily follow a bad frame link to an invalid
>> address. So the body of this loop is in a try block that catches all
>> Exceptions, and simply retries with the next offset if one is caught.
>> Exceptions could be ones like UnalignedAddressException or
>> UnmappedAddressException.
>>
>> The bug in question turns up with the following harmless looking line:
>>
>>               frame = frame.sender(map);
>>
>> This is fine if you know that "frame" is valid, but what if it is not
>> (which is very commonly the case). The frame values (SP, FP, and PC)
>> in the returned frame could be just about anything, including being
>> the same as the previous frame. This is what will happen if the SP
>> stored in "frame" is the same as the SP that was used to initialize
>> "frame" in the first place. This can certainly happen when SP is not
>> valid to start with, and is indeed what caused this bug. The end
>> result is the inner while loop gets stuck in an infinite loop
>> traversing the same frame.
>> So the fix is to add a check for this to
>> make sure to break out of the while loop if this happens. Initially I
>> did this with an Address.equal() call, and that seemed to fix the
>> problem, but then I realized it would be possible to traverse through
>> one or more sender frames and eventually end up returning to a
>> previously visited frame, thus still an infinite loop. So I decided
>> on checking for Address.lessThanOrEqual() instead since the sender
>> frame's SP should always be greater than the current frame's
>> (referred to as oldFrame) SP. As long as we always move in one
>> direction (towards a higher frame address), you can't have an
>> infinite loop in this code.
>>
>> I applied this fix to x86. Although not tested, it is built (all
>> platform support is always built with SA). The x86 and amd64 versions
>> are identical except for x86/amd64 references, so I thought it best
>> to go ahead and do the update to x86. I did not touch ppc, but would
>> be willing to update if someone passes along a fix that is tested.
>>
>> One final bit of clarification. The bug synopsis mentions getting
>> stuck in BasicTypeDataBase.findDynamicTypeForAddress(). This turns
>> out to not actually be the case, but every stack trace I initially
>> looked at when I filed this CR was showing the thread being in this
>> frame and at the same line number. This appears to be the next
>> available safepoint where the thread can be suspended for stack
>> dumping. When debugging this some more and adding a lot of println()
>> calls in a lot of different locations, I started to see different
>> frames in the stacktrace, presumably because the println() calls
>> were adding additional safepoints.
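As a rough illustration of the check described above, here is a minimal sketch rather than the actual SA code. SketchFrame and reachesFirstEntryFrame are hypothetical names standing in for the SA frame class and the inner loop of AMD64CurrentFrameGuess.run(); plain long values stand in for Address. The only assumption taken from the explanation is that a valid sender frame must have a strictly greater SP than the frame it was reached from.

```java
// Hypothetical stand-in for an SA frame; this is NOT the real SA Frame API.
final class SketchFrame {
    final long sp;              // stack pointer of this frame
    final boolean firstEntry;   // true if this is the first entry frame
    SketchFrame sender;         // caller frame; may be garbage if the SP guess was wrong

    SketchFrame(long sp, boolean firstEntry) {
        this.sp = sp;
        this.firstEntry = firstEntry;
    }
}

final class FrameWalkSketch {

    // Follows sender links from 'start' and reports whether the first entry frame is reached.
    // Bails out as soon as a sender's SP is not strictly greater than the previous frame's SP,
    // so neither a self-referencing frame nor a longer cycle can loop forever.
    static boolean reachesFirstEntryFrame(SketchFrame start) {
        SketchFrame frame = start;
        while (frame != null) {
            if (frame.firstEntry) {
                return true;
            }
            SketchFrame oldFrame = frame;
            frame = frame.sender;
            if (frame != null && frame.sp <= oldFrame.sp) {
                return false;   // not moving towards higher addresses: the SP guess was bad
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SketchFrame bad = new SketchFrame(0x1000L, false);
        bad.sender = bad;       // the situation that caused the hang
        System.out.println("self-referencing frame: " + reachesFirstEntryFrame(bad));

        SketchFrame good = new SketchFrame(0x1000L, false);
        good.sender = new SketchFrame(0x1020L, true);
        System.out.println("valid chain: " + reachesFirstEntryFrame(good));
    }
}
```

Comparing with "less than or equal" rather than plain equality is what covers the case of returning to a previously visited frame after several steps: every accepted step strictly increases SP, so the walk must terminate.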
>> >> thanks, >> >> Chris >> > From alexey.menkov at oracle.com Tue Nov 12 22:31:54 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 12 Nov 2019 14:31:54 -0800 Subject: Ping: Re: RFR: JDK-8231915: two JDI tests interfere with each other In-Reply-To: <252e1fcb-3119-d38c-8ed6-1861228b0421@oracle.com> References: <2bfe41ec-a0e5-73fc-7845-c7ca71dcdd29@oracle.com> <252e1fcb-3119-d38c-8ed6-1861228b0421@oracle.com> Message-ID: <887a5584-b5e3-129f-22fd-aff848b38274@oracle.com> Need one more reviewer. --alex On 11/01/2019 20:59, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It looks good. > > Thanks, > Serguei > > > On 11/1/19 16:54, Alex Menkov wrote: >> Hi all, >> >> please review a small fix for >> https://bugs.openjdk.java.net/browse/JDK-8231915 >> webrev: >> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_test_interference/webrev/ >> >> The fix disables "negative" testing for JdwpListenTest. >> Negative testing is useful during development, but can cause >> interference with JdwpAttachTest (JdwpAttachTest already has negative >> testing disabled) >> >> --alex > From david.holmes at oracle.com Tue Nov 12 22:50:15 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 13 Nov 2019 08:50:15 +1000 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: <05b4ec18-1a93-7d3d-fb17-1ce2f5c27e11@oracle.com> References: <05b4ec18-1a93-7d3d-fb17-1ce2f5c27e11@oracle.com> Message-ID: <96cf9e10-a3df-fc5a-2cfa-ac156c10f99c@oracle.com> Hi Dan, Thanks for taking a look so quickly! On 13/11/2019 3:18 am, Daniel D. Daugherty wrote: > On 11/11/19 11:52 PM, David Holmes wrote: >> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ > > src/hotspot/os/posix/os_posix.cpp > ??? L2078: ? // Can't access interrupt state now we are > _thread_blocked. If we've been > ??? L2079: ? // interrupted since we checked above then _counter will > be > 0. > ??????? nit - grammar. Please consider: > ?? ? ? ? ? ? 
>              // Can't access interrupt state now that we are _thread_blocked. If we've
>              // been interrupted since we checked above then _counter will be > 0.
>
> src/hotspot/os/solaris/os_solaris.cpp
>     L4924:   // Can't access interrupt state now we are _thread_blocked. If we've been
>     L4925:   // interrupted since we checked above then _counter will be > 0.
>         nit - grammar. Please consider:
>              // Can't access interrupt state now that we are _thread_blocked. If we've
>              // been interrupted since we checked above then _counter will be > 0.

Will fix grammatical nits.

> src/hotspot/share/classfile/javaClasses.cpp
>     No comments.
>
> src/hotspot/share/prims/jvmtiEnv.cpp
>     Hmmm... did the "non-JavaThread can't be interrupted" check also get
>     pushed down?
>     Update: Similar check is now in JvmtiRawMonitor::raw_wait().
>
> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>     L239:     ThreadInVMfromNative tivm(jt);
>     L240:     if (jt->is_interrupted(true)) {
>     L241:         ret = M_INTERRUPTED;
>     L242:     } else {
>     L243:       ThreadBlockInVM tbivm(jt);
>     L244:       jt->set_suspend_equivalent();
>     L245:       if (millis <= 0) {
>     L246:         self->_ParkEvent->park();
>     L247:       } else {
>     L248:         self->_ParkEvent->park(millis);
>     L249:       }
>     L250:     }
>     L251:     // Return to VM before post-check of interrupt state
>     L252:     if (jt->is_interrupted(true)) {
>         The comment on L251 is better between L249 and L250 since that
>         is where 'tbivm' gets destroyed and you transition back.
>
>         You could have this comment before L252:
>
>                // Must be in VM to safely access interrupt state:
>
>         if you think you really need a comment there.

Will move comment up as suggested.

> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>     No comments.
>
> src/hotspot/share/runtime/objectMonitor.cpp
>     You've moved the is_interrupted() check from after ThreadBlockInVM
>     to before it. ThreadBlockInVM can block for a safepoint which widens
>     the window for an interrupt to come in after the check on L1272
>     and before the thread parks on L1286 or L1288.
>
>     Can this result in an unexpected park() where before we would have
>     taken the "Intentionally empty" code path on L1283?
>
>     What I'm worried about is whether we've opened a window where we
>     do Object.wait(0) and that wait() is supposed to be interrupted.
>     However, we lose that interrupt because it arrives in the now wider
>     window between L1272 and L1286 and we never return from the wait(0).
>
>     It is possible that I'm not remembering something about how interrupt()
>     interacts with park().

The interrupt() not only sets the field but also issues an unpark() to
the ParkEvent. So if we are interrupted whilst processing through the
TBIVM, the call to park() will return immediately as the ParkEvent will
be in the signalled state.

> test/hotspot/jtreg/ProblemList.txt
>     Thanks for remembering to update the ProblemList.
>
> The only part I'm worried about is ObjectMonitor::wait(). If my worry is
> baseless, then thumbs up.

Worry is baseless :)

> I have a couple of nits above. If you choose to fix those, then I don't
> need to see another webrev.

Thanks again!

David
-----

> Dan
>
>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549
>>
>> In JDK-8229516 I moved the interrupted state of a thread from the
>> osThread in the VM to the java.lang.Thread instance. In doing that I
>> overlooked a critical aspect, which is that to access the field of a
>> Java object the JavaThread must not be in a safepoint-safe state** -
>> otherwise the oop, and anything referenced from it, could be
>> relocated by the GC whilst the JavaThread is accessing it.
>> This manifested in a number of tests using JVM TI Agent threads and JVM TI
>> RawMonitors because the JavaThreads were marked _thread_blocked and
>> hence safepoint-safe, and we read a non-zero value for the interrupted
>> field even though we had never been interrupted.
>>
>> This problem existed in all the code that checks for interruption when
>> "waiting":
>>
>> - Parker::park (the code underpinning
>> java.util.concurrent.LockSupport.park())
>>
>> To fix this code I simply deleted a late check of the interrupted
>> field. The check was not needed because if an interrupt has occurred
>> then we will find the ParkEvent in a signalled state.
>>
>> - ObjectMonitor::wait
>>
>> Here the late check of the interrupted state is essential as we reset
>> the ParkEvent after an earlier check of the interrupted state. But the
>> fix was simply achieved by moving the check slightly earlier, before we
>> use ThreadBlockInVM to become _thread_blocked.
>>
>> - RawMonitor::wait
>>
>> This fix was much more involved. The RawMonitor code directly
>> transitions the JavaThread from _thread_in_native to _thread_blocked.
>> This is safe from a safepoint perspective because they are equivalent
>> safepoint-safe states. To allow access to the interrupted field I have
>> to transition from native to _thread_in_vm, and that has to be done by
>> proper thread-state transitions to ensure correct access to the oop
>> and its fields. Having done that I can then use ThreadBlockInVM for
>> the transitions to blocked. However, as the old code noted, it can't
>> use proper thread-state transitions as this will lead to deadlocks
>> with the VMThread, which can also use RawMonitors when executing various
>> event callbacks. To deal with that we have to note that the real
>> constraint is that the JavaThread cannot block at a safepoint whilst
>> it holds the RawMonitor.
>> Hence the fix was to push all the interrupt
>> checking code and the thread-state transitions to the lowest level of
>> RawMonitorWait, around the final park() call, after we have enqueued
>> the waiter and released the monitor. That avoids any deadlock
>> possibility.
>>
>> I also added checks to is_interrupted/interrupted to ensure they are
>> only called by a thread in a suitable state. This should only be the
>> VMThread (as a consequence of the Thread.stop implementation occurring
>> at a safepoint and issuing a JavaThread::interrupt() call to unblock
>> the target); or a JavaThread that is not _thread_in_native or
>> _thread_blocked.
>>
>> Testing: (still finalizing)
>>  - tiers 1 - 6 (Oracle platforms)
>>  - Local Linux testing
>>   - vmTestbase/nsk/monitoring/
>>   - vmTestbase/nsk/jdwp
>>   - vmTestbase/nsk/jdb/
>>   - vmTestbase/nsk/jdi/
>>   - vmTestbase/nsk/jvmti/
>>   - serviceability/jvmti/
>>   - serviceability/jdwp
>>   - JDK: java/lang/management
>>          com/sun/management
>>
>> ** Note that this applies to all accesses we make via code in
>> javaClasses.*. For this particular code I thought about adding a guard
>> in JavaThread::threadObj() but it turns out when we generate a crash
>> report we access the Thread's name() field and that can happen when in
>> any state, so we'd always trigger a secondary assertion failure during
>> error reporting if we did that. Note that accessing name() can still
>> easily lead to secondary assertion failures, as I discovered when
>> trying to debug this and print the thread name out - I would see an
>> is_instance assertion fail checking that the Thread name() is an
>> instance of java.lang.String!
>>
>> Thanks,
>> David
>> -----
>

From daniil.x.titov at oracle.com Wed Nov 13 01:36:51 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 12 Nov 2019 17:36:51 -0800
Subject: Ping: Re: RFR: JDK-8231915: two JDI tests interfere with each other
In-Reply-To: <9FD3B9A3-CCFD-4DB2-AE80-61CF58DCE036@oracle.com>
References: <2bfe41ec-a0e5-73fc-7845-c7ca71dcdd29@oracle.com>
 <252e1fcb-3119-d38c-8ed6-1861228b0421@oracle.com>
 <9FD3B9A3-CCFD-4DB2-AE80-61CF58DCE036@oracle.com>
Message-ID:

Hi Alex,

The fix looks good to me.

Thanks!
--Daniil

On 11/12/19, 2:32 PM, "serviceability-dev on behalf of Alex Menkov" wrote:

Need one more reviewer.

--alex

On 11/01/2019 20:59, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> It looks good.
>
> Thanks,
> Serguei
>
>
> On 11/1/19 16:54, Alex Menkov wrote:
>> Hi all,
>>
>> please review a small fix for
>> https://bugs.openjdk.java.net/browse/JDK-8231915
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp_test_interference/webrev/
>>
>> The fix disables "negative" testing for JdwpListenTest.
>> Negative testing is useful during development, but can cause
>> interference with JdwpAttachTest (JdwpAttachTest already has negative
>> testing disabled)
>>
>> --alex
>

From daniil.x.titov at oracle.com Wed Nov 13 05:08:07 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 12 Nov 2019 21:08:07 -0800
Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
In-Reply-To: <2F27AC9C-4848-43D1-9522-207A103A07A2@oracle.com>
References: <42ae6371-5450-9973-eead-b2fa277b13b5@oracle.com>
 <2F27AC9C-4848-43D1-9522-207A103A07A2@oracle.com>
Message-ID: <27B80F27-C4FC-4115-9D31-56FE141128DF@oracle.com>

Hi Chris,

The change looks good to me.

Thanks!
--Daniil

On 11/12/19, 11:06 AM, "serviceability-dev on behalf of Chris Plummer" wrote:

Thanks Serguei!

Can I get one more review please?

thanks,

Chris

On 11/8/19 4:00 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> This seems to be a good fix to have in any case.
> This check and bail out is the right thing to do and should not break
> anything.
> I understand, this also fixes the test failures.
>
> I only had some experience a long time ago with the support of pstack
> and the DTrace jstack action implementation, which also does such SP
> recovering because the ebp can be used by the JIT compiler as a general
> purpose register. There is no such problem on sparc.
>
> Thanks,
> Serguei
>
>
> On 11/7/19 14:01, Chris Plummer wrote:
>> Hi,
>>
>> Please review the following fix for JDK-8231635:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8231635
>> http://cr.openjdk.java.net/~cjplummer/8231635/webrev.00/
>>
>> I've tried to explain below to the best of my ability what is going
>> on, but keep in mind that I basically had no background in this area
>> before looking into this CR, so this is all new to me. Please feel
>> free to chime in with corrections to my explanation, or any
>> additional insight that might help to further understanding of this
>> code.
>>
>> When doing a thread stack dump, SA has to figure out the SP for the
>> current frame when it may not in fact be stored anywhere. So it goes
>> through a series of guesses, starting with the current value of SP.
>> See AMD64CurrentFrameGuess.run():
>>
>>     Address sp = context.getRegisterAsAddress(AMD64ThreadContext.RSP);
>>
>> There are a number of checks done to see if this is the SP for the
>> actual current frame, one of the checks being (and kind of a last
>> resort) to follow the frame links and see if they eventually lead to
>> the first entry frame:
>>
>>     while (frame != null) {
>>       if (frame.isEntryFrame() && frame.entryFrameIsFirst()) {
>>         ...
>>         return true;
>>       }
>>       frame = frame.sender(map);
>>     }
>>
>> If this fails, there is an outer loop to try the next address:
>>
>>     for (long offset = 0;
>>          offset < regionInBytesToSearch;
>>          offset += vm.getAddressSize()) {
>>
>> Note that offset is added to the initial SP value that was fetched
>> from RSP. This approach is fraught with danger, because SP could be
>> incorrect, and you can easily follow a bad frame link to an invalid
>> address. So the body of this loop is in a try block that catches all
>> Exceptions, and simply retries with the next offset if one is caught.
>> Exceptions could be ones like UnalignedAddressException or
>> UnmappedAddressException.
>>
>> The bug in question turns up with the following harmless-looking line:
>>
>>     frame = frame.sender(map);
>>
>> This is fine if you know that "frame" is valid, but what if it is not
>> (which is very commonly the case)? The frame values (SP, FP, and PC)
>> in the returned frame could be just about anything, including being
>> the same as the previous frame. This is what will happen if the SP
>> stored in "frame" is the same as the SP that was used to initialize
>> "frame" in the first place. This can certainly happen when SP is not
>> valid to start with, and is indeed what caused this bug. The end
>> result is the inner while loop gets stuck in an infinite loop
>> traversing the same frame. So the fix is to add a check for this to
>> make sure to break out of the while loop if this happens. Initially I
>> did this with an Address.equal() call, and that seemed to fix the
>> problem, but then I realized it would be possible to traverse through
>> one or more sender frames and eventually end up returning to a
>> previously visited frame, thus still an infinite loop. So I decided
>> on checking for Address.lessThanOrEqual() instead since the sender
>> frame's SP should always be greater than the current frame's
>> (referred to as oldFrame) SP.
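
The termination argument in the quoted explanation can be illustrated with a minimal, self-contained Java sketch. It does not use the real SA classes: FrameWalkSketch, leadsToEntryFrame, the senderOf map and NO_SENDER are hypothetical stand-ins for Frame.sender() and a frame's SP, and the actual change in AMD64CurrentFrameGuess may well look different. The sketch only shows why rejecting a sender whose SP is less than or equal to the current SP rules out both the single-frame self-loop and longer cycles.

```java
import java.util.Map;

public class FrameWalkSketch {
    // Hypothetical marker meaning "this frame has no sender".
    static final long NO_SENDER = -1L;

    /**
     * Follows sender links starting at startSp and reports whether they reach entrySp.
     * senderOf is a hypothetical model of (possibly garbage) stack memory:
     * it maps a frame's SP to the SP of its sender.
     */
    static boolean leadsToEntryFrame(long startSp, long entrySp, Map<Long, Long> senderOf) {
        long sp = startSp;
        while (sp != NO_SENDER) {
            if (sp == entrySp) {
                return true; // reached the first entry frame, so the guess is plausible
            }
            long senderSp = senderOf.getOrDefault(sp, NO_SENDER);
            // A valid sender must live at a strictly higher address. If it does not,
            // the links are bogus and we bail out instead of risking an endless walk.
            if (senderSp != NO_SENDER && senderSp <= sp) {
                return false;
            }
            sp = senderSp;
        }
        return false;
    }

    public static void main(String[] args) {
        // Well-formed chain: 0x1000 -> 0x1040 -> 0x1080 (entry frame).
        Map<Long, Long> good = Map.of(0x1000L, 0x1040L, 0x1040L, 0x1080L);
        // Bad guess whose "sender" is the same frame again.
        Map<Long, Long> selfLoop = Map.of(0x1000L, 0x1000L);
        // Bad guess that returns to an earlier frame after one hop.
        Map<Long, Long> cycle = Map.of(0x1000L, 0x1040L, 0x1040L, 0x1000L);

        System.out.println(leadsToEntryFrame(0x1000L, 0x1080L, good));
        System.out.println(leadsToEntryFrame(0x1000L, 0x1080L, selfLoop));
        System.out.println(leadsToEntryFrame(0x1000L, 0x1080L, cycle));
    }
}
```

An equality-only check would catch the selfLoop case but not the cycle case, which is the reason given above for preferring the less-than-or-equal comparison.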
>> As long as we always move in one
>> direction (towards a higher frame address), you can't have an
>> infinite loop in this code.
>>
>> I applied this fix to x86. Although not tested, it is built (all
>> platform support is always built with SA). The x86 and amd64 versions
>> are identical except for x86/amd64 references, so I thought it best
>> to go ahead and do the update to x86. I did not touch ppc, but would
>> be willing to update if someone passes along a fix that is tested.
>>
>> One final bit of clarification. The bug synopsis mentions getting
>> stuck in BasicTypeDataBase.findDynamicTypeForAddress(). This turns
>> out to not actually be the case, but every stack trace I initially
>> looked at when I filed this CR was showing the thread being in this
>> frame and at the same line number. This appears to be the next
>> available safepoint where the thread can be suspended for stack
>> dumping. When debugging this some more and adding a lot of println()
>> calls in a lot of different locations, I started to see different
>> frames in the stacktrace, presumably because the println() calls
>> were adding additional safepoints.
>>
>> thanks,
>>
>> Chris
>>
>

From richard.reingruber at sap.com Wed Nov 13 12:24:41 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 13 Nov 2019 12:24:41 +0000
Subject: RFC 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement
In-Reply-To: <729138cc-7a21-cf79-947c-c6a68f34237a@oracle.com>
References: <729138cc-7a21-cf79-947c-c6a68f34237a@oracle.com>
Message-ID:

Hi Leonid,

these are valid points. Thanks for making me aware of them.

I've increased the maximum heap size in my tests as suggested, and I've also run them with ZGC enabled.

I've also added the vm.opt.TieredCompilation != true requirement.

I've done the changes in place.

Thanks, Richard.

-----Original Message-----
From: hotspot-compiler-dev On Behalf Of Leonid Mesnik
Sent: Tuesday, 12 November 2019 20:34
To: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFC 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement

Hi

I haven't done a complete review; I just sanity-checked your test headers. I see a couple of potential issues with them.

1) Using Xmx32M could cause OOME failures if the test is executed with ZGC. I think that at least 256M should be set. Could you please verify that your tests pass with ZGC enabled?

2) I think it makes sense to add requires

vm.opt.TieredCompilation != true

to just skip tests if anyone runs them with tiered compilation disabled explicitly.

Leonid

On 11/11/19 7:29 AM, Reingruber, Richard wrote:
> Hi,
>
> I have created https://bugs.openjdk.java.net/browse/JDK-8233915
>
> In short, a set of live objects L is not found using JVMTI FollowReferences() if L is only reachable
> from a scalar replaced object in a frame of a C2 compiled method. If L happens to be a growing leak,
> then a dynamically loaded JVMTI agent (note: can_tag_objects is an always capability) for heap
> diagnostics won't discover L as live and it won't be able to find root references that lead to L.
>
> I'd like to suggest the implementation for the proposed enhancement JDK-8227745 as bug-fix.
>
> RFE: https://bugs.openjdk.java.net/browse/JDK-8227745
> Webrev(*): http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.1/
>
> Please comment on the suggestion. Do you see other solutions that allow an agent to discover the
> chain of references to L?
>
> I'd like to work on the complexity as well. One significant simplification could be, if it was
> possible to reallocate scalar replaced objects at safepoints (i.e. allow the VM thread to call
> Deoptimization::realloc_objects()). The GC interface does not seem to allow this.
>
> Thanks, Richard.
>
> (*) Not yet accepted, because deemed too complex for the performance gain.
> Note that I was able to
> reduce webrev.1 in size compared to webrev.0

From daniel.daugherty at oracle.com Wed Nov 13 14:17:58 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 13 Nov 2019 09:17:58 -0500
Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
In-Reply-To: <96cf9e10-a3df-fc5a-2cfa-ac156c10f99c@oracle.com>
References: <05b4ec18-1a93-7d3d-fb17-1ce2f5c27e11@oracle.com>
 <96cf9e10-a3df-fc5a-2cfa-ac156c10f99c@oracle.com>
Message-ID: <89859c60-97bd-76f0-74dc-3c74fb919d3b@oracle.com>

On 11/12/19 5:50 PM, David Holmes wrote:
> Hi Dan,
>
> Thanks for taking a look so quickly!

You're welcome! I figured you would prefer to get this one out of the way quickly.

>
> On 13/11/2019 3:18 am, Daniel D. Daugherty wrote:
>> On 11/11/19 11:52 PM, David Holmes wrote:
>>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
>>
>> src/hotspot/os/posix/os_posix.cpp
>>      L2078:   // Can't access interrupt state now we are _thread_blocked. If we've been
>>      L2079:   // interrupted since we checked above then _counter will be > 0.
>>          nit - grammar. Please consider:
>>               // Can't access interrupt state now that we are _thread_blocked. If we've
>>               // been interrupted since we checked above then _counter will be > 0.
>>
>> src/hotspot/os/solaris/os_solaris.cpp
>>      L4924:   // Can't access interrupt state now we are _thread_blocked. If we've been
>>      L4925:   // interrupted since we checked above then _counter will be > 0.
>>          nit - grammar. Please consider:
>>               // Can't access interrupt state now that we are _thread_blocked. If we've
>>               // been interrupted since we checked above then _counter will be > 0.
>
> Will fix grammatical nits.
>
>> src/hotspot/share/classfile/javaClasses.cpp
>>      No comments.
>>
>> src/hotspot/share/prims/jvmtiEnv.cpp
>>      Hmmm...
did the "non-JavaThread can't be interrupted" check also get
>>      pushed down?
>>      Update: Similar check is now in JvmtiRawMonitor::raw_wait().
>>
>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>      L239:     ThreadInVMfromNative tivm(jt);
>>      L240:     if (jt->is_interrupted(true)) {
>>      L241:         ret = M_INTERRUPTED;
>>      L242:     } else {
>>      L243:       ThreadBlockInVM tbivm(jt);
>>      L244:       jt->set_suspend_equivalent();
>>      L245:       if (millis <= 0) {
>>      L246:         self->_ParkEvent->park();
>>      L247:       } else {
>>      L248:         self->_ParkEvent->park(millis);
>>      L249:       }
>>      L250:     }
>>      L251:     // Return to VM before post-check of interrupt state
>>      L252:     if (jt->is_interrupted(true)) {
>>          The comment on L251 is better between L249 and L250 since that
>>          is where 'tbivm' gets destroyed and you transition back.
>>
>>          You could have this comment before L252:
>>
>>                 // Must be in VM to safely access interrupt state:
>>
>>          if you think you really need a comment there.
>
> Will move comment up as suggested.
>
>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>      No comments.
>>
>> src/hotspot/share/runtime/objectMonitor.cpp
>>      You've moved the is_interrupted() check from after ThreadBlockInVM
>>      to before it. ThreadBlockInVM can block for a safepoint which widens
>>      the window for an interrupt to come in after the check on L1272
>>      and before the thread parks on L1286 or L1288.
>>
>>      Can this result in an unexpected park() where before we would have
>>      taken the "Intentionally empty" code path on L1283?
>>
>>      What I'm worried about is whether we've opened a window where we
>>      do Object.wait(0) and that wait() is supposed to be interrupted.
>>      However, we lose that interrupt because it arrives in the now wider
>>      window between L1272 and L1286 and we never return from the wait(0).
>>
>>      It is possible that I'm not remembering something about how interrupt()
>>      interacts with park().
>
> The interrupt() not only sets the field but also issues an unpark() to
> the ParkEvent. So if we are interrupted whilst processing through the
> TBIVM, the call to park() will return immediately as the ParkEvent
> will be in the signalled state.

That was the piece I wasn't remembering. Thanks for filling in the detail.

>
>> test/hotspot/jtreg/ProblemList.txt
>>      Thanks for remembering to update the ProblemList.
>>
>> The only part I'm worried about is ObjectMonitor::wait(). If my worry is
>> baseless, then thumbs up.
>
> Worry is baseless :)

Agreed!

>
>> I have a couple of nits above. If you choose to fix those, then I don't
>> need to see another webrev.
>
> Thanks again!

You're welcome.

Dan

>
> David
> -----
>
>> Dan
>>
>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549
>>>
>>> In JDK-8229516 I moved the interrupted state of a thread from the
>>> osThread in the VM to the java.lang.Thread instance. In doing that I
>>> overlooked a critical aspect, which is that to access the field of a
>>> Java object the JavaThread must not be in a safepoint-safe state** -
>>> otherwise the oop, and anything referenced from it, could be
>>> relocated by the GC whilst the JavaThread is accessing it. This
>>> manifested in a number of tests using JVM TI Agent threads and JVM
>>> TI RawMonitors because the JavaThreads were marked _thread_blocked
>>> and hence safepoint-safe, and we read a non-zero value for the
>>> interrupted field even though we had never been interrupted.
>>>
>>> This problem existed in all the code that checks for interruption
>>> when "waiting":
>>>
>>> - Parker::park (the code underpinning
>>> java.util.concurrent.LockSupport.park())
>>>
>>> To fix this code I simply deleted a late check of the interrupted
>>> field.
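
The point made in this thread, that an interrupt delivered before the thread parks still makes park() return at once, can be observed from plain Java through LockSupport, which the message above names as sitting on top of Parker::park. The following is a minimal sketch under that assumption: InterruptParkSketch and parkReturnsPromptlyWhenInterruptedFirst are made-up names, and the code demonstrates the Java-level LockSupport behaviour only, not the ParkEvent path inside ObjectMonitor::wait that the review discusses.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class InterruptParkSketch {
    /**
     * Interrupts the current thread first and parks afterwards.
     * Returns true if the park came back quickly and the interrupt status was still set.
     */
    static boolean parkReturnsPromptlyWhenInterruptedFirst() {
        // The interrupt "arrives" before we reach park(), like the window discussed above.
        Thread.currentThread().interrupt();

        long start = System.nanoTime();
        // Without the pending interrupt this would block for about five seconds.
        LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(5));
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        // parkNanos() leaves the interrupt status alone; Thread.interrupted() reads and clears it.
        boolean wasInterrupted = Thread.interrupted();
        return wasInterrupted && elapsedMillis < 1000;
    }

    public static void main(String[] args) {
        System.out.println(parkReturnsPromptlyWhenInterruptedFirst());
    }
}
```

A timed park is used here only so that the sketch cannot hang if run in an unexpected environment; the untimed LockSupport.park() behaves the same way with respect to a pending interrupt.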
>>> The check was not needed because if an interrupt has occurred
>>> then we will find the ParkEvent in a signalled state.
>>>
>>> - ObjectMonitor::wait
>>>
>>> Here the late check of the interrupted state is essential as we
>>> reset the ParkEvent after an earlier check of the interrupted state.
>>> But the fix was simply achieved by moving the check slightly earlier,
>>> before we use ThreadBlockInVM to become _thread_blocked.
>>>
>>> - RawMonitor::wait
>>>
>>> This fix was much more involved. The RawMonitor code directly
>>> transitions the JavaThread from _thread_in_native to
>>> _thread_blocked. This is safe from a safepoint perspective because
>>> they are equivalent safepoint-safe states. To allow access to the
>>> interrupted field I have to transition from native to _thread_in_vm,
>>> and that has to be done by proper thread-state transitions to ensure
>>> correct access to the oop and its fields. Having done that I can
>>> then use ThreadBlockInVM for the transitions to blocked. However, as
>>> the old code noted, it can't use proper thread-state transitions as
>>> this will lead to deadlocks with the VMThread, which can also use
>>> RawMonitors when executing various event callbacks. To deal with
>>> that we have to note that the real constraint is that the JavaThread
>>> cannot block at a safepoint whilst it holds the RawMonitor. Hence
>>> the fix was to push all the interrupt checking code and the
>>> thread-state transitions to the lowest level of RawMonitorWait,
>>> around the final park() call, after we have enqueued the waiter and
>>> released the monitor. That avoids any deadlock possibility.
>>>
>>> I also added checks to is_interrupted/interrupted to ensure they are
>>> only called by a thread in a suitable state.
>>> This should only be the
>>> VMThread (as a consequence of the Thread.stop implementation
>>> occurring at a safepoint and issuing a JavaThread::interrupt() call
>>> to unblock the target); or a JavaThread that is not
>>> _thread_in_native or _thread_blocked.
>>>
>>> Testing: (still finalizing)
>>>  - tiers 1 - 6 (Oracle platforms)
>>>  - Local Linux testing
>>>   - vmTestbase/nsk/monitoring/
>>>   - vmTestbase/nsk/jdwp
>>>   - vmTestbase/nsk/jdb/
>>>   - vmTestbase/nsk/jdi/
>>>   - vmTestbase/nsk/jvmti/
>>>   - serviceability/jvmti/
>>>   - serviceability/jdwp
>>>   - JDK: java/lang/management
>>>          com/sun/management
>>>
>>> ** Note that this applies to all accesses we make via code in
>>> javaClasses.*. For this particular code I thought about adding a
>>> guard in JavaThread::threadObj() but it turns out when we generate a
>>> crash report we access the Thread's name() field and that can happen
>>> when in any state, so we'd always trigger a secondary assertion
>>> failure during error reporting if we did that. Note that accessing
>>> name() can still easily lead to secondary assertion failures, as I
>>> discovered when trying to debug this and print the thread name out -
>>> I would see an is_instance assertion fail checking that the Thread
>>> name() is an instance of java.lang.String!
>>>
>>> Thanks,
>>> David
>>> -----
>>

From leonid.mesnik at oracle.com Wed Nov 13 16:42:18 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Wed, 13 Nov 2019 08:42:18 -0800
Subject: RFC 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement
In-Reply-To:
References: <729138cc-7a21-cf79-947c-c6a68f34237a@oracle.com>
Message-ID:

Thank you for fixing this.

Leonid

On 11/13/19 4:24 AM, Reingruber, Richard wrote:
> Hi Leonid,
>
> these are valid points. Thanks for making me aware of them.
>
> I've increased the maximum heap size in my tests as suggested, and I've also run them with ZGC
> enabled.
>
> I've also added the vm.opt.TieredCompilation != true requirement.
>
> I've done the changes in place.
>
> Thanks, Richard.
>
> -----Original Message-----
> From: hotspot-compiler-dev On Behalf Of Leonid Mesnik
> Sent: Tuesday, 12 November 2019 20:34
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFC 8233915: JVMTI FollowReferences: Java Heap Leak not found because of C2 Scalar Replacement
>
> Hi
>
> I haven't done a complete review; I just sanity-checked your test headers. I
> see a couple of potential issues with them.
>
> 1) Using Xmx32M could cause OOME failures if the test is executed with
> ZGC. I think that at least 256M should be set. Could you please verify
> that your tests pass with ZGC enabled?
>
>
> 2) I think it makes sense to add requires
>
> vm.opt.TieredCompilation != true
>
> to just skip tests if anyone runs them with tiered compilation disabled
> explicitly.
>
> Leonid
>
> On 11/11/19 7:29 AM, Reingruber, Richard wrote:
>> Hi,
>>
>> I have created https://bugs.openjdk.java.net/browse/JDK-8233915
>>
>> In short, a set of live objects L is not found using JVMTI FollowReferences() if L is only reachable
>> from a scalar replaced object in a frame of a C2 compiled method. If L happens to be a growing leak,
>> then a dynamically loaded JVMTI agent (note: can_tag_objects is an always capability) for heap
>> diagnostics won't discover L as live and it won't be able to find root references that lead to L.
>>
>> I'd like to suggest the implementation for the proposed enhancement JDK-8227745 as bug-fix.
>>
>> RFE: https://bugs.openjdk.java.net/browse/JDK-8227745
>> Webrev(*): http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.1/
>>
>> Please comment on the suggestion. Do you see other solutions that allow an agent to discover the
>> chain of references to L?
>>
>> I'd like to work on the complexity as well. One significant simplification could be, if it was
>> possible to reallocate scalar replaced objects at safepoints (i.e.
>> allow the VM thread to call
>> Deoptimization::realloc_objects()). The GC interface does not seem to allow this.
>>
>> Thanks, Richard.
>>
>> (*) Not yet accepted, because deemed too complex for the performance gain. Note that I was able to
>> reduce webrev.1 in size compared to webrev.0

From chris.plummer at oracle.com Wed Nov 13 18:55:30 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 13 Nov 2019 10:55:30 -0800
Subject: RFR(S): JDK-8231635: SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
In-Reply-To: <27B80F27-C4FC-4115-9D31-56FE141128DF@oracle.com>
References: <42ae6371-5450-9973-eead-b2fa277b13b5@oracle.com>
 <2F27AC9C-4848-43D1-9522-207A103A07A2@oracle.com>
 <27B80F27-C4FC-4115-9D31-56FE141128DF@oracle.com>
Message-ID: <7e1e5139-0700-3e8f-e410-e111d05767d0@oracle.com>

Thanks Daniil!

Chris

On 11/12/19 9:08 PM, Daniil Titov wrote:
> Hi Chris,
>
> The change looks good to me.
>
> Thanks!
> --Daniil
>
> On 11/12/19, 11:06 AM, "serviceability-dev on behalf of Chris Plummer" wrote:
>
> Thanks Serguei!
>
> Can I get one more review please?
>
> thanks,
>
> Chris
>
> On 11/8/19 4:00 PM, serguei.spitsyn at oracle.com wrote:
> > Hi Chris,
> >
> > This seems to be a good fix to have in any case.
> > This check and bail out is the right thing to do and should not break
> > anything.
> > I understand, this also fixes the test failures.
> >
> > I only had some experience a long time ago with the support of pstack
> > and the DTrace jstack action implementation, which also does such SP
> > recovering because the ebp can be used by the JIT compiler as a general
> > purpose register. There is no such problem on sparc.
> >
> > Thanks,
> > Serguei
> >
> >
> > On 11/7/19 14:01, Chris Plummer wrote:
> >> Hi,
> >>
> >> Please review the following fix for JDK-8231635:
> >>
> >> https://bugs.openjdk.java.net/browse/JDK-8231635
> >> http://cr.openjdk.java.net/~cjplummer/8231635/webrev.00/
> >>
> >> I've tried to explain below to the best of my ability what is going
> >> on, but keep in mind that I basically had no background in this area
> >> before looking into this CR, so this is all new to me. Please feel
> >> free to chime in with corrections to my explanation, or any
> >> additional insight that might help to further understanding of this
> >> code.
> >>
> >> When doing a thread stack dump, SA has to figure out the SP for the
> >> current frame when it may not in fact be stored anywhere. So it goes
> >> through a series of guesses, starting with the current value of SP.
> >> See AMD64CurrentFrameGuess.run():
> >>
> >>     Address sp = context.getRegisterAsAddress(AMD64ThreadContext.RSP);
> >>
> >> There are a number of checks done to see if this is the SP for the
> >> actual current frame, one of the checks being (and kind of a last
> >> resort) to follow the frame links and see if they eventually lead to
> >> the first entry frame:
> >>
> >>     while (frame != null) {
> >>       if (frame.isEntryFrame() && frame.entryFrameIsFirst()) {
> >>         ...
> >>         return true;
> >>       }
> >>       frame = frame.sender(map);
> >>     }
> >>
> >> If this fails, there is an outer loop to try the next address:
> >>
> >>     for (long offset = 0;
> >>          offset < regionInBytesToSearch;
> >>          offset += vm.getAddressSize()) {
> >>
> >> Note that offset is added to the initial SP value that was fetched
> >> from RSP. This approach is fraught with danger, because SP could be
> >> incorrect, and you can easily follow a bad frame link to an invalid
> >> address. So the body of this loop is in a try block that catches all
> >> Exceptions, and simply retries with the next offset if one is caught.
> >> Exceptions could be ones like UnalignedAddressException or > >> UnmappedAddressException. > >> > >> The bug in question turns up with the following harmless-looking line: > >> > >> frame = frame.sender(map); > >> > >> This is fine if you know that "frame" is valid, but what if it is not > >> (which is very commonly the case)? The frame values (SP, FP, and PC) > >> in the returned frame could be just about anything, including being > >> the same as the previous frame. This is what will happen if the SP > >> stored in "frame" is the same as the SP that was used to initialize > >> "frame" in the first place. This can certainly happen when SP is not > >> valid to start with, and is indeed what caused this bug. The end > >> result is the inner while loop gets stuck in an infinite loop > >> traversing the same frame. So the fix is to add a check for this to > >> make sure to break out of the while loop if this happens. Initially I > >> did this with an Address.equal() call, and that seemed to fix the > >> problem, but then I realized it would be possible to traverse through > >> one or more sender frames and eventually end up returning to a > >> previously visited frame, thus still an infinite loop. So I decided > >> on checking for Address.lessThanOrEqual() instead since the sender > >> frame's SP should always be greater than the current frame's > >> (referred to as oldFrame) SP. As long as we always move in one > >> direction (towards a higher frame address), you can't have an > >> infinite loop in this code. > >> > >> I applied this fix to x86. Although not tested, it is built (all > >> platform support is always built with SA). The x86 and amd64 versions > >> are identical except for x86/amd64 references, so I thought it best > >> to go ahead and do the update to x86. I did not touch ppc, but would > >> be willing to update if someone passes along a fix that is tested. > >> > >> One final bit of clarification.
The bug synopsis mentions getting > >> stuck in BasicTypeDataBase.findDynamicTypeForAddress(). This turns > >> out to not actually be the case, but every stack trace I initially > >> looked at when I filed this CR was showing the thread being in this > >> frame and at the same line number. This appears to be the next > >> available safepoint where the thread can be suspended for stack > >> dumping. When debugging this some more and adding a lot of println() > >> calls in a lot of different locations, I started to see different > >> frames in the stacktrace, presumably because the println() calls > >> were adding additional safepoints. > >> > >> thanks, > >> > >> Chris > >> > > > > > > > > From serguei.spitsyn at oracle.com Thu Nov 14 15:55:34 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 14 Nov 2019 07:55:34 -0800 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: References: Message-ID: Hi David, Just wanted to let you know I'm reviewing this. Thanks, Serguei On 11/11/19 20:52, David Holmes wrote: > webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ > bug: https://bugs.openjdk.java.net/browse/JDK-8233549 > > In JDK-8229516 I moved the interrupted state of a thread from the > osThread in the VM to the java.lang.Thread instance. In doing that I > overlooked a critical aspect, which is that to access the field of a > Java object the JavaThread must not be in a safepoint-safe state** - > otherwise the oop, and anything referenced therefrom, could be > relocated by the GC whilst the JavaThread is accessing it. This > manifested in a number of tests using JVM TI Agent threads and JVM TI > RawMonitors because the JavaThreads were marked _thread_blocked and > hence safepoint-safe, and we read a non-zero value for the interrupted > field even though we had never been interrupted.
> > This problem existed in all the code that checks for interruption when > "waiting": > > - Parker::park (the code underpinning > java.util.concurrent.LockSupport.park()) > > To fix this code I simply deleted a late check of the interrupted > field. The check was not needed because if an interrupt has occurred > then we will find the ParkEvent in a signalled state. > > - ObjectMonitor::wait > > Here the late check of the interrupted state is essential as we reset > the ParkEvent after an earlier check of the interrupted state. But the > fix was simply achieved by moving the check slightly earlier before we > use ThreadBlockInVM to become _thread_blocked. > > - RawMonitor::wait > > This fix was much more involved. The RawMonitor code directly > transitions the JavaThread from _thread_in_native to _thread_blocked. > This is safe from a safepoint perspective because they are equivalent > safepoint-safe states. To allow access to the interrupted field I have > to transition from native to _thread_in_vm, and that has to be done by > proper thread-state transitions to ensure correct access to the oop > and its fields. Having done that I can then use ThreadBlockInVM for > the transitions to blocked. However, as the old code noted it can't > use proper thread-state transitions as this will lead to deadlocks > with the VMThread that can also use RawMonitors when executing various > event callbacks. To deal with that we have to note that the real > constraint is that the JavaThread cannot block at a safepoint whilst > it holds the RawMonitor. Hence the fix was to push all the interrupt > checking code and the thread-state transitions to the lowest level of > RawMonitorWait, around the final park() call, after we have enqueued > the waiter and released the monitor. That avoids any deadlock > possibility. > > I also added checks to is_interrupted/interrupted to ensure they are > only called by a thread in a suitable state.
This should only be the > VMThread (as a consequence of the Thread.stop implementation occurring > at a safepoint and issuing a JavaThread::interrupt() call to unblock > the target); or a JavaThread that is not _thread_in_native or > _thread_blocked. > > Testing: (still finalizing) > - tiers 1 - 6 (Oracle platforms) > - Local Linux testing > - vmTestbase/nsk/monitoring/ > - vmTestbase/nsk/jdwp > - vmTestbase/nsk/jdb/ > - vmTestbase/nsk/jdi/ > - vmTestbase/nsk/jvmti/ > - serviceability/jvmti/ > - serviceability/jdwp > - JDK: java/lang/management > com/sun/management > > ** Note that this applies to all accesses we make via code in > javaClasses.*. For this particular code I thought about adding a guard > in JavaThread::threadObj() but it turns out when we generate a crash > report we access the Thread's name() field and that can happen when in > any state, so we'd always trigger a secondary assertion failure during > error reporting if we did that. Note that accessing name() can still > easily lead to secondary assertion failures as I discovered when > trying to debug this and print the thread name out - I would see an > is_instance assertion fail checking that the Thread name() is an > instance of java.lang.String! > > Thanks, > David > ----- From serguei.spitsyn at oracle.com Thu Nov 14 18:04:19 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 14 Nov 2019 10:04:19 -0800 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: References: Message-ID: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> An HTML attachment was scrubbed...
URL: From david.holmes at oracle.com Thu Nov 14 22:21:39 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Nov 2019 08:21:39 +1000 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> References: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> Message-ID: <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> Hi Serguei, Thanks for taking a look. On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote: > Hi David, > > It looks good to me. > A couple of nits below. > > http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html > > 236 if (self->is_Java_thread()) { > 237 JavaThread* jt = (JavaThread*) self; > 238 // Transition to VM so we can check interrupt state > 239 ThreadInVMfromNative tivm(jt); > 240 if (jt->is_interrupted(true)) { > 241 ret = M_INTERRUPTED; > 242 } else { > 243 ThreadBlockInVM tbivm(jt); > 244 jt->set_suspend_equivalent(); > 245 if (millis <= 0) { > 246 self->_ParkEvent->park(); > 247 } else { > 248 self->_ParkEvent->park(millis); > 249 } > 250 } > 251 // Return to VM before post-check of interrupt state > 252 if (jt->is_interrupted(true)) { > 253 ret = M_INTERRUPTED; > 254 } > 255 } else { > > > It seems the fragment at lines 251-254 needs to be before line 250. > It will add more clarity to this code. No, it has to be after line 250 as that is when we will hit the TBIVM destructor and so return to _thread_in_vm which is the state needed to read the interrupted field.
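(Illustrative aside, not part of the webrev.) The scoping point can be sketched in isolation. This is a minimal, self-contained C++ sketch under stated assumptions, not the HotSpot code: SketchThread, ScopedState, check_interrupted and wait_sketch are hypothetical names standing in for JavaThread, the ThreadInVMfromNative/ThreadBlockInVM transition objects, is_interrupted() and the wait path. It only shows that a scoped object's destructor runs at the closing brace of its block, so a check that requires the in-VM state has to come after that brace.

```cpp
#include <cassert>

// Hypothetical stand-ins for the thread states discussed above.
enum class State { InNative, InVM, Blocked };

struct SketchThread {
    State state = State::InNative;
    bool interrupted = false;
};

const int SKETCH_OK = 0;
const int SKETCH_INTERRUPTED = 1;

// Scoped transition: enters the requested state on construction and
// restores the previous state when the enclosing block is exited.
class ScopedState {
    SketchThread& t_;
    State saved_;
public:
    ScopedState(SketchThread& t, State to) : t_(t), saved_(t.state) { t_.state = to; }
    ~ScopedState() { t_.state = saved_; }
};

// Assumption of this sketch: the flag may only be read while "in VM".
bool check_interrupted(SketchThread& t) {
    assert(t.state == State::InVM);
    return t.interrupted;
}

int wait_sketch(SketchThread& t, bool interrupt_while_parked) {
    int ret = SKETCH_OK;
    ScopedState in_vm(t, State::InVM);            // plays the role of tivm
    if (check_interrupted(t)) {
        ret = SKETCH_INTERRUPTED;
    } else {
        ScopedState blocked(t, State::Blocked);   // plays the role of tbivm
        if (interrupt_while_parked) {
            t.interrupted = true;                 // simulate an interrupt during park()
        }
        // Calling check_interrupted(t) here would trip the assert:
        // the thread is still in the Blocked state inside this block.
    }   // destructor of 'blocked' runs here, state is InVM again
    if (check_interrupted(t)) {                   // post-check, legal only after the brace
        ret = SKETCH_INTERRUPTED;
    }
    return ret;
}
```

Moving the post-check inside the else block would mean reading the flag while the sketch thread is still Blocked, which is the situation the real assertion in is_interrupted() is meant to catch.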
Dan commented on the above and I changed it slightly by moving the comment: > 250 // Return to VM before post-check of interrupt state > 251 } > 252 if (jt->is_interrupted(true)) { > 253 ret = M_INTERRUPTED; > 254 } > 412 if (self->is_Java_thread()) { > 413 JavaThread* jt = (JavaThread*)self; > 414 jt->set_suspend_equivalent(); > 415 for (;;) { > 416 if (!jt->handle_special_suspend_equivalent_condition()) { > 417 break; > 418 } else { > 419 // We've been suspended whilst waiting and so we have to > 420 // relinquish the raw monitor until we are resumed. Of course > 421 // after reacquiring we have to re-check for suspension again. > 422 // Suspension requires we are _thread_blocked, and we also have to > 423 // recheck for being interrupted. > 424 simple_exit(jt); > 425 { > 426 ThreadInVMfromNative tivm(jt); > 427 { > 428 ThreadBlockInVM tbivm(jt); > 429 jt->java_suspend_self(); > 430 } > 431 if (jt->is_interrupted(true)) { > 432 ret = M_INTERRUPTED; > 433 } > 434 } > 435 simple_enter(jt); > 436 jt->set_suspend_equivalent(); > 437 } > ... > > This code can be simplified a little bit. > The line: > > 414 jt->set_suspend_equivalent(); > > can be placed before line 416. > Then this line can be removed: > > 436 jt->set_suspend_equivalent(); Yes you're right. I was trying to preserve the original loop structure, but then had to add the additional set_suspend_equivalent for the first iteration. But I can instead just move the existing one to the top of the loop. Webrev updated in place. Thanks, David ----- > > Thanks, > Serguei > > > On 11/11/19 20:52, David Holmes wrote: >> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8233549 >> >> In JDK-8229516 I moved the interrupted state of a thread from the >> osThread in the VM to the java.lang.Thread instance. 
In doing that I >> overlooked a critical aspect, which is that to access the field of a >> Java object the JavaThread must not be in a safepoint-safe state** - >> otherwise the oop, and anything referenced there from could be >> relocated by the GC whilst the JavaThread is accessing it. This >> manifested in a number of tests using JVM TI Agent threads and JVM TI >> RawMonitors because the JavaThread's were marked _thread_blocked and >> hence safepoint-safe, and we read a non-zero value for the interrupted >> field even though we had never been interrupted. >> >> This problem existed in all the code that checks for interruption when >> "waiting": >> >> - Parker::park (the code underpinning >> java.util.concurrent.LockSupport.park()) >> >> To fix this code I simply deleted a late check of the interrupted >> field. The check was not needed because if an interrupt has occurred >> then we will find the ParkEvent in a signalled state. >> >> - ObjectMonitor::wait >> >> Here the late check of the interrupted state is essential as we reset >> the ParkEvent after an earlier check of the interrupted state. But the >> fix was simply achieved by moving the check slightly earlier before we >> use ThreadBlockInVm to become _thread_blocked. >> >> - RawMonitor::wait >> >> This fix was much more involved. The RawMonitor code directly >> transitions the JavaThread from _thread_in_Native to _thread_blocked. >> This is safe from a safepoint perspective because they are equivalent >> safepoint-safe states. To allow access to the interrupted field I have >> to transition from native to _thread_in_vm, and that has to be done by >> proper thread-state transitions to ensure correct access to the oop >> and its fields. Having done that I can then use ThreadBlockInVM for >> the transitions to blocked. 
However, as the old code noted it can't >> use proper thread-state transitions as this will lead to deadlocks >> with the VMThread that can also use RawMonitors when executing various >> event callbacks. To deal with that we have to note that the real >> constraint is that the JavaThread cannot block at a safepoint whilst >> it holds the RawMonitor. Hence the fix was push all the interrupt >> checking code and the thread-state transitions to the lowest level of >> RawMonitorWait, around the final park() call, after we have enqueued >> the waiter and released the monitor. That avoids any deadlock >> possibility. >> >> I also added checks to is_interrupted/interrupted to ensure they are >> only called by a thread in a suitable state. This should only be the >> VMThread (as a consequence of the Thread.stop implementation occurring >> at a safepoint and issuing a JavaThread::interrupt() call to unblock >> the target); or a JavaThread that is not _thread_in_native or >> _thread_blocked. >> >> Testing: (still finalizing) >> ?- tiers 1 - 6 (Oracle platforms) >> ?- Local Linux testing >> ? - vmTestbase/nsk/monitoring/ >> ? - vmTestbase/nsk/jdwp >> ? - vmTestbase/nsk/jdb/ >> ? - vmTestbase/nsk/jdi/ >> ? - vmTestbase/nsk/jvmti/ >> ? - serviceability/jvmti/ >> ? - serviceability/jdwp >> ? - JDK: java/lang/management >> ???????? com/sun/management >> >> ** Note that this applies to all accesses we make via code in >> javaClasses.*. For this particular code I thought about adding a guard >> in JavaThread::threadObj() but it turns out when we generate a crash >> report we access the Thread's name() field and that can happen when in >> any state, so we'd always trigger a secondary assertion failure during >> error reporting if we did that. 
Note that accessing name() can still >> easily lead to secondary assertions failures as I discovered when >> trying to debug this and print the thread name out - I would see an >> is_instance assertion fail checking that the Thread name() is an >> instance of java.lang.String! >> >> Thanks, >> David >> ----- > From daniel.daugherty at oracle.com Thu Nov 14 22:33:34 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 14 Nov 2019 17:33:34 -0500 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> References: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> Message-ID: > Webrev updated in place. Thumbs up. Dan On 11/14/19 5:21 PM, David Holmes wrote: > Hi Serguei, > > Thanks for taking a look. > > On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> It looks good to me. >> A couple of nits below. >> >> http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html >> >> >> 236 if (self->is_Java_thread()) { >> 237 JavaThread* jt = (JavaThread*) self; >> 238 // Transition to VM so we can check interrupt state >> 239 ThreadInVMfromNative tivm(jt); >> 240 if (jt->is_interrupted(true)) { >> 241 ret = M_INTERRUPTED; >> 242 } else { >> 243 ThreadBlockInVM tbivm(jt); >> 244 jt->set_suspend_equivalent(); >> 245 if (millis <= 0) { >> 246 self->_ParkEvent->park(); >> 247 } else { >> 248 self->_ParkEvent->park(millis); >> 249 } >> 250 } >> 251 // Return to VM before post-check of interrupt state >> 252 if (jt->is_interrupted(true)) { >> 253 ret = M_INTERRUPTED; >> 254 } >> 255 } else { >> >> >> It seems, the fragment at lines 251-254 needs to bebefore the line 250. >> It will add more clarity to this code. 
> > No, it has to be after line 250 as that is when we will hit the TBIVM > destructor and so return to _thread_in_vm which is the state needed to > read the interrupted field. Dan commented on the above and I changed > it slightly by moving the comment: > > > 250?? // Return to VM before post-check of interrupt state > > 251 } > > 252 if (jt->is_interrupted(true)) { > > 253?? ret = M_INTERRUPTED; > > 254 } > > >> ? 412?? if (self->is_Java_thread()) { >> 413 JavaThread* jt = (JavaThread*)self; >> 414 jt->set_suspend_equivalent(); >> ? 415???? for (;;) { >> ? 416?????? if (!jt->handle_special_suspend_equivalent_condition()) { >> ? 417???????? break; >> 418 } else { >> 419 // We've been suspended whilst waiting and so we have to >> 420 // relinquish the raw monitor until we are resumed. Of course >> 421 // after reacquiring we have to re-check for suspension again. >> 422 // Suspension requires we are _thread_blocked, and we also have to >> 423 // recheck for being interrupted. >> ? 424???????? simple_exit(jt); >> 425 { >> 426 ThreadInVMfromNative tivm(jt); >> 427 { >> 428 ThreadBlockInVM tbivm(jt); >> ? 429???????????? jt->java_suspend_self(); >> 430 } >> 431 if (jt->is_interrupted(true)) { >> 432 ret = M_INTERRUPTED; >> 433 } >> 434 } >> ? 435???????? simple_enter(jt); >> ? 436???????? jt->set_suspend_equivalent(); >> ? 437?????? } >> ? ... >> >> This code can be simplified a little bit. >> The line: >> >> 414 jt->set_suspend_equivalent(); >> >> can be placed before line 416. >> Then this line can be removed: >> >> ? 436???????? jt->set_suspend_equivalent(); > > Yes you're right. I was trying to preserve the original loop > structure, but then had to add the additional set_suspend_equivalent > for the first iteration. But I can instead just move the existing one > to the top of the loop. > > Webrev updated in place. 
> > Thanks, > David > ----- > >> >> Thanks, >> Serguei >> >> >> On 11/11/19 20:52, David Holmes wrote: >>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549 >>> >>> In JDK-8229516 I moved the interrupted state of a thread from the >>> osThread in the VM to the java.lang.Thread instance. In doing that I >>> overlooked a critical aspect, which is that to access the field of a >>> Java object the JavaThread must not be in a safepoint-safe state** - >>> otherwise the oop, and anything referenced there from could be >>> relocated by the GC whilst the JavaThread is accessing it. This >>> manifested in a number of tests using JVM TI Agent threads and JVM >>> TI RawMonitors because the JavaThread's were marked _thread_blocked >>> and hence safepoint-safe, and we read a non-zero value for the >>> interrupted field even though we had never been interrupted. >>> >>> This problem existed in all the code that checks for interruption >>> when "waiting": >>> >>> - Parker::park (the code underpinning >>> java.util.concurrent.LockSupport.park()) >>> >>> To fix this code I simply deleted a late check of the interrupted >>> field. The check was not needed because if an interrupt has occurred >>> then we will find the ParkEvent in a signalled state. >>> >>> - ObjectMonitor::wait >>> >>> Here the late check of the interrupted state is essential as we >>> reset the ParkEvent after an earlier check of the interrupted state. >>> But the fix was simply achieved by moving the check slightly earlier >>> before we use ThreadBlockInVm to become _thread_blocked. >>> >>> - RawMonitor::wait >>> >>> This fix was much more involved. The RawMonitor code directly >>> transitions the JavaThread from _thread_in_Native to >>> _thread_blocked. This is safe from a safepoint perspective because >>> they are equivalent safepoint-safe states. 
To allow access to the >>> interrupted field I have to transition from native to _thread_in_vm, >>> and that has to be done by proper thread-state transitions to ensure >>> correct access to the oop and its fields. Having done that I can >>> then use ThreadBlockInVM for the transitions to blocked. However, as >>> the old code noted it can't use proper thread-state transitions as >>> this will lead to deadlocks with the VMThread that can also use >>> RawMonitors when executing various event callbacks. To deal with >>> that we have to note that the real constraint is that the JavaThread >>> cannot block at a safepoint whilst it holds the RawMonitor. Hence >>> the fix was push all the interrupt checking code and the >>> thread-state transitions to the lowest level of RawMonitorWait, >>> around the final park() call, after we have enqueued the waiter and >>> released the monitor. That avoids any deadlock possibility. >>> >>> I also added checks to is_interrupted/interrupted to ensure they are >>> only called by a thread in a suitable state. This should only be the >>> VMThread (as a consequence of the Thread.stop implementation >>> occurring at a safepoint and issuing a JavaThread::interrupt() call >>> to unblock the target); or a JavaThread that is not >>> _thread_in_native or _thread_blocked. >>> >>> Testing: (still finalizing) >>> ?- tiers 1 - 6 (Oracle platforms) >>> ?- Local Linux testing >>> ? - vmTestbase/nsk/monitoring/ >>> ? - vmTestbase/nsk/jdwp >>> ? - vmTestbase/nsk/jdb/ >>> ? - vmTestbase/nsk/jdi/ >>> ? - vmTestbase/nsk/jvmti/ >>> ? - serviceability/jvmti/ >>> ? - serviceability/jdwp >>> ? - JDK: java/lang/management >>> ???????? com/sun/management >>> >>> ** Note that this applies to all accesses we make via code in >>> javaClasses.*. 
For this particular code I thought about adding a >>> guard in JavaThread::threadObj() but it turns out when we generate a >>> crash report we access the Thread's name() field and that can happen >>> when in any state, so we'd always trigger a secondary assertion >>> failure during error reporting if we did that. Note that accessing >>> name() can still easily lead to secondary assertions failures as I >>> discovered when trying to debug this and print the thread name out - >>> I would see an is_instance assertion fail checking that the Thread >>> name() is an instance of java.lang.String! >>> >>> Thanks, >>> David >>> ----- >> From david.holmes at oracle.com Thu Nov 14 22:40:18 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Nov 2019 08:40:18 +1000 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: References: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> Message-ID: <37120bb3-a7e4-34c4-b0a0-ae1042045214@oracle.com> Thanks Dan! David On 15/11/2019 8:33 am, Daniel D. Daugherty wrote: >> Webrev updated in place. > > Thumbs up. > > Dan > > > > On 11/14/19 5:21 PM, David Holmes wrote: >> Hi Serguei, >> >> Thanks for taking a look. >> >> On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> It looks good to me. >>> A couple of nits below. 
>>> >>> http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html >>> >>> >>> 236 if (self->is_Java_thread()) { >>> 237 JavaThread* jt = (JavaThread*) self; >>> 238 // Transition to VM so we can check interrupt state >>> 239 ThreadInVMfromNative tivm(jt); >>> 240 if (jt->is_interrupted(true)) { >>> 241 ret = M_INTERRUPTED; >>> 242 } else { >>> 243 ThreadBlockInVM tbivm(jt); >>> 244 jt->set_suspend_equivalent(); >>> 245 if (millis <= 0) { >>> 246 self->_ParkEvent->park(); >>> 247 } else { >>> 248 self->_ParkEvent->park(millis); >>> 249 } >>> 250 } >>> 251 // Return to VM before post-check of interrupt state >>> 252 if (jt->is_interrupted(true)) { >>> 253 ret = M_INTERRUPTED; >>> 254 } >>> 255 } else { >>> >>> >>> It seems, the fragment at lines 251-254 needs to bebefore the line 250. >>> It will add more clarity to this code. >> >> No, it has to be after line 250 as that is when we will hit the TBIVM >> destructor and so return to _thread_in_vm which is the state needed to >> read the interrupted field. Dan commented on the above and I changed >> it slightly by moving the comment: >> >> > 250?? // Return to VM before post-check of interrupt state >> > 251 } >> > 252 if (jt->is_interrupted(true)) { >> > 253?? ret = M_INTERRUPTED; >> > 254 } >> >> >>> ? 412?? if (self->is_Java_thread()) { >>> 413 JavaThread* jt = (JavaThread*)self; >>> 414 jt->set_suspend_equivalent(); >>> ? 415???? for (;;) { >>> ? 416?????? if (!jt->handle_special_suspend_equivalent_condition()) { >>> ? 417???????? break; >>> 418 } else { >>> 419 // We've been suspended whilst waiting and so we have to >>> 420 // relinquish the raw monitor until we are resumed. Of course >>> 421 // after reacquiring we have to re-check for suspension again. >>> 422 // Suspension requires we are _thread_blocked, and we also have to >>> 423 // recheck for being interrupted. >>> ? 424???????? 
simple_exit(jt); >>> 425 { >>> 426 ThreadInVMfromNative tivm(jt); >>> 427 { >>> 428 ThreadBlockInVM tbivm(jt); >>> ? 429???????????? jt->java_suspend_self(); >>> 430 } >>> 431 if (jt->is_interrupted(true)) { >>> 432 ret = M_INTERRUPTED; >>> 433 } >>> 434 } >>> ? 435???????? simple_enter(jt); >>> ? 436???????? jt->set_suspend_equivalent(); >>> ? 437?????? } >>> ? ... >>> >>> This code can be simplified a little bit. >>> The line: >>> >>> 414 jt->set_suspend_equivalent(); >>> >>> can be placed before line 416. >>> Then this line can be removed: >>> >>> ? 436???????? jt->set_suspend_equivalent(); >> >> Yes you're right. I was trying to preserve the original loop >> structure, but then had to add the additional set_suspend_equivalent >> for the first iteration. But I can instead just move the existing one >> to the top of the loop. >> >> Webrev updated in place. >> >> Thanks, >> David >> ----- >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/11/19 20:52, David Holmes wrote: >>>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549 >>>> >>>> In JDK-8229516 I moved the interrupted state of a thread from the >>>> osThread in the VM to the java.lang.Thread instance. In doing that I >>>> overlooked a critical aspect, which is that to access the field of a >>>> Java object the JavaThread must not be in a safepoint-safe state** - >>>> otherwise the oop, and anything referenced there from could be >>>> relocated by the GC whilst the JavaThread is accessing it. This >>>> manifested in a number of tests using JVM TI Agent threads and JVM >>>> TI RawMonitors because the JavaThread's were marked _thread_blocked >>>> and hence safepoint-safe, and we read a non-zero value for the >>>> interrupted field even though we had never been interrupted. 
>>>> >>>> This problem existed in all the code that checks for interruption >>>> when "waiting": >>>> >>>> - Parker::park (the code underpinning >>>> java.util.concurrent.LockSupport.park()) >>>> >>>> To fix this code I simply deleted a late check of the interrupted >>>> field. The check was not needed because if an interrupt has occurred >>>> then we will find the ParkEvent in a signalled state. >>>> >>>> - ObjectMonitor::wait >>>> >>>> Here the late check of the interrupted state is essential as we >>>> reset the ParkEvent after an earlier check of the interrupted state. >>>> But the fix was simply achieved by moving the check slightly earlier >>>> before we use ThreadBlockInVm to become _thread_blocked. >>>> >>>> - RawMonitor::wait >>>> >>>> This fix was much more involved. The RawMonitor code directly >>>> transitions the JavaThread from _thread_in_Native to >>>> _thread_blocked. This is safe from a safepoint perspective because >>>> they are equivalent safepoint-safe states. To allow access to the >>>> interrupted field I have to transition from native to _thread_in_vm, >>>> and that has to be done by proper thread-state transitions to ensure >>>> correct access to the oop and its fields. Having done that I can >>>> then use ThreadBlockInVM for the transitions to blocked. However, as >>>> the old code noted it can't use proper thread-state transitions as >>>> this will lead to deadlocks with the VMThread that can also use >>>> RawMonitors when executing various event callbacks. To deal with >>>> that we have to note that the real constraint is that the JavaThread >>>> cannot block at a safepoint whilst it holds the RawMonitor. Hence >>>> the fix was push all the interrupt checking code and the >>>> thread-state transitions to the lowest level of RawMonitorWait, >>>> around the final park() call, after we have enqueued the waiter and >>>> released the monitor. That avoids any deadlock possibility. 
>>>> >>>> I also added checks to is_interrupted/interrupted to ensure they are >>>> only called by a thread in a suitable state. This should only be the >>>> VMThread (as a consequence of the Thread.stop implementation >>>> occurring at a safepoint and issuing a JavaThread::interrupt() call >>>> to unblock the target); or a JavaThread that is not >>>> _thread_in_native or _thread_blocked. >>>> >>>> Testing: (still finalizing) >>>> ?- tiers 1 - 6 (Oracle platforms) >>>> ?- Local Linux testing >>>> ? - vmTestbase/nsk/monitoring/ >>>> ? - vmTestbase/nsk/jdwp >>>> ? - vmTestbase/nsk/jdb/ >>>> ? - vmTestbase/nsk/jdi/ >>>> ? - vmTestbase/nsk/jvmti/ >>>> ? - serviceability/jvmti/ >>>> ? - serviceability/jdwp >>>> ? - JDK: java/lang/management >>>> ???????? com/sun/management >>>> >>>> ** Note that this applies to all accesses we make via code in >>>> javaClasses.*. For this particular code I thought about adding a >>>> guard in JavaThread::threadObj() but it turns out when we generate a >>>> crash report we access the Thread's name() field and that can happen >>>> when in any state, so we'd always trigger a secondary assertion >>>> failure during error reporting if we did that. Note that accessing >>>> name() can still easily lead to secondary assertions failures as I >>>> discovered when trying to debug this and print the thread name out - >>>> I would see an is_instance assertion fail checking that the Thread >>>> name() is an instance of java.lang.String! >>>> >>>> Thanks, >>>> David >>>> ----- >>> > From coleen.phillimore at oracle.com Fri Nov 15 01:15:05 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 14 Nov 2019 20:15:05 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load Message-ID: Summary: Don't post information which uses metadata from unloaded nmethods Tested tier1-3 and 100 times with test that failed (reproduced failure without the fix). 
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8173361 Thanks, Coleen From coleen.phillimore at oracle.com Fri Nov 15 01:17:07 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 14 Nov 2019 20:17:07 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: Message-ID: Please include my email in your replies. Thanks, Coleen On 11/14/19 8:15 PM, coleen.phillimore at oracle.com wrote: > Summary: Don't post information which uses metadata from unloaded > nmethods > > Tested tier1-3 and 100 times with test that failed (reproduced failure > without the fix). > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8173361 > > Thanks, > Coleen From david.holmes at oracle.com Fri Nov 15 01:34:13 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Nov 2019 11:34:13 +1000 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: Message-ID: <2c881c25-0fc2-d9f9-a6c8-1227c74bc004@oracle.com> Hi Coleen, On 15/11/2019 11:15 am, coleen.phillimore at oracle.com wrote: > Summary: Don't post information which uses metadata from unloaded nmethods > > Tested tier1-3 and 100 times with test that failed (reproduced failure > without the fix). > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8173361 Fix looks fine. Do we want assert "nm->method() != NULL" somewhere? 
Thanks, David > Thanks, > Coleen From chris.plummer at oracle.com Fri Nov 15 02:07:15 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 14 Nov 2019 18:07:15 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: Message-ID: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> Hi Coleen, Is it ok to end up missing some CompiledMethodLoad events? The spec says: "Sent when a method is compiled and loaded into memory by the VM. If it is unloaded, the CompiledMethodUnload event is sent. If it is moved, the CompiledMethodUnload event is sent, followed by a new CompiledMethodLoad event. Note that a single method may have multiple compiled forms, and that this event will be sent for each form. " So a method was still "compiled and loaded into memory", right? We just didn't get the event out before it was too late. Is the CompiledMethodUnload still sent? thanks, Chris On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: > Summary: Don't post information which uses metadata from unloaded > nmethods > > Tested tier1-3 and 100 times with test that failed (reproduced failure > without the fix). > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8173361 > > Thanks, > Coleen From serguei.spitsyn at oracle.com Fri Nov 15 02:14:03 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 14 Nov 2019 18:14:03 -0800 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> References: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> Message-ID: Hi David, Thank you for the update! It looks good to me. You are right about my first suggestion. 
The lines need to stay where they are, or additional curly brackets are needed to force the ThreadBlockInVM destructor earlier. Thanks, Serguei On 11/14/19 2:21 PM, David Holmes wrote: > Hi Serguei, > > Thanks for taking a look. > > On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> It looks good to me. >> A couple of nits below. >> >> http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html
>>
>> 236 if (self->is_Java_thread()) {
>> 237   JavaThread* jt = (JavaThread*) self;
>> 238   // Transition to VM so we can check interrupt state
>> 239   ThreadInVMfromNative tivm(jt);
>> 240   if (jt->is_interrupted(true)) {
>> 241     ret = M_INTERRUPTED;
>> 242   } else {
>> 243     ThreadBlockInVM tbivm(jt);
>> 244     jt->set_suspend_equivalent();
>> 245     if (millis <= 0) {
>> 246       self->_ParkEvent->park();
>> 247     } else {
>> 248       self->_ParkEvent->park(millis);
>> 249     }
>> 250   }
>> 251   // Return to VM before post-check of interrupt state
>> 252   if (jt->is_interrupted(true)) {
>> 253     ret = M_INTERRUPTED;
>> 254   }
>> 255 } else {
>>
>> It seems the fragment at lines 251-254 needs to be before the line 250. >> It will add more clarity to this code. > > No, it has to be after line 250 as that is when we will hit the TBIVM > destructor and so return to _thread_in_vm which is the state needed to > read the interrupted field. Dan commented on the above and I changed > it slightly by moving the comment:
>
> > 250   // Return to VM before post-check of interrupt state
> > 251 }
> > 252 if (jt->is_interrupted(true)) {
> > 253   ret = M_INTERRUPTED;
> > 254 }
>
>> 412 if (self->is_Java_thread()) {
>> 413   JavaThread* jt = (JavaThread*)self;
>> 414   jt->set_suspend_equivalent();
>> 415   for (;;) {
>> 416     if (!jt->handle_special_suspend_equivalent_condition()) {
>> 417
break; >> 418 } else { >> 419 // We've been suspended whilst waiting and so we have to >> 420 // relinquish the raw monitor until we are resumed. Of course >> 421 // after reacquiring we have to re-check for suspension again. >> 422 // Suspension requires we are _thread_blocked, and we also have to >> 423 // recheck for being interrupted. >> ? 424???????? simple_exit(jt); >> 425 { >> 426 ThreadInVMfromNative tivm(jt); >> 427 { >> 428 ThreadBlockInVM tbivm(jt); >> ? 429???????????? jt->java_suspend_self(); >> 430 } >> 431 if (jt->is_interrupted(true)) { >> 432 ret = M_INTERRUPTED; >> 433 } >> 434 } >> ? 435???????? simple_enter(jt); >> ? 436???????? jt->set_suspend_equivalent(); >> ? 437?????? } >> ? ... >> >> This code can be simplified a little bit. >> The line: >> >> 414 jt->set_suspend_equivalent(); >> >> can be placed before line 416. >> Then this line can be removed: >> >> ? 436???????? jt->set_suspend_equivalent(); > > Yes you're right. I was trying to preserve the original loop > structure, but then had to add the additional set_suspend_equivalent > for the first iteration. But I can instead just move the existing one > to the top of the loop. > > Webrev updated in place. > > Thanks, > David > ----- > >> >> Thanks, >> Serguei >> >> >> On 11/11/19 20:52, David Holmes wrote: >>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549 >>> >>> In JDK-8229516 I moved the interrupted state of a thread from the >>> osThread in the VM to the java.lang.Thread instance. In doing that I >>> overlooked a critical aspect, which is that to access the field of a >>> Java object the JavaThread must not be in a safepoint-safe state** - >>> otherwise the oop, and anything referenced there from could be >>> relocated by the GC whilst the JavaThread is accessing it. 
This >>> manifested in a number of tests using JVM TI Agent threads and JVM >>> TI RawMonitors because the JavaThread's were marked _thread_blocked >>> and hence safepoint-safe, and we read a non-zero value for the >>> interrupted field even though we had never been interrupted. >>> >>> This problem existed in all the code that checks for interruption >>> when "waiting": >>> >>> - Parker::park (the code underpinning >>> java.util.concurrent.LockSupport.park()) >>> >>> To fix this code I simply deleted a late check of the interrupted >>> field. The check was not needed because if an interrupt has occurred >>> then we will find the ParkEvent in a signalled state. >>> >>> - ObjectMonitor::wait >>> >>> Here the late check of the interrupted state is essential as we >>> reset the ParkEvent after an earlier check of the interrupted state. >>> But the fix was simply achieved by moving the check slightly earlier >>> before we use ThreadBlockInVm to become _thread_blocked. >>> >>> - RawMonitor::wait >>> >>> This fix was much more involved. The RawMonitor code directly >>> transitions the JavaThread from _thread_in_Native to >>> _thread_blocked. This is safe from a safepoint perspective because >>> they are equivalent safepoint-safe states. To allow access to the >>> interrupted field I have to transition from native to _thread_in_vm, >>> and that has to be done by proper thread-state transitions to ensure >>> correct access to the oop and its fields. Having done that I can >>> then use ThreadBlockInVM for the transitions to blocked. However, as >>> the old code noted it can't use proper thread-state transitions as >>> this will lead to deadlocks with the VMThread that can also use >>> RawMonitors when executing various event callbacks. To deal with >>> that we have to note that the real constraint is that the JavaThread >>> cannot block at a safepoint whilst it holds the RawMonitor. 
Hence >>> the fix was push all the interrupt checking code and the >>> thread-state transitions to the lowest level of RawMonitorWait, >>> around the final park() call, after we have enqueued the waiter and >>> released the monitor. That avoids any deadlock possibility. >>> >>> I also added checks to is_interrupted/interrupted to ensure they are >>> only called by a thread in a suitable state. This should only be the >>> VMThread (as a consequence of the Thread.stop implementation >>> occurring at a safepoint and issuing a JavaThread::interrupt() call >>> to unblock the target); or a JavaThread that is not >>> _thread_in_native or _thread_blocked. >>> >>> Testing: (still finalizing) >>> ?- tiers 1 - 6 (Oracle platforms) >>> ?- Local Linux testing >>> ? - vmTestbase/nsk/monitoring/ >>> ? - vmTestbase/nsk/jdwp >>> ? - vmTestbase/nsk/jdb/ >>> ? - vmTestbase/nsk/jdi/ >>> ? - vmTestbase/nsk/jvmti/ >>> ? - serviceability/jvmti/ >>> ? - serviceability/jdwp >>> ? - JDK: java/lang/management >>> ???????? com/sun/management >>> >>> ** Note that this applies to all accesses we make via code in >>> javaClasses.*. For this particular code I thought about adding a >>> guard in JavaThread::threadObj() but it turns out when we generate a >>> crash report we access the Thread's name() field and that can happen >>> when in any state, so we'd always trigger a secondary assertion >>> failure during error reporting if we did that. Note that accessing >>> name() can still easily lead to secondary assertions failures as I >>> discovered when trying to debug this and print the thread name out - >>> I would see an is_instance assertion fail checking that the Thread >>> name() is an instance of java.lang.String! 
>>> >>> Thanks, >>> David >>> ----- >> From david.holmes at oracle.com Fri Nov 15 02:32:04 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 15 Nov 2019 12:32:04 +1000 Subject: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state In-Reply-To: References: <00254f6c-7532-a12d-9074-831bf3b69abd@oracle.com> <452c2f0f-9e7c-d8cc-c185-1f3349d0c566@oracle.com> Message-ID: Thanks again Serguei. David On 15/11/2019 12:14 pm, serguei.spitsyn at oracle.com wrote: > Hi David, > > Thank you for the update! > It looks good to me. > > You are right about my first suggestion. > The lines need to stay where they are, or additional curly brackets > are needed to force the ThreadBlockInVM destructor earlier. > > Thanks, > Serguei > > > On 11/14/19 2:21 PM, David Holmes wrote: >> Hi Serguei, >> >> Thanks for taking a look. >> >> On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> It looks good to me. >>> A couple of nits below. >>> >>> http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html >>> >>> >>> 236 if (self->is_Java_thread()) { >>> 237 JavaThread* jt = (JavaThread*) self; >>> 238 // Transition to VM so we can check interrupt state >>> 239 ThreadInVMfromNative tivm(jt); >>> 240 if (jt->is_interrupted(true)) { >>> 241 ret = M_INTERRUPTED; >>> 242 } else { >>> 243 ThreadBlockInVM tbivm(jt); >>> 244 jt->set_suspend_equivalent(); >>> 245 if (millis <= 0) { >>> 246 self->_ParkEvent->park(); >>> 247 } else { >>> 248 self->_ParkEvent->park(millis); >>> 249 } >>> 250 } >>> 251 // Return to VM before post-check of interrupt state >>> 252 if (jt->is_interrupted(true)) { >>> 253 ret = M_INTERRUPTED; >>> 254 } >>> 255 } else { >>> >>> >>> It seems, the fragment at lines 251-254 needs to bebefore the line 250. >>> It will add more clarity to this code. 
>> >> No, it has to be after line 250 as that is when we will hit the TBIVM >> destructor and so return to _thread_in_vm which is the state needed to >> read the interrupted field. Dan commented on the above and I changed >> it slightly by moving the comment: >> >> > 250?? // Return to VM before post-check of interrupt state >> > 251 } >> > 252 if (jt->is_interrupted(true)) { >> > 253?? ret = M_INTERRUPTED; >> > 254 } >> >> >>> ? 412?? if (self->is_Java_thread()) { >>> 413 JavaThread* jt = (JavaThread*)self; >>> 414 jt->set_suspend_equivalent(); >>> ? 415???? for (;;) { >>> ? 416?????? if (!jt->handle_special_suspend_equivalent_condition()) { >>> ? 417???????? break; >>> 418 } else { >>> 419 // We've been suspended whilst waiting and so we have to >>> 420 // relinquish the raw monitor until we are resumed. Of course >>> 421 // after reacquiring we have to re-check for suspension again. >>> 422 // Suspension requires we are _thread_blocked, and we also have to >>> 423 // recheck for being interrupted. >>> ? 424???????? simple_exit(jt); >>> 425 { >>> 426 ThreadInVMfromNative tivm(jt); >>> 427 { >>> 428 ThreadBlockInVM tbivm(jt); >>> ? 429???????????? jt->java_suspend_self(); >>> 430 } >>> 431 if (jt->is_interrupted(true)) { >>> 432 ret = M_INTERRUPTED; >>> 433 } >>> 434 } >>> ? 435???????? simple_enter(jt); >>> ? 436???????? jt->set_suspend_equivalent(); >>> ? 437?????? } >>> ? ... >>> >>> This code can be simplified a little bit. >>> The line: >>> >>> 414 jt->set_suspend_equivalent(); >>> >>> can be placed before line 416. >>> Then this line can be removed: >>> >>> ? 436???????? jt->set_suspend_equivalent(); >> >> Yes you're right. I was trying to preserve the original loop >> structure, but then had to add the additional set_suspend_equivalent >> for the first iteration. But I can instead just move the existing one >> to the top of the loop. >> >> Webrev updated in place. 
>> >> Thanks, >> David >> ----- >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/11/19 20:52, David Holmes wrote: >>>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549 >>>> >>>> In JDK-8229516 I moved the interrupted state of a thread from the >>>> osThread in the VM to the java.lang.Thread instance. In doing that I >>>> overlooked a critical aspect, which is that to access the field of a >>>> Java object the JavaThread must not be in a safepoint-safe state** - >>>> otherwise the oop, and anything referenced there from could be >>>> relocated by the GC whilst the JavaThread is accessing it. This >>>> manifested in a number of tests using JVM TI Agent threads and JVM >>>> TI RawMonitors because the JavaThread's were marked _thread_blocked >>>> and hence safepoint-safe, and we read a non-zero value for the >>>> interrupted field even though we had never been interrupted. >>>> >>>> This problem existed in all the code that checks for interruption >>>> when "waiting": >>>> >>>> - Parker::park (the code underpinning >>>> java.util.concurrent.LockSupport.park()) >>>> >>>> To fix this code I simply deleted a late check of the interrupted >>>> field. The check was not needed because if an interrupt has occurred >>>> then we will find the ParkEvent in a signalled state. >>>> >>>> - ObjectMonitor::wait >>>> >>>> Here the late check of the interrupted state is essential as we >>>> reset the ParkEvent after an earlier check of the interrupted state. >>>> But the fix was simply achieved by moving the check slightly earlier >>>> before we use ThreadBlockInVm to become _thread_blocked. >>>> >>>> - RawMonitor::wait >>>> >>>> This fix was much more involved. The RawMonitor code directly >>>> transitions the JavaThread from _thread_in_Native to >>>> _thread_blocked. This is safe from a safepoint perspective because >>>> they are equivalent safepoint-safe states. 
To allow access to the >>>> interrupted field I have to transition from native to _thread_in_vm, >>>> and that has to be done by proper thread-state transitions to ensure >>>> correct access to the oop and its fields. Having done that I can >>>> then use ThreadBlockInVM for the transitions to blocked. However, as >>>> the old code noted it can't use proper thread-state transitions as >>>> this will lead to deadlocks with the VMThread that can also use >>>> RawMonitors when executing various event callbacks. To deal with >>>> that we have to note that the real constraint is that the JavaThread >>>> cannot block at a safepoint whilst it holds the RawMonitor. Hence >>>> the fix was push all the interrupt checking code and the >>>> thread-state transitions to the lowest level of RawMonitorWait, >>>> around the final park() call, after we have enqueued the waiter and >>>> released the monitor. That avoids any deadlock possibility. >>>> >>>> I also added checks to is_interrupted/interrupted to ensure they are >>>> only called by a thread in a suitable state. This should only be the >>>> VMThread (as a consequence of the Thread.stop implementation >>>> occurring at a safepoint and issuing a JavaThread::interrupt() call >>>> to unblock the target); or a JavaThread that is not >>>> _thread_in_native or _thread_blocked. >>>> >>>> Testing: (still finalizing) >>>> ?- tiers 1 - 6 (Oracle platforms) >>>> ?- Local Linux testing >>>> ? - vmTestbase/nsk/monitoring/ >>>> ? - vmTestbase/nsk/jdwp >>>> ? - vmTestbase/nsk/jdb/ >>>> ? - vmTestbase/nsk/jdi/ >>>> ? - vmTestbase/nsk/jvmti/ >>>> ? - serviceability/jvmti/ >>>> ? - serviceability/jdwp >>>> ? - JDK: java/lang/management >>>> ???????? com/sun/management >>>> >>>> ** Note that this applies to all accesses we make via code in >>>> javaClasses.*. 
For this particular code I thought about adding a >>>> guard in JavaThread::threadObj() but it turns out when we generate a >>>> crash report we access the Thread's name() field and that can happen >>>> when in any state, so we'd always trigger a secondary assertion >>>> failure during error reporting if we did that. Note that accessing >>>> name() can still easily lead to secondary assertions failures as I >>>> discovered when trying to debug this and print the thread name out - >>>> I would see an is_instance assertion fail checking that the Thread >>>> name() is an instance of java.lang.String! >>>> >>>> Thanks, >>>> David >>>> ----- >>> > From coleen.phillimore at oracle.com Fri Nov 15 03:21:15 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 14 Nov 2019 22:21:15 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <2c881c25-0fc2-d9f9-a6c8-1227c74bc004@oracle.com> References: <2c881c25-0fc2-d9f9-a6c8-1227c74bc004@oracle.com> Message-ID: On 11/14/19 8:34 PM, David Holmes wrote: > Hi Coleen, > > On 15/11/2019 11:15 am, coleen.phillimore at oracle.com wrote: >> Summary: Don't post information which uses metadata from unloaded >> nmethods >> >> Tested tier1-3 and 100 times with test that failed (reproduced >> failure without the fix). >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 > > Fix looks fine. Do we want assert "nm->method() != NULL" somewhere? I could add an assert below the if, but the application will crash a few lines later if it's null. I'd also tested a version that had:
  if (nm->method() == NULL) {
    return;
  }
but thought is_alive was a more accurate test.
thanks, Coleen > > Thanks, > David > >> Thanks, >> Coleen From coleen.phillimore at oracle.com Fri Nov 15 12:39:41 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 15 Nov 2019 07:39:41 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> Message-ID: On 11/14/19 9:07 PM, Chris Plummer wrote: > Hi Coleen, > > Is it ok to end up missing some CompiledMethodLoad events? The spec says: > > "Sent when a method is compiled and loaded into memory by the VM. If > it is unloaded, the CompiledMethodUnload event is sent. If it is > moved, the CompiledMethodUnload event is sent, followed by a new > CompiledMethodLoad event. Note that a single method may have multiple > compiled forms, and that this event will be sent for each form." > > So a method was still "compiled and loaded into memory", right? We > just didn't get the event out before it was too late. Is the > CompiledMethodUnload still sent? Yes, the CompiledMethodUnload event would be sent for this. My first version of my change reported the event without the extra information (inlining and some code blob address location maps). Maybe that would be better. Here it is and tested successfully with the testcase that crashed. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev The more I look at this code, the more problems I see. get_and_cache_jmethod_id() will send back a jmethodID from when the method was live. But like the class unloading event, a user cannot trust that the Method* in the jmethodID points to anything valid. So the spec for CompiledMethodLoad event should say for the method, like the CompiledMethodUnload event: Compiled method being unloaded.
For identification of the compiled method only -- the class may be unloaded and therefore the method should not be used as an argument to further JNI or JVMTI functions. Or we don't send the event like 01. Either one doesn't crash. Thanks, Coleen > > thanks, > > Chris > > On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >> Summary: Don't post information which uses metadata from unloaded >> nmethods >> >> Tested tier1-3 and 100 times with test that failed (reproduced >> failure without the fix). >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Fri Nov 15 12:48:28 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 15 Nov 2019 07:48:28 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> Message-ID: I meant to add myself to the To list. On 11/15/19 7:39 AM, coleen.phillimore at oracle.com wrote: > > > On 11/14/19 9:07 PM, Chris Plummer wrote: >> Hi Coleen, >> >> Is it ok to end up missing some CompiledMethodLoad events? The spec >> says: >> >> "Sent when a method is compiled and loaded into memory by the VM. If >> it is unloaded, the CompiledMethodUnload event is sent. If it is >> moved, the CompiledMethodUnload event is sent, followed by a new >> CompiledMethodLoad event. Note that a single method may have multiple >> compiled forms, and that this event will be sent for each form." >> >> So a method was still "compiled and loaded into memory", right? We >> just didn't get the event out before it was too late. Is the >> CompiledMethodUnload still sent? > > Yes, the CompiledMethodUnload event would be sent for this.
> > My first version of my change reported the event without the extra > information (inlining and some code blob address location maps). Maybe > that would be better.?? Here it is and tested successfully with the > testcase that crashed. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev > > The more I look at this code, the more problems I see. > get_and_cache_jmethod_id() will send back a jmethodID from when the > method was live.? But like the class unloading event, a user cannot > trust that the Method* in the jmethodID points to anything valid. > > So the spec for CompiledMethodLoad event should say for the method, > like the CompiledMethodUnload event: > Compiled method being unloaded. For identification of the compiled > method only -- the class may be unloaded and therefore the method > should not be used as an argument to further JNI or JVMTI functions. > > Or we don't send the event like 01.? Either one doesn't crash. > > Thanks, > Coleen > >> >> thanks, >> >> Chris >> >> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Don't post information which uses metadata from unloaded >>> nmethods >>> >>> Tested tier1-3 and 100 times with test that failed (reproduced >>> failure without the fix). >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>> >>> Thanks, >>> Coleen >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Fri Nov 15 16:29:14 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 15 Nov 2019 11:29:14 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> Message-ID: <11b18836-e61a-af76-56b6-a57f8b87b5f6@oracle.com> On 11/15/19 7:48 AM, coleen.phillimore at oracle.com wrote: > I meant to add myself to the To list. > > On 11/15/19 7:39 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 11/14/19 9:07 PM, Chris Plummer wrote: >>> Hi Coleen, >>> >>> Is it ok to end up missing some CompiledMethodLoad events? The spec >>> says: >>> >>> "Sent when a method is compiled and loaded into memory by the VM. If >>> it is unloaded, the CompiledMethodUnload event is sent. If it is >>> moved, the CompiledMethodUnload event is sent, followed by a new >>> CompiledMethodLoad event. Note that a single method may have >>> multiple compiled forms, and that this event will be sent for each >>> form." >>> >>> So a method was still "compiled and loaded into memory", right? We >>> just didn't get the event out before it was too late. Is the >>> CompiledMethodUnload still sent? >> >> Yes, the CompiledMethodUnload event would be sent for this. >> >> My first version of my change reported the event without the extra >> information (inlining and some code blob address location maps). >> Maybe that would be better. Here it is and tested successfully with >> the testcase that crashed. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev
src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
    No comments.
src/hotspot/share/prims/jvmtiExport.cpp
    L2132:   // It's not safe to look at metadata for unloaded methods.
    L2133:   if (nm->method() == NULL) {
    L2134:     return NULL;
        nit - The comment is about why we're returning NULL early so perhaps the comment should be inside the if-statement (between L2133 and L2134).
        In the previous version (01) you checked (!nm->is_alive()) in a different place. That function is defined:
          bool is_alive() const { return _state < unloaded; }
        And states like unloaded are defined like this:
  enum { not_installed = -1, // in construction, only the owner doing the construction is
                             // allowed to advance state
         in_use        = 0,  // executable nmethod
         not_used      = 1,  // not entrant, but revivable
         not_entrant   = 2,  // marked for deoptimization but activations may still exist,
                             // will be transformed to zombie when all activations are gone
         unloaded      = 3,  // there should be no activations, should not be called, will be
                             // transformed to zombie by the sweeper, when not "locked in vm".
         zombie        = 4   // no activations exist, nmethod is ready for purge
  };
        so by switching to (nm->method() == NULL) you are going more directly to whether there is useful data available, but it might be racier. I'm not sure at which state nm->method() starts to return NULL. I'm also not sure if nm->method() can return NULL for any of the earlier states, e.g., not_used. I agree with your earlier comment that nm->is_alive() seems safer. Your call on which to use.
Thumbs up. My only non-theory comment is a nit so I don't need to see a new webrev. Dan >> >> The more I look at this code, the more problems I see. >> get_and_cache_jmethod_id() will send back a jmethodID from when the >> method was live. But like the class unloading event, a user cannot >> trust that the Method* in the jmethodID points to anything valid. >> >> So the spec for CompiledMethodLoad event should say for the method, >> like the CompiledMethodUnload event: >> Compiled method being unloaded.
For identification of the compiled >> method only -- the class may be unloaded and therefore the method >> should not be used as an argument to further JNI or JVMTI functions. >> >> Or we don't send the event like 01.? Either one doesn't crash. >> >> Thanks, >> Coleen >> >>> >>> thanks, >>> >>> Chris >>> >>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Don't post information which uses metadata from unloaded >>>> nmethods >>>> >>>> Tested tier1-3 and 100 times with test that failed (reproduced >>>> failure without the fix). >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>>> >>>> Thanks, >>>> Coleen >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Fri Nov 15 19:13:07 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 15 Nov 2019 11:13:07 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> Message-ID: <4849fd97-10f6-1d9c-435f-192a5ee7b72d@oracle.com> On 11/15/19 4:39 AM, coleen.phillimore at oracle.com wrote: > > > On 11/14/19 9:07 PM, Chris Plummer wrote: >> Hi Coleen, >> >> Is it ok to end up missing some CompiledMethodLoad events? The spec >> says: >> >> "Sent when a method is compiled and loaded into memory by the VM. If >> it is unloaded, the CompiledMethodUnload event is sent. If it is >> moved, the CompiledMethodUnload event is sent, followed by a new >> CompiledMethodLoad event. Note that a single method may have multiple >> compiled forms, and that this event will be sent for each form. " >> >> So a method was still "compiled and loaded into memory", right? We >> just didn't get the event out before it was too late. Is the >> CompiledMethodUnload still sent? > > Yes, the CompiledMethodUnload event would be sent for this. 
> > My first version of my change reported the event without the extra > information (inlining and some code blob address location maps). Maybe > that would be better.?? Here it is and tested successfully with the > testcase that crashed. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev > > The more I look at this code, the more problems I see. > get_and_cache_jmethod_id() will send back a jmethodID from when the > method was live.? But like the class unloading event, a user cannot > trust that the Method* in the jmethodID points to anything valid. > > So the spec for CompiledMethodLoad event should say for the method, > like the CompiledMethodUnload event: > Compiled method being unloaded. For identification of the compiled > method only -- the class may be unloaded and therefore the method > should not be used as an argument to further JNI or JVMTI functions. Yes, I agree that with this approach a spec clarification is needed. I just wonder about compatibility for agents that assume it is a valid jmethodID. I suppose if an agent treated it as valid and crashed as a result, this is just moving the crash from the jvmti impl to the agent. We also have to consider that agents might currently treat it as valid for functional purposes, and likely never run into this problem, but with the spec update technically that would mean that they would no longer be able to. However, what's likely is that any existing agent would just continue to be ignorant of this spec change, and continue to run with no issue. > > Or we don't send the event like 01.? Either one doesn't crash. Yeah, I guess I don't have a good answer here. Seems like both approaches have issues. Maybe the correct fix is to keep the nm live until the deferred event can be sent. 
thanks, Chris > > Thanks, > Coleen > >> >> thanks, >> >> Chris >> >> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Don't post information which uses metadata from unloaded >>> nmethods >>> >>> Tested tier1-3 and 100 times with test that failed (reproduced >>> failure without the fix). >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>> >>> Thanks, >>> Coleen >> > From serguei.spitsyn at oracle.com Fri Nov 15 21:45:34 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 15 Nov 2019 13:45:34 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: Message-ID: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> Hi Coleen, I have some questions. Both the compiled method load and unload are posted as deferred events. Both events keep the nmethod alive until the ServiceThread processes the event. The implementation is:
JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
  . . .
  // Keep the nmethod alive until the ServiceThread can process
  // this deferred event.
  nmethodLocker::lock_nmethod(nm);
  return event;
}
JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, jmethodID id, const void* code) {
  . . .
  // Keep the nmethod alive until the ServiceThread can process
  // this deferred event. This will keep the memory for the
  // generated code from being reused too early. We pass
  // zombie_ok == true here so that our nmethod that was just
  // made into a zombie can be locked.
  nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
  return event;
}
void JvmtiDeferredEvent::post() {
  assert(ServiceThread::is_service_thread(Thread::current()),
         "Service thread must post enqueued events");
  switch(_type) {
    case TYPE_COMPILED_METHOD_LOAD: {
nmethod* nm = _event_data.compiled_method_load; ????? JvmtiExport::post_compiled_method_load(nm); ????? // done with the deferred event so unlock the nmethod ????? nmethodLocker::unlock_nmethod(nm); ????? break; ??? } ??? case TYPE_COMPILED_METHOD_UNLOAD: { ????? nmethod* nm = _event_data.compiled_method_unload.nm; ????? JvmtiExport::post_compiled_method_unload( ??????? _event_data.compiled_method_unload.method_id, ??????? _event_data.compiled_method_unload.code_begin); ????? // done with the deferred event so unlock the nmethod ????? nmethodLocker::unlock_nmethod(nm); ????? break; ??? } ??? . . . ? } } Then I wonder how is it possible for the nmethod to be not alive here?: 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { . . . 2173 // It's not safe to look at metadata for unloaded methods. 2174 if (!nm->is_alive()) { 2175 return; 2176 } At least, it lokks like something else is broken. Do I miss something important here? Thanks, Serguei On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: > Summary: Don't post information which uses metadata from unloaded > nmethods > > Tested tier1-3 and 100 times with test that failed (reproduced failure > without the fix). > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8173361 > > Thanks, > Coleen -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Fri Nov 15 22:07:04 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 15 Nov 2019 14:07:04 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> Message-ID: <750c4e4e-9c04-700d-98be-073062809bf8@oracle.com> Some additional details on this... 
The side effect of a nmethodLocker::lock_nmethod(nm) call is that
nmethod::is_locked_by_vm() returns true.
It is checked in nmethod::flush(), nmethod::can_convert_to_zombie()
and some other places.
If it is still not safe to look at metadata when the nmethod is locked,
then there is some code which does not honor this locking convention.

Thanks,
Serguei

On 11/15/19 1:45 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> I have some questions.
>
> Both the compiler method load and unload are posted as deferred events.
> Both events keep the nmethod alive until the ServiceThread processes
> the event.
>
> The implementation is:
>
> JvmtiDeferredEvent
> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>   . . .
>   // Keep the nmethod alive until the ServiceThread can process
>   // this deferred event.
>   nmethodLocker::lock_nmethod(nm);
>   return event;
> }
>
> JvmtiDeferredEvent
> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
> jmethodID id, const void* code) {
>   . . .
>   // Keep the nmethod alive until the ServiceThread can process
>   // this deferred event. This will keep the memory for the
>   // generated code from being reused too early. We pass
>   // zombie_ok == true here so that our nmethod that was just
>   // made into a zombie can be locked.
>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>   return event;
> }
>
> void JvmtiDeferredEvent::post() {
>   assert(ServiceThread::is_service_thread(Thread::current()),
>          "Service thread must post enqueued events");
>   switch(_type) {
>     case TYPE_COMPILED_METHOD_LOAD: {
>       nmethod* nm = _event_data.compiled_method_load;
>       JvmtiExport::post_compiled_method_load(nm);
>       // done with the deferred event so unlock the nmethod
>       nmethodLocker::unlock_nmethod(nm);
>       break;
>     }
>     case TYPE_COMPILED_METHOD_UNLOAD: {
>       nmethod* nm = _event_data.compiled_method_unload.nm;
>       JvmtiExport::post_compiled_method_unload(
>         _event_data.compiled_method_unload.method_id,
>         _event_data.compiled_method_unload.code_begin);
>       // done with the deferred event so unlock the nmethod
>       nmethodLocker::unlock_nmethod(nm);
>       break;
>     }
>     . . .
>   }
> }
>
> Then I wonder how is it possible for the nmethod to be not alive here?:
> void JvmtiExport::post_compiled_method_load(nmethod *nm) {
> . . .
> // It's not safe to look at metadata for unloaded methods.
> if (!nm->is_alive()) {
> return;
> }
> At least, it looks like something else is broken.
> Do I miss something important here?
>
> Thanks,
> Serguei
>
>
> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Don't post information which uses metadata from unloaded
>> nmethods
>>
>> Tested tier1-3 and 100 times with test that failed (reproduced
>> failure without the fix).
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>
>> Thanks,
>> Coleen
>

From coleen.phillimore at oracle.com Fri Nov 15 22:12:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 15 Nov 2019 17:12:04 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
Message-ID: <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>

Hi, I've been working on answers to these questions, so I'll start with
this one.

The nmethodLocker keeps the nmethod from being reclaimed (made_zombie or
memory released) by the sweeper, but the nmethod could be unloaded.
Unloading the nmethod clears the Method* _method field.
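As an illustration only, here is a toy model of the distinction drawn above; it is not HotSpot code. It assumes, as described in this thread, that the lock blocks reclamation but does not stop unloading from clearing the method pointer. ToyNmethod, ToyMethod and post_load_event are hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <cstdio>

// Toy stand-ins; none of these are the real HotSpot classes.
struct ToyMethod {
  const char* name;
};

enum ToyState { toy_in_use = 0, toy_unloaded = 3, toy_zombie = 4 };

struct ToyNmethod {
  ToyMethod* method;      // plays the role of nmethod::_method
  int        lock_count;  // plays the role of the nmethodLocker count
  ToyState   state;

  explicit ToyNmethod(ToyMethod* m)
      : method(m), lock_count(0), state(toy_in_use) {}

  void lock()   { ++lock_count; }
  void unlock() { --lock_count; }

  // Unloading does not consult the lock: the metadata pointer is cleared.
  void make_unloaded() {
    method = nullptr;
    state = toy_unloaded;
  }

  // Reclaiming does consult the lock: it is refused while the lock is held.
  bool try_make_zombie() {
    if (lock_count > 0) {
      return false;
    }
    state = toy_zombie;
    return true;
  }
};

// A poster that needs the metadata has to check the pointer first;
// holding the lock alone does not guarantee the pointer is still set.
bool post_load_event(const ToyNmethod& nm) {
  if (nm.method == nullptr) {
    return false;  // nothing safe to report
  }
  std::printf("compiled method load: %s\n", nm.method->name);
  return true;
}
```

In this model a locked ToyNmethod cannot become a zombie, yet make_unloaded() still leaves it without a method, which is the situation the deferred event has to cope with.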
The post_compiled_method_load event needs the _method field to look at
things like inlining and ScopeDesc fields. If the nmethod is unloaded,
some of the oops are dead. There are "holder" oops that correspond to
the metadata in the nmethod. If these oops are dead, causing the nmethod
to get unloaded, then the metadata may not be valid.

So my change 02 looks for a NULL nmethod._method field to tell whether
we can post information about the nmethod.

There's code in nmethod.cpp like:

jmethodID nmethod::get_and_cache_jmethod_id() {
  if (_jmethod_id == NULL) {
    // Cache the jmethod_id since it can no longer be looked up once the
    // method itself has been marked for unloading.
    _jmethod_id = method()->jmethod_id();
  }
  return _jmethod_id;
}

Which was added when post_method_load and unload were turned into
deferred events.

I put more debugging in the bug to show this crash was from an unloaded
nmethod.

Coleen

On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> I have some questions.
>
> Both the compiler method load and unload are posted as deferred events.
> Both events keep the nmethod alive until the ServiceThread processes
> the event.
>
> The implementation is:
>
> JvmtiDeferredEvent
> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>   . . .
>   // Keep the nmethod alive until the ServiceThread can process
>   // this deferred event.
>   nmethodLocker::lock_nmethod(nm);
>   return event;
> }
>
> JvmtiDeferredEvent
> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
> jmethodID id, const void* code) {
>   . . .
>   // Keep the nmethod alive until the ServiceThread can process
>   // this deferred event. This will keep the memory for the
>   // generated code from being reused too early. We pass
>   // zombie_ok == true here so that our nmethod that was just
>   // made into a zombie can be locked.
>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
return event; > } > > void JvmtiDeferredEvent::post() { > ? assert(ServiceThread::is_service_thread(Thread::current()), > ???????? "Service thread must post enqueued events"); > ? switch(_type) { > ??? case TYPE_COMPILED_METHOD_LOAD: { > ????? nmethod* nm = _event_data.compiled_method_load; > ????? JvmtiExport::post_compiled_method_load(nm); > ????? // done with the deferred event so unlock the nmethod > ????? nmethodLocker::unlock_nmethod(nm); > ????? break; > ??? } > ??? case TYPE_COMPILED_METHOD_UNLOAD: { > ????? nmethod* nm = _event_data.compiled_method_unload.nm; > ????? JvmtiExport::post_compiled_method_unload( > ??????? _event_data.compiled_method_unload.method_id, > ??????? _event_data.compiled_method_unload.code_begin); > ????? // done with the deferred event so unlock the nmethod > ????? nmethodLocker::unlock_nmethod(nm); > ????? break; > ??? } > ??? . . . > ? } > } > > Then I wonder how is it possible for the nmethod to be not alive here?: > 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { > . . . > 2173 // It's not safe to look at metadata for unloaded methods. > 2174 if (!nm->is_alive()) { > 2175 return; > 2176 } > At least, it lokks like something else is broken. > Do I miss something important here? > > Thanks, > Serguei > > > On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >> Summary: Don't post information which uses metadata from unloaded >> nmethods >> >> Tested tier1-3 and 100 times with test that failed (reproduced >> failure without the fix). >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >> >> Thanks, >> Coleen > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From coleen.phillimore at oracle.com Fri Nov 15 22:21:17 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 15 Nov 2019 17:21:17 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <4849fd97-10f6-1d9c-435f-192a5ee7b72d@oracle.com> References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> <4849fd97-10f6-1d9c-435f-192a5ee7b72d@oracle.com> Message-ID: <7f26b79c-23c6-5262-816f-eb1a97233b17@oracle.com> On 11/15/19 2:13 PM, Chris Plummer wrote: > On 11/15/19 4:39 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 11/14/19 9:07 PM, Chris Plummer wrote: >>> Hi Coleen, >>> >>> Is it ok to end up missing some CompiledMethodLoad events? The spec >>> says: >>> >>> "Sent when a method is compiled and loaded into memory by the VM. If >>> it is unloaded, the CompiledMethodUnload event is sent. If it is >>> moved, the CompiledMethodUnload event is sent, followed by a new >>> CompiledMethodLoad event. Note that a single method may have >>> multiple compiled forms, and that this event will be sent for each >>> form. " >>> >>> So a method was still "compiled and loaded into memory", right? We >>> just didn't get the event out before it was too late. Is the >>> CompiledMethodUnload still sent? >> >> Yes, the CompiledMethodUnload event would be sent for this. >> >> My first version of my change reported the event without the extra >> information (inlining and some code blob address location maps). >> Maybe that would be better.?? Here it is and tested successfully with >> the testcase that crashed. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev >> >> The more I look at this code, the more problems I see. >> get_and_cache_jmethod_id() will send back a jmethodID from when the >> method was live.? But like the class unloading event, a user cannot >> trust that the Method* in the jmethodID points to anything valid. 
>>
>> So the spec for CompiledMethodLoad event should say for the method,
>> like the CompiledMethodUnload event:
>> Compiled method being unloaded. For identification of the compiled
>> method only -- the class may be unloaded and therefore the method
>> should not be used as an argument to further JNI or JVMTI functions.
> Yes, I agree that with this approach a spec clarification is needed. I
> just wonder about compatibility for agents that assume it is a valid
> jmethodID. I suppose if an agent treated it as valid and crashed as a
> result, this is just moving the crash from the jvmti impl to the
> agent. We also have to consider that agents might currently treat it
> as valid for functional purposes, and likely never run into this
> problem, but with the spec update technically that would mean that
> they would no longer be able to. However, what's likely is that any
> existing agent would just continue to be ignorant of this spec change,
> and continue to run with no issue.

I'm not sure I'm following this. Yes, if an agent tries to do something
with the jmethodID, like call it, in the case where it has been
unloaded, it will crash. The nmethod was unloaded because some oop in it
was no longer valid. It may be that the class is unloaded, in which case
the Method* in the jmethodID is a bad pointer. So that was a preexisting
problem.

The CompiledMethodLoad event saves the jmethodID from when the nmethod
was created, so it's good at this point. If the nmethod is unloaded
because some *other* oop in it is bad, the Method* would still be valid.
My change retrieves the one saved when the load event was created. A
crash would only happen if the nmethod is unloaded because the class
containing the Method* in _method is unloaded.

The specification should still be clarified to say this though.

>
>>
>> Or we don't send the event like 01. Either one doesn't crash.
> Yeah, I guess I don't have a good answer here.
> Seems like both approaches have issues. Maybe the correct fix is to
> keep the nm live until the deferred event can be sent.

This is not feasible, since the nmethod may have bad oops in it.

Coleen
>
> thanks,
>
> Chris
>>
>> Thanks,
>> Coleen
>>
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>> Summary: Don't post information which uses metadata from unloaded
>>>> nmethods
>>>>
>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>> failure without the fix).
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>
>

From coleen.phillimore at oracle.com Fri Nov 15 22:27:27 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 15 Nov 2019 17:27:27 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <11b18836-e61a-af76-56b6-a57f8b87b5f6@oracle.com>
References: <8b826d49-0a8a-7571-5404-9f5bd2954836@oracle.com> <11b18836-e61a-af76-56b6-a57f8b87b5f6@oracle.com>
Message-ID: <2c5f8637-84c5-5ecf-ca7f-ae242f320b06@oracle.com>

Thanks Dan for looking at this.

On 11/15/19 11:29 AM, Daniel D. Daugherty wrote:
> On 11/15/19 7:48 AM, coleen.phillimore at oracle.com wrote:
>> I meant to add myself to the To list.
>>
>> On 11/15/19 7:39 AM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 11/14/19 9:07 PM, Chris Plummer wrote:
>>>> Hi Coleen,
>>>>
>>>> Is it ok to end up missing some CompiledMethodLoad events? The spec
>>>> says:
>>>>
>>>> "Sent when a method is compiled and loaded into memory by the VM.
>>>> If it is unloaded, the CompiledMethodUnload event is sent. If it is
>>>> moved, the CompiledMethodUnload event is sent, followed by a new
>>>> CompiledMethodLoad event. Note that a single method may have
>>>> multiple compiled forms, and that this event will be sent for each
>>>> form. "
>>>>
>>>> So a method was still "compiled and loaded into memory", right? We
>>>> just didn't get the event out before it was too late. Is the
>>>> CompiledMethodUnload still sent?
>>>
>>> Yes, the CompiledMethodUnload event would be sent for this.
>>>
>>> My first version of my change reported the event without the extra
>>> information (inlining and some code blob address location maps).
>>> Maybe that would be better. Here it is and tested successfully
>>> with the testcase that crashed.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.02/webrev
>
> src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
>     No comments.
>
> src/hotspot/share/prims/jvmtiExport.cpp
>     L2132:   // It's not safe to look at metadata for unloaded methods.
>     L2133:   if (nm->method() == NULL) {
>     L2134:     return NULL;
>         nit - The comment is about why we're returning NULL early so
>         perhaps the comment should be inside the if-statement (between
>         L2133 and L2134).
>

Sure, I can change that.

>         In the previous version (01) you checked (!nm->is_alive()) in a
>         different place. That function is defined:
>
>           bool  is_alive() const { return _state < unloaded; }
>
>         And states like unloaded are defined like this:
>
>   enum { not_installed = -1, // in construction, only the owner doing the construction is
>                              // allowed to advance state
>          in_use        = 0,  // executable nmethod
>          not_used      = 1,  // not entrant, but revivable
>          not_entrant   = 2,  // marked for deoptimization but activations may still exist,
>                              // will be transformed to zombie when all activations are gone
>          unloaded      = 3,  // there should be no activations, should not be called, will be
>                              // transformed to zombie by the sweeper, when not "locked in vm".
>          zombie        = 4   // no activations exist, nmethod is ready for purge
>   };
>
>         so by switching to (nm->method() == NULL) you are going more
>         directly to whether there is useful data available, but it
>         might be racier. I'm not sure at which state nm->method()
>         starts to return NULL. I'm also not sure if nm->method() can
>         return NULL for any of the earlier states, e.g., not_used.
>
>         I agree with your earlier comment that nm->is_alive() seems
>         safer. Your call on which to use.

I think my earlier comment was wrong. If the method is unloaded, the
_method field is zeroed before the unloaded _state is set. With
concurrent class unloading, this event posting can run at the same time
as make_unloaded(), if I'm reading this correctly. I should actually use
Atomic::load to get the _method, in both of the places where I get the
field.

The _method is also zeroed by the make_zombie call, but that is blocked
by the nmethodLocker.

Thanks,
Coleen
>
> Thumbs up. My only non-theory comment is a nit so I don't need to
> see a new webrev.
>
> Dan
>
>
>>>
>>> The more I look at this code, the more problems I see.
>>> get_and_cache_jmethod_id() will send back a jmethodID from when the
>>> method was live. But like the class unloading event, a user cannot
>>> trust that the Method* in the jmethodID points to anything valid.
>>>
>>> So the spec for CompiledMethodLoad event should say for the method,
>>> like the CompiledMethodUnload event:
>>> Compiled method being unloaded. For identification of the compiled
>>> method only -- the class may be unloaded and therefore the method
>>> should not be used as an argument to further JNI or JVMTI functions.
>>>
>>> Or we don't send the event like 01. Either one doesn't crash.
>>>
>>> Thanks,
>>> Coleen
>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Don't post information which uses metadata from unloaded
>>>>> nmethods
>>>>>
>>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>>> failure without the fix).
>>>>>
>>>>> open webrev at
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
>

From alexey.menkov at oracle.com Fri Nov 15 23:58:33 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 15 Nov 2019 15:58:33 -0800
Subject: RFR(XS): JDK-8187143: JDI crash in ~BufferBlob::MethodHandles adapters
Message-ID: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com>

Hi all,

Please review a small fix for
https://bugs.openjdk.java.net/browse/JDK-8187143

The issue is not reproducible from JDK10 onward, but the test
NashornPopFrameTest now fails with a "MemberName required for
invokeVirtual etc" error, i.e. this is a dup of
https://bugs.openjdk.java.net/browse/JDK-8225620

The fix updates ProblemList.txt to point to 8225620.

The diff:
--- a/test/jdk/ProblemList.txt Fri Nov 15 14:22:24 2019 -0800
+++ b/test/jdk/ProblemList.txt Fri Nov 15 15:21:53 2019 -0800
@@ -845,7 +845,7 @@

 com/sun/jdi/RepStep.java 8043571 generic-all

-com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all
+com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all

 ############################################################################

--alex

From daniil.x.titov at oracle.com Sat Nov 16 00:12:35 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 15 Nov 2019 16:12:35 -0800
Subject: RFR(XS): JDK-8187143: JDI crash in ~BufferBlob::MethodHandles adapters
In-Reply-To: <54E425A3-35EA-48D5-9F7B-54DD1F213EFF@oracle.com>
References: <54E425A3-35EA-48D5-9F7B-54DD1F213EFF@oracle.com>
Message-ID: <2A962022-B945-4B4E-9263-5C36D46CBB77@oracle.com>

Hi Alex,

The change looks good to me.

Thanks!

Best regards,
Daniil

On 11/15/19, 3:58 PM, "serviceability-dev on behalf of Alex Menkov" wrote:

    Hi all,

    Please review small fix for
    https://bugs.openjdk.java.net/browse/JDK-8187143

    The issue is not reproducible from JDK10, but now test fails
    NashornPopFrameTest fails with "MemberName required for invokeVirtual
    etc" error, i.e.
this is a dup of https://bugs.openjdk.java.net/browse/JDK-8225620 The fix updates ProblemList.txt to point to 8225620 The diff: --- a/test/jdk/ProblemList.txt Fri Nov 15 14:22:24 2019 -0800 +++ b/test/jdk/ProblemList.txt Fri Nov 15 15:21:53 2019 -0800 @@ -845,7 +845,7 @@ com/sun/jdi/RepStep.java 8043571 generic-all -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all ############################################################################ --alex From chris.plummer at oracle.com Sat Nov 16 01:26:59 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 15 Nov 2019 17:26:59 -0800 Subject: RFR(XS): JDK-8187143: JDI crash in ~BufferBlob::MethodHandles adapters In-Reply-To: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> References: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> Message-ID: <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com> Looks good. Chris On 11/15/19 3:58 PM, Alex Menkov wrote: > Hi all, > > Please review small fix for > https://bugs.openjdk.java.net/browse/JDK-8187143 > > The issue is not reproducible from JDK10, but now test fails > NashornPopFrameTest fails with "MemberName required for invokeVirtual > etc" error, i.e. this is a dup of > https://bugs.openjdk.java.net/browse/JDK-8225620 > > The fix updates ProblemList.txt to point to 8225620 > > The diff: > --- a/test/jdk/ProblemList.txt??? Fri Nov 15 14:22:24 2019 -0800 > +++ b/test/jdk/ProblemList.txt??? 
Fri Nov 15 15:21:53 2019 -0800 > @@ -845,7 +845,7 @@ > > ?com/sun/jdi/RepStep.java 8043571 generic-all > > -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all > +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all > > > ############################################################################ > > > > --alex From chris.plummer at oracle.com Sat Nov 16 01:31:30 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 15 Nov 2019 17:31:30 -0800 Subject: RFR(XS): JDK-8187143: JDI crash in ~BufferBlob::MethodHandles adapters In-Reply-To: <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com> References: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com> Message-ID: Hi Alex, Actually I take that back. Removing from the problem list should either be done as part of a bug fix, for which there is none here, or done for a bug specifically for removing from the problem list. Since you are not actually fixing JDK-8187143 (it seems to no longer be a bug), I suggest you close it as CNR and file a new bug for the problem list update. thanks, Chris On 11/15/19 5:26 PM, Chris Plummer wrote: > Looks good. > > Chris > > On 11/15/19 3:58 PM, Alex Menkov wrote: >> Hi all, >> >> Please review small fix for >> https://bugs.openjdk.java.net/browse/JDK-8187143 >> >> The issue is not reproducible from JDK10, but now test fails >> NashornPopFrameTest fails with "MemberName required for invokeVirtual >> etc" error, i.e. this is a dup of >> https://bugs.openjdk.java.net/browse/JDK-8225620 >> >> The fix updates ProblemList.txt to point to 8225620 >> >> The diff: >> --- a/test/jdk/ProblemList.txt??? Fri Nov 15 14:22:24 2019 -0800 >> +++ b/test/jdk/ProblemList.txt??? 
Fri Nov 15 15:21:53 2019 -0800 >> @@ -845,7 +845,7 @@ >> >> ?com/sun/jdi/RepStep.java 8043571 generic-all >> >> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all >> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all >> >> >> ############################################################################ >> >> >> >> --alex > > From serguei.spitsyn at oracle.com Sat Nov 16 04:17:01 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 15 Nov 2019 20:17:01 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> Message-ID: <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> Hi Coleen, On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: > > Hi, I've been working on answers to these questions, so I'll start > with this one. > > The nmethodLocker keeps the nmethod from being reclaimed (made_zombie > or memory released) by the sweeper, but the nmethod could be > unloaded.? Unloading the nmethod clears the Method* _method field. Yes, I see it is done in the nmethod::make_unloaded(). > The post_compiled_method_load event needs the _method field to look at > things like inlining and ScopeDesc fields.?? If the nmethod is > unloaded, some of the oops are dead.? There are "holder" oops that > correspond to the metadata in the nmethod.? If these oops are dead, > causing the nmethod to get unloaded, then the metadata may not be valid. > > So my change 02 looks for a NULL nmethod._method field to tell whether > we can post information about the nmethod. > > There's code in nmethod.cpp like: > > jmethodID nmethod::get_and_cache_jmethod_id() { > ? if (_jmethod_id == NULL) { > ??? // Cache the jmethod_id since it can no longer be looked up once the > ??? // method itself has been marked for unloading. > ??? 
_jmethod_id = method()->jmethod_id();
>   }
>   return _jmethod_id;
> }
>
> Which was added when post_method_load and unload were turned into
> deferred events.

Could we cache the jmethodID in
JvmtiDeferredEvent::compiled_method_load_event, as we do in
JvmtiDeferredEvent::compiled_method_unload_event?
This would help to get rid of the dependency on nmethod::_method.
Do we depend on any other nmethod fields?

Thanks,
Serguei

> I put more debugging in the bug to show this crash was from an
> unloaded nmethod.
>
> Coleen
>
>
> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> I have some questions.
>>
>> Both the compiler method load and unload are posted as deferred events.
>> Both events keep the nmethod alive until the ServiceThread processes
>> the event.
>>
>> The implementation is:
>>
>> JvmtiDeferredEvent
>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>   . . .
>>   // Keep the nmethod alive until the ServiceThread can process
>>   // this deferred event.
>>   nmethodLocker::lock_nmethod(nm);
>>   return event;
>> }
>>
>> JvmtiDeferredEvent
>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
>> jmethodID id, const void* code) {
>>   . . .
>>   // Keep the nmethod alive until the ServiceThread can process
>>   // this deferred event. This will keep the memory for the
>>   // generated code from being reused too early. We pass
>>   // zombie_ok == true here so that our nmethod that was just
>>   // made into a zombie can be locked.
>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>   return event;
>> }
>>
>> void JvmtiDeferredEvent::post() {
>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>          "Service thread must post enqueued events");
>>   switch(_type) {
>>     case TYPE_COMPILED_METHOD_LOAD: {
>>       nmethod* nm = _event_data.compiled_method_load;
>>       JvmtiExport::post_compiled_method_load(nm);
// done with the deferred event so unlock the nmethod >> ????? nmethodLocker::unlock_nmethod(nm); >> ????? break; >> ??? } >> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >> ????? nmethod* nm = _event_data.compiled_method_unload.nm; >> ????? JvmtiExport::post_compiled_method_unload( >> ??????? _event_data.compiled_method_unload.method_id, >> ??????? _event_data.compiled_method_unload.code_begin); >> ????? // done with the deferred event so unlock the nmethod >> ????? nmethodLocker::unlock_nmethod(nm); >> ????? break; >> ??? } >> ??? . . . >> ? } >> } >> >> Then I wonder how is it possible for the nmethod to be not alive here?: >> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { >> . . . >> 2173 // It's not safe to look at metadata for unloaded methods. >> 2174 if (!nm->is_alive()) { >> 2175 return; >> 2176 } >> At least, it lokks like something else is broken. >> Do I miss something important here? >> >> Thanks, >> Serguei >> >> >> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Don't post information which uses metadata from unloaded >>> nmethods >>> >>> Tested tier1-3 and 100 times with test that failed (reproduced >>> failure without the fix). >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>> >>> Thanks, >>> Coleen >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From igor.ignatyev at oracle.com Sat Nov 16 07:47:20 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 15 Nov 2019 23:47:20 -0800 Subject: RFR(S) : 8233462 : serviceability/tmtools/jstat tests times out with -Xcomp Message-ID: <2B7FD259-FBC2-4850-B1B3-81D1669F150B@oracle.com> http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00 > 33 lines changed: 1 ins; 14 del; 18 mod; Hi all, could you please review this small fix for tmtools testlibrary? 
The tmtools tests are believed to fail due to a deadlock-like situation
between the main test process and the tmtools process (from JBS):
> it seems these tests attach jstat to the main test process, the same process which reads the tool's stdout/stderr, so there is a possibility that this will deadlock: jstat-process produces more output than the buffer can hold, so it blocks till someone (the main process) reads it, while the main process waits till jstat completes.

The patch changes the serviceability/tmtools/share/common library (used
by all serviceability/tmtools tests) to redirect the tmtool's stdout and
stderr into files instead of using jdk.test.lib.process.OutputAnalyzer.
I've also added a bit of diagnostic output, so it will be easier to
analyze future failures.

webrev: http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8233462
testing:
 - serviceability/tmtools on windows-x64,linux-x64,macosx-x64,solaris-sparcv9
 - serviceability/tmtools 100 times on linux-x64-debug w/ '-Xcomp -ea -esa -XX:+TieredCompilation -XX:+DeoptimizeALot' (most of the failures have been seen on this configuration)

Thanks,
-- Igor

From coleen.phillimore at oracle.com Sat Nov 16 12:55:10 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Sat, 16 Nov 2019 07:55:10 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
Message-ID: <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>

On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>
>> Hi, I've been working on answers to these questions, so I'll start
>> with this one.
>>
>> The nmethodLocker keeps the nmethod from being reclaimed (made_zombie
>> or memory released) by the sweeper, but the nmethod could be
>> unloaded.  Unloading the nmethod clears the Method* _method field.
>
> Yes, I see it is done in the nmethod::make_unloaded().
>
>> The post_compiled_method_load event needs the _method field to look
>> at things like inlining and ScopeDesc fields.  If the nmethod is
>> unloaded, some of the oops are dead.  There are "holder" oops that
>> correspond to the metadata in the nmethod.  If these oops are dead,
>> causing the nmethod to get unloaded, then the metadata may not be valid.
>>
>> So my change 02 looks for a NULL nmethod._method field to tell
>> whether we can post information about the nmethod.
>>
>> There's code in nmethod.cpp like:
>>
>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>   if (_jmethod_id == NULL) {
>>     // Cache the jmethod_id since it can no longer be looked up once the
>>     // method itself has been marked for unloading.
>>     _jmethod_id = method()->jmethod_id();
>>   }
>>   return _jmethod_id;
>> }
>>
>> Which was added when post_method_load and unload were turned into
>> deferred events.
>
> Could we cache the jmethodID in the
> JvmtiDeferredEvent::compiled_method_load_event
> similarly as we do in the
> JvmtiDeferredEvent::compiled_method_unload_event?
> This would help to get rid of the dependency on the nmethod::_method.
> Do we depend on any other nmethod fields?

Yes, there is other nmethod metadata that we rely on to print inline information, and the function JvmtiCodeBlobEvents::build_jvmti_addr_location_map also depends on the nmethod because it uses the ScopeDesc data in it.

We do cache the jmethodID but that's not good enough.  See my last comment in the bug report.  The jmethodID can point to an unloaded method.

I tried a version of keeping the nmethod alive, but the GC folks will hate it.  And it doesn't work and I hate it.
My version 01 is the best, with the caveat that maybe it should check for _method == NULL instead of nmethod->is_alive().  I have to talk to Erik to see if there's a race with concurrent class unloading.

Any application that depends on a compiled method loading event on a class that could be unloaded is a buggy application.  Applications should not rely on when the JIT compiler decides to compile a method!  This happens to us for a stress test.  Most applications will get most of their compiled method loading events as they normally do.

Thanks,
Coleen

>
>
> Thanks,
> Serguei
>
>> I put more debugging in the bug to show this crash was from an
>> unloaded nmethod.
>>
>> Coleen
>>
>>
>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> I have some questions.
>>>
>>> Both the compiled method load and unload are posted as deferred events.
>>> Both events keep the nmethod alive until the ServiceThread processes
>>> the event.
>>>
>>> The implementation is:
>>>
>>> JvmtiDeferredEvent
>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>   . . .
>>>   // Keep the nmethod alive until the ServiceThread can process
>>>   // this deferred event.
>>>   nmethodLocker::lock_nmethod(nm);
>>>   return event;
>>> }
>>>
>>> JvmtiDeferredEvent
>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
>>> jmethodID id, const void* code) {
>>>   . . .
>>>   // Keep the nmethod alive until the ServiceThread can process
>>>   // this deferred event. This will keep the memory for the
>>>   // generated code from being reused too early. We pass
>>>   // zombie_ok == true here so that our nmethod that was just
>>>   // made into a zombie can be locked.
>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>   return event;
>>> }
>>>
>>> void JvmtiDeferredEvent::post() {
>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>          "Service thread must post enqueued events");
>>>   switch(_type) {
>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>       nmethod* nm = _event_data.compiled_method_load;
>>>       JvmtiExport::post_compiled_method_load(nm);
>>>       // done with the deferred event so unlock the nmethod
>>>       nmethodLocker::unlock_nmethod(nm);
>>>       break;
>>>     }
>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>       JvmtiExport::post_compiled_method_unload(
>>>         _event_data.compiled_method_unload.method_id,
>>>         _event_data.compiled_method_unload.code_begin);
>>>       // done with the deferred event so unlock the nmethod
>>>       nmethodLocker::unlock_nmethod(nm);
>>>       break;
>>>     }
>>>     . . .
>>>   }
>>> }
>>>
>>> Then I wonder how it is possible for the nmethod to be not alive here:
>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>> . . .
>>> 2173 // It's not safe to look at metadata for unloaded methods.
>>> 2174 if (!nm->is_alive()) {
>>> 2175 return;
>>> 2176 }
>>> At least, it looks like something else is broken.
>>> Am I missing something important here?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>> Summary: Don't post information which uses metadata from unloaded
>>>> nmethods
>>>>
>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>> failure without the fix).
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Mon Nov 18 02:30:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 18 Nov 2019 12:30:48 +1000
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
Message-ID:

Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/

This was a very difficult bug to track down and I want to publicly acknowledge and thank the jemalloc folk (users and developers) for continuing to investigate this issue from their side. Without their persistence this issue would have languished.

The thread stack_base() is the first address above the thread's stack. However, the "in stack" checks performed by Thread::on_local_stack and Thread::is_in_stack allowed the checked address to be equal to the stack_base() - which is not correct. Here's how this manifests as the bug:

- Let a JavaThread instance, T2, be allocated at the end of thread T1's stack i.e. at T1->stack_base()
  [This seems to be why this only reproduced with jemalloc.]
- Let T2 lock an inflated monitor
- Let T1 try to lock the same monitor
  - T1 would consider the _owner field value (T2) as being in its stack and so consider the monitor stack-locked by T1
  - And so both T1 and T2 would have ownership of the monitor allowing the monitor state (and application state) to be corrupted. This results in a range of hangs and crashes depending on the exact interleaving.

Interestingly Thread::is_in_usable_stack does not have this bug.

The bug can be tracked way back to JDK-6699669 as explained in the bug report. That issue also showed that the same bug existed in the SA implementations of these "on stack" checks.

Testing:
  - The reproducer from the bug report, using jemalloc, ran over 5000 times without failing in any way.
  - tiers 1-3 on all Oracle platforms
  - serviceability/sa tests

Thanks,
David
-----

From robbin.ehn at oracle.com Mon Nov 18 11:07:37 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 18 Nov 2019 12:07:37 +0100
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID: <831a0a84-b6a5-34d2-f6bd-4bacc2fa812c@oracle.com>

Looks good, thanks David!

/Robbin

On 11/18/19 3:30 AM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>
> This was a very difficult bug to track down and I want to publicly acknowledge
> and thank the jemalloc folk (users and developers) for continuing to investigate
> this issue from their side. Without their persistence this issue would have
> languished.
>
> The thread stack_base() is the first address above the thread's stack. However,
> the "in stack" checks performed by Thread::on_local_stack and
> Thread::is_in_stack allowed the checked address to be equal to the stack_base()
> - which is not correct. Here's how this manifests as the bug:
>
> - Let a JavaThread instance, T2, be allocated at the end of thread T1's stack
> i.e. at T1->stack_base()
>   [This seems to be why this only reproduced with jemalloc.]
> - Let T2 lock an inflated monitor
> - Let T1 try to lock the same monitor
>   - T1 would consider the _owner field value (T2) as being in its stack and so
> consider the monitor stack-locked by T1
>   - And so both T1 and T2 would have ownership of the monitor allowing the
> monitor state (and application state) to be corrupted. This results in a range
> of hangs and crashes depending on the exact interleaving.
>
> Interestingly Thread::is_in_usable_stack does not have this bug.
>
> The bug can be tracked way back to JDK-6699669 as explained in the bug report.
> That issue also showed that the same bug existed in the SA implementations of
> these "on stack" checks.
>
> Testing:
>   - The reproducer from the bug report, using jemalloc, ran over 5000 times
> without failing in any way.
>   - tiers 1-3 on all Oracle platforms
>   - serviceability/sa tests
>
> Thanks,
> David
> -----

From david.holmes at oracle.com Mon Nov 18 11:32:15 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 18 Nov 2019 21:32:15 +1000
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To: <831a0a84-b6a5-34d2-f6bd-4bacc2fa812c@oracle.com>
References: <831a0a84-b6a5-34d2-f6bd-4bacc2fa812c@oracle.com>
Message-ID:

Thanks Robbin!

David

On 18/11/2019 9:07 pm, Robbin Ehn wrote:
> Looks good, thanks David!
>
> /Robbin
>
> On 11/18/19 3:30 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
>> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>>
>> This was a very difficult bug to track down and I want to publicly
>> acknowledge and thank the jemalloc folk (users and developers) for
>> continuing to investigate this issue from their side. Without their
>> persistence this issue would have languished.
>>
>> The thread stack_base() is the first address above the thread's stack.
>> However, the "in stack" checks performed by Thread::on_local_stack and
>> Thread::is_in_stack allowed the checked address to be equal to the
>> stack_base() - which is not correct. Here's how this manifests as the
>> bug:
>>
>> - Let a JavaThread instance, T2, be allocated at the end of thread
>> T1's stack i.e. at T1->stack_base()
>>    [This seems to be why this only reproduced with jemalloc.]
>> - Let T2 lock an inflated monitor
>> - Let T1 try to lock the same monitor
>>    - T1 would consider the _owner field value (T2) as being in its
>> stack and so consider the monitor stack-locked by T1
>>    - And so both T1 and T2 would have ownership of the monitor
>> allowing the monitor state (and application state) to be corrupted.
>> This results in a range of hangs and crashes depending on the exact
>> interleaving.
>>
>> Interestingly Thread::is_in_usable_stack does not have this bug.
>>
>> The bug can be tracked way back to JDK-6699669 as explained in the bug
>> report. That issue also showed that the same bug existed in the SA
>> implementations of these "on stack" checks.
>>
>> Testing:
>>    - The reproducer from the bug report, using jemalloc, ran over 5000
>> times without failing in any way.
>>    - tiers 1-3 on all Oracle platforms
>>    - serviceability/sa tests
>>
>> Thanks,
>> David
>> -----

From thomas.stuefe at gmail.com Mon Nov 18 11:58:59 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 18 Nov 2019 12:58:59 +0100
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID:

This is evil :)

There might be more cases like this, e.g.

frame_x86.cpp frame::is_interpreted_frame_valid():

if (locals > thread->stack_base() || locals < (address) fp()) return false;

Also, I would have thought the little alloca() dance we do at the start of thread_native_entry() would push the first real frame down the stack.

The fix looks good.

Cheers, Thomas

On Mon, Nov 18, 2019 at 3:31 AM David Holmes wrote:

> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>
> This was a very difficult bug to track down and I want to publicly
> acknowledge and thank the jemalloc folk (users and developers) for
> continuing to investigate this issue from their side. Without their
> persistence this issue would have languished.
>
> The thread stack_base() is the first address above the thread's stack.
> However, the "in stack" checks performed by Thread::on_local_stack and
> Thread::is_in_stack allowed the checked address to be equal to the
> stack_base() - which is not correct. Here's how this manifests as the bug:
>
> - Let a JavaThread instance, T2, be allocated at the end of thread T1's
> stack i.e. at T1->stack_base()
>   [This seems to be why this only reproduced with jemalloc.]
> - Let T2 lock an inflated monitor
> - Let T1 try to lock the same monitor
>   - T1 would consider the _owner field value (T2) as being in its stack
> and so consider the monitor stack-locked by T1
>   - And so both T1 and T2 would have ownership of the monitor allowing
> the monitor state (and application state) to be corrupted. This results
> in a range of hangs and crashes depending on the exact interleaving.
>
> Interestingly Thread::is_in_usable_stack does not have this bug.
>
> The bug can be tracked way back to JDK-6699669 as explained in the bug
> report. That issue also showed that the same bug existed in the SA
> implementations of these "on stack" checks.
>
> Testing:
>   - The reproducer from the bug report, using jemalloc, ran over 5000
> times without failing in any way.
>   - tiers 1-3 on all Oracle platforms
>   - serviceability/sa tests
>
> Thanks,
> David
> -----
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From adinn at redhat.com Mon Nov 18 12:16:56 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 18 Nov 2019 12:16:56 +0000
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID: <709d92c8-64ed-a12d-46e9-054b203f0f72@redhat.com>

On 18/11/2019 02:30, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>
> This was a very difficult bug to track down and I want to publicly
> acknowledge and thank the jemalloc folk (users and developers) for
> continuing to investigate this issue from their side. Without their
> persistence this issue would have languished.
> . . .

Wow, nice work tracking that one down!

regards,


Andrew Dinn
-----------

From david.holmes at oracle.com Mon Nov 18 13:25:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 18 Nov 2019 23:25:48 +1000
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID:

Hi Thomas,

Thanks for taking a look.

On 18/11/2019 9:58 pm, Thomas Stüfe wrote:
> This is evil :)
>
> There might be more cases like this, e.g.
>
> frame_x86.cpp  frame::is_interpreted_frame_valid():
>
> if (locals > thread->stack_base() || locals < (address) fp()) return false;

Yes that might be a case where >= should be in use. I'll file another bug to check uses of stack_base().

> Also, I would have thought the little alloca() dance we do at the start
> of thread_native_entry() would push the first real frame down the stack.

I know nothing of that code. :)

> The fix looks good.

Thanks!
David
-----

> Cheers, Thomas
>
>
>
> On Mon, Nov 18, 2019 at 3:31 AM David Holmes wrote:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>
> This was a very difficult bug to track down and I want to publicly
> acknowledge and thank the jemalloc folk (users and developers) for
> continuing to investigate this issue from their side. Without their
> persistence this issue would have languished.
>
> The thread stack_base() is the first address above the thread's stack.
> However, the "in stack" checks performed by Thread::on_local_stack and
> Thread::is_in_stack allowed the checked address to be equal to the
> stack_base() - which is not correct. Here's how this manifests as the bug:
>
> - Let a JavaThread instance, T2, be allocated at the end of thread T1's
> stack i.e. at T1->stack_base()
>    [This seems to be why this only reproduced with jemalloc.]
> - Let T2 lock an inflated monitor
> - Let T1 try to lock the same monitor
>    - T1 would consider the _owner field value (T2) as being in its stack
> and so consider the monitor stack-locked by T1
>    - And so both T1 and T2 would have ownership of the monitor allowing
> the monitor state (and application state) to be corrupted. This results
> in a range of hangs and crashes depending on the exact interleaving.
>
> Interestingly Thread::is_in_usable_stack does not have this bug.
>
> The bug can be tracked way back to JDK-6699669 as explained in the bug
> report. That issue also showed that the same bug existed in the SA
> implementations of these "on stack" checks.
>
> Testing:
>    - The reproducer from the bug report, using jemalloc, ran over 5000
> times without failing in any way.
>    - tiers 1-3 on all Oracle platforms
>    - serviceability/sa tests
>
> Thanks,
> David
> -----
>

From thomas.stuefe at gmail.com Mon Nov 18 13:31:13 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 18 Nov 2019 14:31:13 +0100
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID:

Hi David,

On Mon, Nov 18, 2019 at 2:26 PM David Holmes wrote:

> Hi Thomas,
>
> Thanks for taking a look.
>
> On 18/11/2019 9:58 pm, Thomas Stüfe wrote:
> > This is evil :)
> >
> > There might be more cases like this, e.g.
> >
> > frame_x86.cpp frame::is_interpreted_frame_valid():
> >
> > if (locals > thread->stack_base() || locals < (address) fp()) return false;
>
> Yes that might be a case where >= should be in use. I'll file another
> bug to check uses of stack_base().
>

Many of them could use Thread::is_in_usable_stack(), I assume.

> > Also, I would have thought the little alloca() dance we do at the start
> > of thread_native_entry() would push the first real frame down the stack.
>
> I know nothing of that code. :)
>

See os_linux.cpp:

...
  // Try to randomize the cache line index of hot stack frames.
  // This helps when threads of the same stack traces evict each other's
  // cache lines. The threads can be either from the same JVM instance, or
  // from different JVM instances. The benefit is especially true for
  // processors with hyperthreading technology.
  static int counter = 0;
  int pid = os::current_process_id();
  alloca(((pid ^ counter++) & 7) * 128);

> > The fix looks good.
>
> Thanks!
>
> David
> -----
> > Cheers, Thomas
> >

Cheers, Thomas

> >
> >
> >
> > On Mon, Nov 18, 2019 at 3:31 AM David Holmes wrote:
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> > webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
> >
> > This was a very difficult bug to track down and I want to publicly
> > acknowledge and thank the jemalloc folk (users and developers) for
> > continuing to investigate this issue from their side. Without their
> > persistence this issue would have languished.
> >
> > The thread stack_base() is the first address above the thread's stack.
> > However, the "in stack" checks performed by Thread::on_local_stack and
> > Thread::is_in_stack allowed the checked address to be equal to the
> > stack_base() - which is not correct. Here's how this manifests as the bug:
> >
> > - Let a JavaThread instance, T2, be allocated at the end of thread T1's
> > stack i.e. at T1->stack_base()
> >   [This seems to be why this only reproduced with jemalloc.]
> > - Let T2 lock an inflated monitor
> > - Let T1 try to lock the same monitor
> >   - T1 would consider the _owner field value (T2) as being in its stack
> > and so consider the monitor stack-locked by T1
> >   - And so both T1 and T2 would have ownership of the monitor allowing
> > the monitor state (and application state) to be corrupted. This results
> > in a range of hangs and crashes depending on the exact interleaving.
> >
> > Interestingly Thread::is_in_usable_stack does not have this bug.
> >
> > The bug can be tracked way back to JDK-6699669 as explained in the bug
> > report. That issue also showed that the same bug existed in the SA
> > implementations of these "on stack" checks.
> >
> > Testing:
> >   - The reproducer from the bug report, using jemalloc, ran over 5000
> > times without failing in any way.
> >   - tiers 1-3 on all Oracle platforms
> >   - serviceability/sa tests
> >
> > Thanks,
> > David
> > -----
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.daugherty at oracle.com Mon Nov 18 17:36:46 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 18 Nov 2019 12:36:46 -0500
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To:
References:
Message-ID:

Hi David,

On 11/17/19 9:30 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/

src/hotspot/share/runtime/thread.hpp
    Nice catch!

src/hotspot/share/runtime/thread.cpp
    Nice catch!

    Not your issue, but these two lines feel strange/wrong:

        L1008:   // Allow non Java threads to call this without stack_base
        L1009:   if (_stack_base == NULL) return true;

    When _stack_base is NULL, any 'adr' is in the caller's stack? The
    comment is not helping understand why this is so...

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JavaThread.java
    Nice catch!

    Again, not your issue, but these four lines are questionable:

        L383     Address sp      = lastSPDbg();
        L384     Address stackBase = getStackBase();
        L385     // Be robust
        L386     if (sp == null) return false;

    I can see why a NULL sp would cause a "false" return since obviously
    something is amiss in the frame. However, the C++ code doesn't make
    this check so why does the SA code?

    And this code doesn't check stackBase == NULL so it's not matching
    the C++ code either.

Thumbs up on the change itself. My queries above and below might warrant
new bugs or RFEs to be filed.

>
> This was a very difficult bug to track down and I want to publicly
> acknowledge and thank the jemalloc folk (users and developers) for
> continuing to investigate this issue from their side. Without their
> persistence this issue would have languished.

You also deserve thanks for sticking with this bug:

    Thanks David!!

> The thread stack_base() is the first address above the thread's stack.
> However, the "in stack" checks performed by Thread::on_local_stack and
> Thread::is_in_stack allowed the checked address to be equal to the
> stack_base() - which is not correct. Here's how this manifests as the
> bug:
>
> - Let a JavaThread instance, T2, be allocated at the end of thread
> T1's stack i.e. at T1->stack_base()
>   [This seems to be why this only reproduced with jemalloc.]
> - Let T2 lock an inflated monitor
> - Let T1 try to lock the same monitor
>   - T1 would consider the _owner field value (T2) as being in its
> stack and so consider the monitor stack-locked by T1
>   - And so both T1 and T2 would have ownership of the monitor allowing
> the monitor state (and application state) to be corrupted. This
> results in a range of hangs and crashes depending on the exact
> interleaving.

Ouch!

So I was wondering how this bug could happen with the thread alignment
logic that we have in place... search for the _real_malloc_address stuff...
And then I noticed that the logic only kicks in when UseBiasedLocking == true
(and this bug says it doesn't happen with -XX:-UseBiasedLocking):

src/hotspot/share/runtime/thread.cpp:

// ======= Thread ========
// Support for forcing alignment of thread objects for biased locking
void* Thread::allocate(size_t size, bool throw_excpt, MEMFLAGS flags) {
  if (UseBiasedLocking) {
    const size_t alignment = markWord::biased_lock_alignment;
    size_t aligned_size = size + (alignment - sizeof(intptr_t));
    void* real_malloc_addr = throw_excpt? AllocateHeap(aligned_size, flags, CURRENT_PC)
                                          : AllocateHeap(aligned_size, flags, CURRENT_PC, AllocFailStrategy::RETURN_NULL);
    void* aligned_addr     = align_up(real_malloc_addr, alignment);
    assert(((uintptr_t) aligned_addr + (uintptr_t) size) <=
           ((uintptr_t) real_malloc_addr + (uintptr_t) aligned_size),
           "JavaThread alignment code overflowed allocated storage");
    if (aligned_addr != real_malloc_addr) {
      log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to " INTPTR_FORMAT,
                              p2i(real_malloc_addr),
                              p2i(aligned_addr));
    }
    ((Thread*) aligned_addr)->_real_malloc_address = real_malloc_addr;
    return aligned_addr;
  } else {
    return throw_excpt? AllocateHeap(size, flags, CURRENT_PC)
                       : AllocateHeap(size, flags, CURRENT_PC, AllocFailStrategy::RETURN_NULL);
  }
}

The logging logic above:

    if (aligned_addr != real_malloc_addr) {
      log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to " INTPTR_FORMAT,
                              p2i(real_malloc_addr),
                              p2i(aligned_addr));
    }

allows for real_malloc_addr to be the same as aligned_addr sometimes
(and no log message is issued), but I'm not sure from spelunking in
code whether it's really possible for:

    void* aligned_addr     = align_up(real_malloc_addr, alignment);

to return aligned_addr == real_malloc_addr. In other words, if
real_malloc_addr is already aligned perfectly, does align_up()
still change that value?

If it is possible for (aligned_addr == real_malloc_addr), then it is
possible for this bug to happen without jemalloc. I've convinced
myself that this is possible because of this line:

    size_t aligned_size = size + (alignment - sizeof(intptr_t));

If real_malloc_addr is already aligned perfectly and align_up() always
changed the input address, then the aligned_size would be too small by
sizeof(intptr_t) and we would have seen a buffer overwrite like that
over the many, many years.

So my conclusion is that it should be possible for this bug to happen
without jemalloc, but it would have to be rare.
> Interestingly Thread::is_in_usable_stack does not have this bug.

So we have Thread::is_in_usable_stack(), Thread::on_local_stack() and
Thread::is_in_stack()? I haven't compared all three side by side, but
there might be some cleanup work that can be done here (in a different
bug).

>
> The bug can be tracked way back to JDK-6699669 as explained in the bug
> report. That issue also showed that the same bug existed in the SA
> implementations of these "on stack" checks.

Ouch! JDK-6699669 was fixed in jdk7-B56 and looks like it was pushed to
the jdk6u train... so this bug goes back quite a ways...

Outstanding hunt David!

Dan

>
> Testing:
>   - The reproducer from the bug report, using jemalloc, ran over 5000
> times without failing in any way.
>   - tiers 1-3 on all Oracle platforms
>   - serviceability/sa tests
>
> Thanks,
> David
> -----

From alexey.menkov at oracle.com Mon Nov 18 19:30:14 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 18 Nov 2019 11:30:14 -0800
Subject: RFR(XS): JDK-8234358: Update ProblemList entry for NashornPopFrameTest
In-Reply-To:
References: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com>
Message-ID:

Hi Chris,

It makes sense.
I created a new issue and closed JDK-8187143 as a dup of JDK-8225620

So changing this RFR to be RFR for the new issue:
https://bugs.openjdk.java.net/browse/JDK-8234358

The change is the same:

--- a/test/jdk/ProblemList.txt Fri Nov 15 14:22:24 2019 -0800
+++ b/test/jdk/ProblemList.txt Fri Nov 15 15:20:28 2019 -0800
@@ -845,7 +845,7 @@

 com/sun/jdi/RepStep.java 8043571 generic-all

-com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all
+com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all


############################################################################

--alex

On 11/15/2019 17:31, Chris Plummer wrote:
> Hi Alex,
>
> Actually I take that back. Removing from the problem list should either
> be done as part of a bug fix, for which there is none here, or done for
> a bug specifically for removing from the problem list. Since you are not
> actually fixing JDK-8187143 (it seems to no longer be a bug), I suggest
> you close it as CNR and file a new bug for the problem list update.
>
> thanks,
>
> Chris
>
> On 11/15/19 5:26 PM, Chris Plummer wrote:
>> Looks good.
>>
>> Chris
>>
>> On 11/15/19 3:58 PM, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review a small fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8187143
>>>
>>> The issue is not reproducible from JDK10, but now
>>> NashornPopFrameTest fails with "MemberName required for invokeVirtual
>>> etc" error, i.e. this is a dup of
>>> https://bugs.openjdk.java.net/browse/JDK-8225620
>>>
>>> The fix updates ProblemList.txt to point to 8225620
>>>
>>> The diff:
>>> --- a/test/jdk/ProblemList.txt    Fri Nov 15 14:22:24 2019 -0800
>>> +++ b/test/jdk/ProblemList.txt    Fri Nov 15 15:21:53 2019 -0800
>>> @@ -845,7 +845,7 @@
>>>
>>>  com/sun/jdi/RepStep.java 8043571 generic-all
>>>
>>> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all
>>> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>>>
>>>
>>> ############################################################################
>>>
>>>
>>>
>>> --alex
>>
>>
>
>

From chris.plummer at oracle.com Mon Nov 18 20:47:39 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 18 Nov 2019 12:47:39 -0800
Subject: RFR(XS): JDK-8234358: Update ProblemList entry for NashornPopFrameTest
In-Reply-To:
References: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com>
Message-ID:

Looks good.

Chris

On 11/18/19 11:30 AM, Alex Menkov wrote:
> Hi Chris,
>
> It makes sense.
> I created a new issue and closed JDK-8187143 as a dup of JDK-8225620
>
> So changing this RFR to be RFR for the new issue:
> https://bugs.openjdk.java.net/browse/JDK-8234358
>
> The change is the same:
>
> --- a/test/jdk/ProblemList.txt  Fri Nov 15 14:22:24 2019 -0800
> +++ b/test/jdk/ProblemList.txt  Fri Nov 15 15:20:28 2019 -0800
> @@ -845,7 +845,7 @@
>
>  com/sun/jdi/RepStep.java 8043571 generic-all
>
> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all
> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>
>
> ############################################################################
>
>
>
> --alex
>
> On 11/15/2019 17:31, Chris Plummer wrote:
>> Hi Alex,
>>
>> Actually I take that back. Removing from the problem list should
>> either be done as part of a bug fix, for which there is none here, or
>> done for a bug specifically for removing from the problem list. Since
>> you are not actually fixing JDK-8187143 (it seems to no longer be a
>> bug), I suggest you close it as CNR and file a new bug for the
>> problem list update.
>>
>> thanks,
>>
>> Chris
>>
>> On 11/15/19 5:26 PM, Chris Plummer wrote:
>>> Looks good.
>>>
>>> Chris
>>>
>>> On 11/15/19 3:58 PM, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review a small fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8187143
>>>>
>>>> The issue is not reproducible from JDK10, but now
>>>> NashornPopFrameTest fails with "MemberName required for
>>>> invokeVirtual etc" error, i.e. this is a dup of
>>>> https://bugs.openjdk.java.net/browse/JDK-8225620
>>>>
>>>> The fix updates ProblemList.txt to point to 8225620
>>>>
>>>> The diff:
>>>> --- a/test/jdk/ProblemList.txt    Fri Nov 15 14:22:24 2019 -0800
>>>> +++ b/test/jdk/ProblemList.txt    Fri Nov 15 15:21:53 2019 -0800
>>>> @@ -845,7 +845,7 @@
>>>>
>>>>  com/sun/jdi/RepStep.java 8043571 generic-all
>>>>
>>>> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all
>>>> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>>>>
>>>>
>>>> ############################################################################
>>>>
>>>>
>>>>
>>>> --alex
>>>
>>>
>>
>>

From serguei.spitsyn at oracle.com Mon Nov 18 23:56:53 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 18 Nov 2019 15:56:53 -0800
Subject: RFR(S) : 8233462 : serviceability/tmtools/jstat tests times out with -Xcomp
In-Reply-To: <2B7FD259-FBC2-4850-B1B3-81D1669F150B@oracle.com>
References: <2B7FD259-FBC2-4850-B1B3-81D1669F150B@oracle.com>
Message-ID: <728f5be0-2988-02e4-43b1-64a65c7b322e@oracle.com>

Hi Igor,

Looks good.
Thank you for taking care of this!

Thanks,
Serguei

On 11/15/19 23:47, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00
>> 33 lines changed: 1 ins; 14 del; 18 mod;
> Hi all,
>
> could you please review this small fix for the tmtools test library?
> tmtools tests are believed to fail due to a deadlock-like situation between the main test process and the tmtools process:
> (from JBS)
>> it seems these tests attach jstat to the main test process, the same process which reads the tool's stdout/stderr, so there is a possibility that this will deadlock: the jstat process produces more output than the buffer can hold, so it blocks till someone (the main process) reads it, while the main process waits till jstat completes.
> the patch changes the serviceability/tmtools/share/common library (used by all serviceability/tmtools tests) to redirect the tmtool's stdout and stderr into files instead of using jdk.test.lib.process.OutputAnalyzer; I've also added a bit of diagnostic output, so it will be easier to analyze future failures.
> > webrev: http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00 > JBS: https://bugs.openjdk.java.net/browse/JDK-8233462 > testing: > - serviceability/tmtools on windows-x64,linux-x64,macosx-x64,solaris-sparcv9 > - serviceability/tmtools 100 times on linux-x64-debug w/ '-Xcomp -ea -esa -XX:+TieredCompilation -XX:+DeoptimizeALot' (most of failures have been seen on this configuration) > > Thanks, > -- Igor From serguei.spitsyn at oracle.com Tue Nov 19 00:00:36 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 18 Nov 2019 16:00:36 -0800 Subject: RFR(XS): JDK-8234358: Update ProblemList entry for NashornPopFrameTest In-Reply-To: References: <38d122c1-bf71-ddbc-1e59-904e3d571909@oracle.com> <4f1ee4ad-745a-d248-1e90-e79090234e94@oracle.com> Message-ID: <887da04b-7cb1-c801-f40f-05278039fdd8@oracle.com> +1 Thanks, Serguei On 11/18/19 12:47, Chris Plummer wrote: > Looks good. > > Chris > > On 11/18/19 11:30 AM, Alex Menkov wrote: >> Hi Chris, >> >> It makes sense. >> I created new issue and closed JDK-8187143 as a dup of JDK-8225620 >> >> So changing this RFR to be RFR for the new issue: >> https://bugs.openjdk.java.net/browse/JDK-8234358 >> >> The changes is the same: >> >> --- a/test/jdk/ProblemList.txt? Fri Nov 15 14:22:24 2019 -0800 >> +++ b/test/jdk/ProblemList.txt? Fri Nov 15 15:20:28 2019 -0800 >> @@ -845,7 +845,7 @@ >> >> ?com/sun/jdi/RepStep.java 8043571 generic-all >> >> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all >> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all >> >> >> ############################################################################ >> >> >> >> --alex >> >> On 11/15/2019 17:31, Chris Plummer wrote: >>> Hi Alex, >>> >>> Actually I take that back. Removing from the problem list should >>> either be done as part of a bug fix, for which there is none here, >>> or done for a bug specifically for removing from the problem list. 
>>> Since you are not actually fixing JDK-8187143 (it seems to no longer >>> be a bug), I suggest you close it as CNR and file a new bug for the >>> problem list update. >>> >>> thanks, >>> >>> Chris >>> >>> On 11/15/19 5:26 PM, Chris Plummer wrote: >>>> Looks good. >>>> >>>> Chris >>>> >>>> On 11/15/19 3:58 PM, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review small fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8187143 >>>>> >>>>> The issue is not reproducible from JDK10, but now test fails >>>>> NashornPopFrameTest fails with "MemberName required for >>>>> invokeVirtual etc" error, i.e. this is a dup of >>>>> https://bugs.openjdk.java.net/browse/JDK-8225620 >>>>> >>>>> The fix updates ProblemList.txt to point to 8225620 >>>>> >>>>> The diff: >>>>> --- a/test/jdk/ProblemList.txt??? Fri Nov 15 14:22:24 2019 -0800 >>>>> +++ b/test/jdk/ProblemList.txt??? Fri Nov 15 15:21:53 2019 -0800 >>>>> @@ -845,7 +845,7 @@ >>>>> >>>>> ?com/sun/jdi/RepStep.java 8043571 generic-all >>>>> >>>>> -com/sun/jdi/NashornPopFrameTest.java 8187143 generic-all >>>>> +com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all >>>>> >>>>> >>>>> ############################################################################ >>>>> >>>>> >>>>> >>>>> --alex >>>> >>>> >>> >>> > > From igor.ignatyev at oracle.com Tue Nov 19 00:01:41 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 18 Nov 2019 16:01:41 -0800 Subject: RFR(S) : 8233462 : serviceability/tmtools/jstat tests times out with -Xcomp In-Reply-To: <728f5be0-2988-02e4-43b1-64a65c7b322e@oracle.com> References: <2B7FD259-FBC2-4850-B1B3-81D1669F150B@oracle.com> <728f5be0-2988-02e4-43b1-64a65c7b322e@oracle.com> Message-ID: Hi Serguei, Thank you for your review and discussion around this issue. -- Igor > On Nov 18, 2019, at 3:56 PM, serguei.spitsyn at oracle.com wrote: > > Hi Igor, > > Looks good. > Thank you for taking care about this! 
> > Thanks, > Serguei > > > On 11/15/19 23:47, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00 >>> 33 lines changed: 1 ins; 14 del; 18 mod; >> Hi all, >> >> could you please review this small fix for tmtools testlibrary? >> tmtools tests are believed to fail due to a deadlock-like situation b/w main test process and tmtools process: >> (from JBS) >>> it seems these tests attach jstat to the main test process, the same process which reads the tool's stdout/stderr, so there is a possibility that this will deadlock: jstat-process produces more output than the buffer can hold, so it blocks till someone (the main process reads it), while the main process waits till jstat completes. >> the patch changes serviceability/tmtools/share/common library (used by all serviceability/tmtools) to redirect tmtool's stdout and stderr into files instead of using jdk.test.lib.process.OutputAnalyzer; I've also added a bit of diagnostic output, so it will be easier to analyze future failures. >> >> webrev: http://cr.openjdk.java.net/~iignatyev//8233462/webrev.00 >> JBS: https://bugs.openjdk.java.net/browse/JDK-8233462 >> testing: >> - serviceability/tmtools on windows-x64,linux-x64,macosx-x64,solaris-sparcv9 >> - serviceability/tmtools 100 times on linux-x64-debug w/ '-Xcomp -ea -esa -XX:+TieredCompilation -XX:+DeoptimizeALot' (most of failures have been seen on this configuration) >> >> Thanks, >> -- Igor > From david.holmes at oracle.com Tue Nov 19 01:59:38 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 19 Nov 2019 11:59:38 +1000 Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1) In-Reply-To: References: Message-ID: <9188fde8-734c-965a-f392-fad3bf04204c@oracle.com> Hi Dan, Thanks for taking a look at this. On 19/11/2019 3:36 am, Daniel D. 
Daugherty wrote: > Hi David, > > > On 11/17/19 9:30 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355 >> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/ > > src/hotspot/share/runtime/thread.hpp > ??? Nice catch! > > src/hotspot/share/runtime/thread.cpp > ??? Nice catch! > > ??? Not your issue, but these two lines feel strange/wrong: > > ?? ? ?? L1008: ? // Allow non Java threads to call this without stack_base > ??????? L1009: ? if (_stack_base == NULL) return true; > > ??? When _stack_base is NULL, any 'adr' is in the caller's stack? The > ??? comment is not helping understand why this is so... > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JavaThread.java > ??? Nice catch! > > ??? Again, not your issue, but these four lines are questionable: > > ? ? ? ? L383???? Address sp????? = lastSPDbg(); > ??????? L384???? Address stackBase = getStackBase(); > ??????? L385???? // Be robust > ??????? L386???? if (sp == null) return false; > > ??? I can see why a NULL sp would cause a "false" return since obviously > ??? something is a amiss in the frame. However, the C++ code doesn't make > ??? this check so why does the SA code? > > ??? And this code doesn't check stackBase == NULL so it's not matching > ??? the C++ code either. > > > Thumbs up on the change itself. My queries above and below might warrant > new bugs or RFEs to be filed. I have filed a bug to examine this and the issue Thomas flagged: https://bugs.openjdk.java.net/browse/JDK-8234372 "Investigate use of Thread::stack_base() and queries for "in stack"" >> >> This was a very difficult bug to track down and I want to publicly >> acknowledge and thank the jemalloc folk (users and developers) for >> continuing to investigate this issue from their side. Without their >> persistence this issue would have languished. > > You also deserve thanks for sticking with this bug: Thanks David!! 
Thanks, but I had written this off as a jemalloc issue until they provided the additional data. >> The thread stack_base() is the first address above the thread's stack. >> However, the "in stack" checks performed by Thread::on_local_stack and >> Thread::is_in_stack allowed the checked address to be equal to the >> stack_base() - which is not correct. Here's how this manifests as the >> bug: >> >> - Let a JavaThread instance, T2, be allocated at the end of thread >> T1's stack i.e. at T1->stack_base() >> ? [This seems to be why this only reproduced with jemalloc.] >> - Let T2 lock an inflated monitor >> - Let T1 try to lock the same monitor >> ? - T1 would consider the _owner field value (T2) as being in its >> stack and so consider the monitor stack-locked by T1 >> ? - And so both T1 and T2 would have ownership of the monitor allowing >> the monitor state (and application state) to be corrupted. This >> results in a range of hangs and crashes depending on the exact >> interleaving. > > Ouch! > > So I was wondering how this bug could happen with the thread alignment > logic that we have in place... search for the _real_malloc_address stuff... > > And then I noticed that the logic only kicks in when UseBiasedLocking == > true > (and this bug says it doesn't happen with -XX:-UseBiasedLocking): Actually that is a false claim. As per my comment on "2019-10-09 14:09" it does reproduce with biased-locking disabled but much more rarely. > src/hotspot/share/runtime/thread.cpp: > > // ======= Thread ======== > // Support for forcing alignment of thread objects for biased locking > void* Thread::allocate(size_t size, bool throw_excpt, MEMFLAGS flags) { > ? if (UseBiasedLocking) { > ??? const size_t alignment = markWord::biased_lock_alignment; > ??? size_t aligned_size = size + (alignment - sizeof(intptr_t)); > ??? void* real_malloc_addr = throw_excpt? AllocateHeap(aligned_size, > flags, CURRENT_PC) > ????????????????????????????????????????? 
: AllocateHeap(aligned_size, > flags, CURRENT_PC, > AllocFailStrategy::RETURN_NULL); > ??? void* aligned_addr???? = align_up(real_malloc_addr, alignment); > ??? assert(((uintptr_t) aligned_addr + (uintptr_t) size) <= > ?????????? ((uintptr_t) real_malloc_addr + (uintptr_t) aligned_size), > ?????????? "JavaThread alignment code overflowed allocated storage"); > ??? if (aligned_addr != real_malloc_addr) { > ????? log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to " > INTPTR_FORMAT, > ????????????????????????????? p2i(real_malloc_addr), > ????????????????????????????? p2i(aligned_addr)); > ??? } > ??? ((Thread*) aligned_addr)->_real_malloc_address = real_malloc_addr; > ??? return aligned_addr; > ? } else { > ??? return throw_excpt? AllocateHeap(size, flags, CURRENT_PC) > ?????????????????????? : AllocateHeap(size, flags, CURRENT_PC, > AllocFailStrategy::RETURN_NULL); > ? } > } > > > The logging logic above: > > ??? if (aligned_addr != real_malloc_addr) { > ????? log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to " > INTPTR_FORMAT, > ????????????????????????????? p2i(real_malloc_addr), > ????????????????????????????? p2i(aligned_addr)); > ??? } > > allows for real_malloc_addr to be the same as aligned_addr sometimes > (and no log message is issued), but I'm not sure from spelunking in > code whether it's really possible for: > > ??? void* aligned_addr???? = align_up(real_malloc_addr, alignment); > > to return aligned_addr == real_malloc_addr. In other words, if > real_malloc_addr is already aligned perfectly, does align_up() still > change that value? > > If it is possible for (aligned_addr == real_malloc_addr), then it is > possible for this bug to happen without jemalloc. > > I've convinced myself that this is possible because of this line: > > ??? 
size_t aligned_size = size + (alignment - sizeof(intptr_t)); > > If real_malloc_addr is already aligned perfectly and align_up() > always changed the input address, then the aligned_size would be > too small by sizeof(intptr_t) and we would have seen a buffer > overwrite like that over the many, many years. > > So my conclusion is that it should be possible for this bug to > happen without jemalloc, but it would have to be rare. I'm a little surprised that we specialize this way as I thought the 128/256 byte alignment was necessary regardless of biased-locking. Further even if running without biased-locking we still have alignment requirements for the lock-bits, age-bits etc, that do not seem to be captured by the above code unless AllocateHeap somehow already provides such alignment by default. (I'm also unclear why this doesn't fail in debug builds but just assume the allocation patterns are different.) Anyway, if the allocator already returns a suitably aligned block of memory then I am assuming the above code doesn't actually need to do anything. So theoretically, without having advance knowledge of the details of the allocator, yes this bug could happen for any allocator. Thanks, David ----- > >> Interestingly Thread::is_in_usable_stack does not have this bug. > > So we have Thread::is_in_usable_stack(), Thread::on_local_stack() and > Thread::is_in_stack()? I haven't compared all three side by side, but > there might be some cleanup work that can be done here (in a different > bug). > > >> >> The bug can be tracked way back to JDK-6699669 as explained in the bug >> report. That issue also showed that the same bug existed in the SA >> implementations of these "on stack" checks. > > Ouch! JDK-6699669 was fixed in jdk7-B56 and looks like it was pushed > to the jdk6u train... so this bug goes back quite a ways... > > Outstanding hunt David! > > Dan > > >> >> Testing: >> ? 
- The reproducer from the bug report, using jemalloc, ran over 5000 >> times without failing in any way. >> ? - tiers 1-3 on all Oracle platforms >> ? - serviceability/sa tests >> >> Thanks, >> David >> ----- > From serguei.spitsyn at oracle.com Tue Nov 19 03:03:43 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 18 Nov 2019 19:03:43 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From coleen.phillimore at oracle.com Tue Nov 19 03:09:25 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 18 Nov 2019 22:09:25 -0500 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> Message-ID: <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com> Hi Serguei, Sorry for not sending an update.? I talked to Erik and am working on a version that keeps the nmethod from being unloaded while it's in the deferred event queue, with a version that the GC people will like, and I like.? I'm testing it out now. Thanks! Coleen On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > Sorry for the latency, I had to investigate it a little bit. > I still have some doubt your fix is right thing to do. 
> > > On 11/16/19 04:55, coleen.phillimore at oracle.com wrote: >> >> >> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Coleen, >>> >>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi, I've been working on answers to these questions, so I'll start >>>> with this one. >>>> >>>> The nmethodLocker keeps the nmethod from being reclaimed >>>> (made_zombie or memory released) by the sweeper, but the nmethod >>>> could be unloaded.? Unloading the nmethod clears the Method* >>>> _method field. >>> >>> Yes, I see it is done in the nmethod::make_unloaded(). >>> >>>> The post_compiled_method_load event needs the _method field to look >>>> at things like inlining and ScopeDesc fields.?? If the nmethod is >>>> unloaded, some of the oops are dead.? There are "holder" oops that >>>> correspond to the metadata in the nmethod.? If these oops are dead, >>>> causing the nmethod to get unloaded, then the metadata may not be >>>> valid. >>>> >>>> So my change 02 looks for a NULL nmethod._method field to tell >>>> whether we can post information about the nmethod. >>>> >>>> There's code in nmethod.cpp like: >>>> >>>> jmethodID nmethod::get_and_cache_jmethod_id() { >>>> ? if (_jmethod_id == NULL) { >>>> ??? // Cache the jmethod_id since it can no longer be looked up >>>> once the >>>> ??? // method itself has been marked for unloading. >>>> ??? _jmethod_id = method()->jmethod_id(); >>>> ? } >>>> ? return _jmethod_id; >>>> } >>>> >>>> Which was added when post_method_load and unload were turned into >>>> deferred events. >>> >>> Could we cache the jmethodID in the >>> JvmtiDeferredEvent::compiled_method_load_event >>> similarly as we do in the >>> JvmtiDeferredEvent::compiled_method_unload_event? >>> This would help to get rid of the dependency on the nmethod::_method. >>> Do we depend on any other nmethod fields? 
>> >> Yes, there are other nmethod metadata that we rely on to print inline >> information, and this function >> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses >> the ScopeDesc data in the nmethod. > > One possible approach is to prepare and cache all this information > in the nmethod::post_compiled_method_load_event() before the > JvmtiDeferredEvent::compiled_method_load_event() is called. > The event parameters are: > typedef struct { > const void* start_address; > jlocation location; > } jvmtiAddrLocationMap; > CompiledMethodLoad(jvmtiEnv *jvmti_env, > jmethodID method, > jint code_size, > const void* code_addr, > jint map_length, > const jvmtiAddrLocationMap* map, > const void* compile_info) > Some of these addresses above could be not accessible when an event is > posted. > Not sure yet if it is Okay. > The question is if this kind of refactoring is worth and right thing > to do. > >> >> We do cache the jmethodID but that's not good enough.? See my last >> comment in the bug report.? The jmethodID can point to an unloaded >> method. > > This looks like it is done a little bit late. > It'd better to do it before the event is deferred (see above). > >> I tried a version of keeping the nmethod alive, but the GC folks will >> hate it.? And it doesn't work and I hate it. > > From serviceability point of view this is the best and most consistent > approach. > I seems to me, it was initially designed this way. > The downside is it adds some extra complexity to the GC. > >> My version 01 is the best, with the caveat that maybe it should check >> for _method == NULL instead of nmethod->is_alive().? I have to talk >> to Erik to see if there's a race with concurrent class unloading. >> >> Any application that depends on a compiled method loading event on a >> class that could be unloaded is a buggy application. Applications >> should not rely on when the JIT compiler decides to compile a >> method!? This happens to us for a stress test.? 
Most applications >> will get most of their compiled method loading events as they >> normally do. > > It is not an application that relies on the compiled method loading event. > It is about profiling tools to be able to get correct information > about what is going on with compilations. > My concern is that if we skip such compiled method load events then > profilers have no way > to find out there many unneeded compilations that are thrown away > without any real use. > Also, it is not clear what happens with the subsequent compiled method > unload events. > Are they going to be skipped as well or they can appear and confuse > profilers? > > > Thanks, > Serguei >> >> Thanks, >> Coleen >> >>> >>> >>> Thanks, >>> Serguei >>> >>>> I put more debugging in the bug to show this crash was from an >>>> unloaded nmethod. >>>> >>>> Coleen >>>> >>>> >>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Coleen, >>>>> >>>>> I have some questions. >>>>> >>>>> Both the compiler method load and unload are posted as deferred >>>>> events. >>>>> Both events keep the nmethod alive until the ServiceThread >>>>> processes the event. >>>>> >>>>> The implementation is: >>>>> >>>>> JvmtiDeferredEvent >>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) { >>>>> ? . . . >>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>> ? // this deferred event. >>>>> ? nmethodLocker::lock_nmethod(nm); >>>>> ? return event; >>>>> } >>>>> >>>>> JvmtiDeferredEvent >>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, >>>>> jmethodID id, const void* code) { >>>>> ? . . . >>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>> ? // this deferred event. This will keep the memory for the >>>>> ? // generated code from being reused too early. We pass >>>>> ? // zombie_ok == true here so that our nmethod that was just >>>>> ? // made into a zombie can be locked. >>>>> ? 
nmethodLocker::lock_nmethod(nm, true /* zombie_ok */); >>>>> ? return event; >>>>> } >>>>> >>>>> void JvmtiDeferredEvent::post() { >>>>> assert(ServiceThread::is_service_thread(Thread::current()), >>>>> ???????? "Service thread must post enqueued events"); >>>>> ? switch(_type) { >>>>> ??? case TYPE_COMPILED_METHOD_LOAD: { >>>>> ????? nmethod* nm = _event_data.compiled_method_load; >>>>> ????? JvmtiExport::post_compiled_method_load(nm); >>>>> ????? // done with the deferred event so unlock the nmethod >>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>> ????? break; >>>>> ??? } >>>>> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >>>>> ????? nmethod* nm = _event_data.compiled_method_unload.nm; >>>>> ????? JvmtiExport::post_compiled_method_unload( >>>>> ??????? _event_data.compiled_method_unload.method_id, >>>>> ??????? _event_data.compiled_method_unload.code_begin); >>>>> ????? // done with the deferred event so unlock the nmethod >>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>> ????? break; >>>>> ??? } >>>>> ??? . . . >>>>> ? } >>>>> } >>>>> >>>>> Then I wonder how is it possible for the nmethod to be not alive >>>>> here?: >>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { >>>>> . . . >>>>> 2173 // It's not safe to look at metadata for unloaded methods. >>>>> 2174 if (!nm->is_alive()) { >>>>> 2175 return; >>>>> 2176 } >>>>> At least, it lokks like something else is broken. >>>>> Do I miss something important here? >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Don't post information which uses metadata from unloaded >>>>>> nmethods >>>>>> >>>>>> Tested tier1-3 and 100 times with test that failed (reproduced >>>>>> failure without the fix). 
>>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Nov 19 03:14:30 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 18 Nov 2019 19:14:30 -0800 Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load In-Reply-To: <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com> References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Nov 19 04:34:39 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 18 Nov 2019 20:34:39 -0800 Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1) In-Reply-To: References: Message-ID: <6fe10178-cb74-aa9f-7052-5c577e6e10bf@oracle.com> Hi David, The fix looks good. It is besides the platform-dependent code that Thomas flagged. There can be similar broken code on other platforms. For instance, there is a suspicious spot in cpu/ppc/frame_ppc.cpp: ??? // sender_fp must be within the stack and above (but not ??? // equal) current frame's fp. ??? if (sender_fp > thread->stack_base() || sender_fp <= fp) { ??????? return false; ??? 
}

Thanks,
Serguei


On 11/17/19 18:30, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>
> This was a very difficult bug to track down and I want to publicly
> acknowledge and thank the jemalloc folk (users and developers) for
> continuing to investigate this issue from their side. Without their
> persistence this issue would have languished.
>
> The thread stack_base() is the first address above the thread's stack.
> However, the "in stack" checks performed by Thread::on_local_stack and
> Thread::is_in_stack allowed the checked address to be equal to the
> stack_base() - which is not correct. Here's how this manifests as the
> bug:
>
> - Let a JavaThread instance, T2, be allocated at the end of thread
> T1's stack i.e. at T1->stack_base()
>   [This seems to be why this only reproduced with jemalloc.]
> - Let T2 lock an inflated monitor
> - Let T1 try to lock the same monitor
>   - T1 would consider the _owner field value (T2) as being in its
> stack and so consider the monitor stack-locked by T1
>   - And so both T1 and T2 would have ownership of the monitor allowing
> the monitor state (and application state) to be corrupted. This
> results in a range of hangs and crashes depending on the exact
> interleaving.
>
> Interestingly Thread::is_in_usable_stack does not have this bug.
>
> The bug can be tracked way back to JDK-6699669 as explained in the bug
> report. That issue also showed that the same bug existed in the SA
> implementations of these "on stack" checks.
>
> Testing:
>   - The reproducer from the bug report, using jemalloc, ran over 5000
> times without failing in any way.
>   - tiers 1-3 on all Oracle platforms
>   - serviceability/sa tests
>
> Thanks,
> David
> -----

From david.holmes at oracle.com Tue Nov 19 04:37:24 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 19 Nov 2019 14:37:24 +1000
Subject: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)
In-Reply-To: <6fe10178-cb74-aa9f-7052-5c577e6e10bf@oracle.com>
References: <6fe10178-cb74-aa9f-7052-5c577e6e10bf@oracle.com>
Message-ID:

Hi Serguei,

On 19/11/2019 2:34 pm, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
> The fix looks good.

Thanks for taking a look!

> It is besides the platform-dependent code that Thomas flagged.
>
> There can be similar broken code on other platforms.
> For instance, there is a suspicious spot in cpu/ppc/frame_ppc.cpp:
>
>     // sender_fp must be within the stack and above (but not
>     // equal) current frame's fp.
>     if (sender_fp > thread->stack_base() || sender_fp <= fp) {
>         return false;
>     }

I have filed:

https://bugs.openjdk.java.net/browse/JDK-8234372

"Investigate use of Thread::stack_base() and queries for "in stack""

to look at all uses of stack_base().

Thanks,
David

> Thanks,
> Serguei
>
>
> On 11/17/19 18:30, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
>> webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/
>>
>> This was a very difficult bug to track down and I want to publicly
>> acknowledge and thank the jemalloc folk (users and developers) for
>> continuing to investigate this issue from their side. Without their
>> persistence this issue would have languished.
>>
>> The thread stack_base() is the first address above the thread's stack.
>> However, the "in stack" checks performed by Thread::on_local_stack and
>> Thread::is_in_stack allowed the checked address to be equal to the
>> stack_base() - which is not correct. Here's how this manifests as the
>> bug:
>>
>> - Let a JavaThread instance, T2, be allocated at the end of thread
>> T1's stack i.e.
at T1->stack_base()
>>   [This seems to be why this only reproduced with jemalloc.]
>> - Let T2 lock an inflated monitor
>> - Let T1 try to lock the same monitor
>>   - T1 would consider the _owner field value (T2) as being in its
>> stack and so consider the monitor stack-locked by T1
>>   - And so both T1 and T2 would have ownership of the monitor allowing
>> the monitor state (and application state) to be corrupted. This
>> results in a range of hangs and crashes depending on the exact
>> interleaving.
>>
>> Interestingly Thread::is_in_usable_stack does not have this bug.
>>
>> The bug can be tracked way back to JDK-6699669 as explained in the bug
>> report. That issue also showed that the same bug existed in the SA
>> implementations of these "on stack" checks.
>>
>> Testing:
>>   - The reproducer from the bug report, using jemalloc, ran over 5000
>> times without failing in any way.
>>   - tiers 1-3 on all Oracle platforms
>>   - serviceability/sa tests
>>
>> Thanks,
>> David
>> -----
>

From markus.gronlund at oracle.com Tue Nov 19 14:38:26 2019
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Tue, 19 Nov 2019 06:38:26 -0800 (PST)
Subject: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing
Message-ID:

Greetings,

(apologies for the wide distribution)

Kindly asking for reviews for the following changeset:

Bug: https://bugs.openjdk.java.net/browse/JDK-8233197
Webrev: http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/

Testing: serviceability/jvmti, jdk_jfr, tier1-5

Summary: please see bug for description.

For Runtime / Serviceability folks:

This change slightly modifies the relative order in Threads::create_vm(); please see thread.cpp. There is an upcall as part of Jfr::on_vm_start() that delivers global JFR command-line options to Java (only if set). The behavioral change is that a few classes loaded while establishing this upcall (all internal JFR classes and/or java.base classes, loaded by the bootloader) are no longer visible to the ClassFileLoadHooks of agents. These classes are, however, still visible to agents that work with "early_start" JVMTI environments. The major part of JFR startup, with its associated class loading, still happens as part of Jfr::on_vm_live(), with no behavioral change in relation to agents.

Thank you
Markus

From serguei.spitsyn at oracle.com Tue Nov 19 23:36:52 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 19 Nov 2019 15:36:52 -0800
Subject: RFR(S): 8169467: GetLocalInstance returns JVMTI_ERROR_TYPE_MISMATCH (rather than JVMTI_ERROR_INVALID_SLOT) on static method
Message-ID: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>

Please review a fix for:
  https://bugs.openjdk.java.net/browse/JDK-8169467

Webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8169467-jvmti-local-instance.1/

Summary:
  The JVMTI GetLocalInstance function should return JVMTI_ERROR_INVALID_SLOT for static method frames.
  Instead, it returns the JVMTI_ERROR_TYPE_MISMATCH error.
  The fix adds the necessary checks to the implementation.

Testing in progress:
  - Locally on Linux-x64 with: vmTestbase_nsk_jvmti, vmTestbase_nsk_jdi, jdk_jdi
  - All mach5 hs-tier5

Thanks,
Serguei

From serguei.spitsyn at oracle.com Wed Nov 20 03:09:41 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 19 Nov 2019 19:09:41 -0800
Subject: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing
In-Reply-To:
References:
Message-ID: <1ca7ae34-41fe-fad1-4bd2-57cdf9667bd9@oracle.com>

An HTML attachment was scrubbed...
URL:

From markus.gronlund at oracle.com Wed Nov 20 20:54:05 2019
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Wed, 20 Nov 2019 12:54:05 -0800 (PST)
Subject: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing
In-Reply-To: <1ca7ae34-41fe-fad1-4bd2-57cdf9667bd9@oracle.com>
References: <1ca7ae34-41fe-fad1-4bd2-57cdf9667bd9@oracle.com>
Message-ID: <62407a3d-f6a2-400b-9311-9ab7e32d85f7@default>

Hi Serguei,

thanks for taking a look.

"It does not look like a good idea to change the JVMTI phase like above.
  If you need the ONLOAD phase just to enable capabilities then it is better to do it in the real ONLOAD phase.
  Do I miss anything important here?
  Please, ask questions if you have any problems with it."

Yes, so the reason for the phase transition is not so much to do with capabilities, but that an agent can only register, i.e. call GetEnv(), in phases JVMTI_PHASE_ONLOAD and JVMTI_PHASE_LIVE.

create_vm_init_agents() is where the (temporary) transition from JVMTI_PHASE_PRIMORDIAL to JVMTI_PHASE_ONLOAD happens during the callouts to Agent_OnLoad(), and then the state is returned to JVMTI_PHASE_PRIMORDIAL. It is hard to find an unconditional hook point there since create_vm_init_agents() is made conditional on Arguments::init_agents_at_startup(), with a listing populated from "real agents" (on command-line).

The JFR JVMTI agent itself is also conditional, installed only if JFR is actively started (i.e. when starting a recording). Hence, the phase transition mechanism merely replicates the state changes in create_vm_init_agents() to have the agent register properly. This is a moot point now, however, as I have taken another pass. I now found a way to only have the agent register during the JVMTI_PHASE_LIVE phase, so the phase transition mechanism is not needed.

"The Jfr::on_vm_init() is confusing as there is a mismatch with the JVMTI phases order.
  It feels like it means JFR init event (not VM init) or something like this.
  Or maybe it denotes the VM initialization start. :)
  I'll be happy if you could explain it a little bit."

Yes, this is confusing, I agree. Of course, JFR has a tight relation to the JVMTI phases, but only in so far as to coordinate agent registration. The JFR calls are not intended to reflect the JVMTI phases per se but a more general initialization order state description, like you say "VM initialization start and completion". However, it is very hard to encode proper semantics into the JFR calls in Threads::create_vm() to reflect the concepts of "stages"; they are simply not well-defined. In addition, there are so many of them :) For example, I always get confused that VM initialization is reflected in JVMTI by the VMStart event and the completion by the VMInit event (representing VM initialization complete). At the same time, the DTRACE macros HOTSPOT_VM_INIT_BEGIN() and HOTSPOT_VM_INIT_END() are both placed before both...

I abandoned the attempt to encode anything meaningful into the JFR calls trying to represent a certain "VM initialization stage". Instead, I will just have syntactic JFR calls reflecting some relative order (on_create_vm_1(), on_create_vm_2(), on_create_vm_3(), etc.). Looks like there are precedents of this style.

"Not sure, if your agent needs to enable these capabilities (introduced in JDK 9 with modules):
  can_generate_early_vmstart
  can_generate_early_class_hook_events"

Thanks for the suggestion Serguei, but these capabilities are not yet needed.

Here is the updated webrev: http://cr.openjdk.java.net/~mgronlun/8233197/webrev02/

Thanks again
Markus

From: Serguei Spitsyn
Sent: 20 November 2019 04:10
To: Markus Gronlund; hotspot-jfr-dev; hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
Subject: Re: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing

Hi Markus,

It looks good in general.
A couple of comments though.

http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp.frames.html

 258 class JvmtiPhaseTransition {
 259  private:
 260   bool _transition;
 261  public:
 262   JvmtiPhaseTransition() : _transition(JvmtiEnvBase::get_phase() == JVMTI_PHASE_PRIMORDIAL) {
 263     if (_transition) {
 264       JvmtiEnvBase::set_phase(JVMTI_PHASE_ONLOAD);
 265     }
 266   }
 267   ~JvmtiPhaseTransition() {
 268     if (_transition) {
 269       assert(JvmtiEnvBase::get_phase() == JVMTI_PHASE_ONLOAD, "invariant");
 270       JvmtiEnvBase::set_phase(JVMTI_PHASE_PRIMORDIAL);
 271     }
 272   }
 273 };
 274
 275 static bool initialize() {
 276   JavaThread* const jt = current_java_thread();
 277   assert(jt != NULL, "invariant");
 278   assert(jt->thread_state() == _thread_in_vm, "invariant");
 279   DEBUG_ONLY(JfrJavaSupport::check_java_thread_in_vm(jt));
 280   JvmtiPhaseTransition jvmti_phase_transition;
 281   ThreadToNativeFromVM transition(jt);
 282   if (create_jvmti_env(jt) != JNI_OK) {
 283     assert(jfr_jvmti_env == NULL, "invariant");
 284     return false;
 285   }
 286   assert(jfr_jvmti_env != NULL, "invariant");
 287   if (!register_capabilities(jt)) {
 288     return false;
 289   }
 290   if (!register_callbacks(jt)) {
 291     return false;
 292   }
 293   return update_class_file_load_hook_event(JVMTI_ENABLE);
 294 }

It does not look like a good idea to change the JVMTI phase like above.
If you need the ONLOAD phase just to enable capabilities then it is better to do it in the real ONLOAD phase.
Do I miss anything important here?
Please, ask questions if you have any problems with it.

The Jfr::on_vm_init() is confusing as there is a mismatch with the JVMTI phases order.
It feels like it means JFR init event (not VM init) or something like this.
Or maybe it denotes the VM initialization start. :)
I'll be happy if you could explain it a little bit.

Not sure, if your agent needs to enable these capabilities (introduced in JDK 9 with modules):
  can_generate_early_vmstart
  can_generate_early_class_hook_events

Thanks,
Serguei

On 11/19/19 06:38, Markus Gronlund wrote:

Greetings,

(apologies for the wide distribution)

Kindly asking for reviews for the following changeset:

Bug: https://bugs.openjdk.java.net/browse/JDK-8233197
Webrev: http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/
Testing: serviceability/jvmti, jdk_jfr, tier1-5
Summary: please see bug for description.

For Runtime / Serviceability folks:

This change slightly modifies the relative order in Threads::create_vm(); please see threads.cpp. There is an upcall as part of Jfr::on_vm_start() that delivers global JFR command-line options to Java (only if set). The behavioral change amounts to a few classes loaded as part of establishing this upcall (all internal JFR classes and/or java.base classes, loaded by the bootloader) no longer being visible to the ClassFileLoadHooks of agents. These classes are, however, visible to agents that work with "early_start" JVMTI environments.

The major part of JFR startup with associated class loading still happens as part of Jfr::on_vm_live() with no behavioral change in relation to agents.

Thank you
Markus

From alexey.menkov at oracle.com Wed Nov 20 23:32:10 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 20 Nov 2019 15:32:10 -0800
Subject: RFR(S): 8169467: GetLocalInstance returns JVMTI_ERROR_TYPE_MISMATCH (rather than JVMTI_ERROR_INVALID_SLOT) on static method
In-Reply-To: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>
References: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>
Message-ID:

Looks good.

--alex

On 11/19/2019 15:36, serguei.spitsyn at oracle.com wrote:
> Please, review a fix for:
>   https://bugs.openjdk.java.net/browse/JDK-8169467
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8169467-jvmti-local-instance.1/
>
>
> Summary:
>   The JVMTI GetLocalInstance function should return
> JVMTI_ERROR_INVALID_SLOT for static method frames.
>   Instead, it returns the JVMTI_ERROR_TYPE_MISMATCH error.
>   The fix adds necessary checks into the implementation.
>
> Testing in progress:
>   - Locally on Linux-x64 with: vmTestbase_nsk_jvmti,
> vmTestbase_nsk_jdi, jdk_jdi
>   - All mach5 hs-tier5
>
> Thanks,
> Serguei

From serguei.spitsyn at oracle.com Thu Nov 21 00:20:46 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 20 Nov 2019 16:20:46 -0800
Subject: RFR(S): 8169467: GetLocalInstance returns JVMTI_ERROR_TYPE_MISMATCH (rather than JVMTI_ERROR_INVALID_SLOT) on static method
In-Reply-To:
References: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>
Message-ID: <56bb4af5-9a65-3b26-9d41-6399060ae8ef@oracle.com>

Thank you, Alex!
Serguei

On 11/20/19 3:32 PM, Alex Menkov wrote:
> Looks good.
>
> --alex
>
> On 11/19/2019 15:36, serguei.spitsyn at oracle.com wrote:
>> Please, review a fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-8169467
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8169467-jvmti-local-instance.1/
>>
>>
>> Summary:
>>    The JVMTI GetLocalInstance function should return
>> JVMTI_ERROR_INVALID_SLOT for static method frames.
>>    Instead, it returns the JVMTI_ERROR_TYPE_MISMATCH error.
>>    The fix adds necessary checks into the implementation.
>>
>> Testing in progress:
>>    - Locally on Linux-x64 with: vmTestbase_nsk_jvmti,
>> vmTestbase_nsk_jdi, jdk_jdi
>>    - All mach5 hs-tier5
>>
>> Thanks,
>> Serguei

From serguei.spitsyn at oracle.com Thu Nov 21 00:52:41 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 20 Nov 2019 16:52:41 -0800
Subject: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing
In-Reply-To: <62407a3d-f6a2-400b-9311-9ab7e32d85f7@default>
References: <1ca7ae34-41fe-fad1-4bd2-57cdf9667bd9@oracle.com> <62407a3d-f6a2-400b-9311-9ab7e32d85f7@default>
Message-ID: <2ae4f9d7-7415-8c99-874c-97b6612ac272@oracle.com>

Hi Markus,

Thank you for the answers!
The update looks good to me.
A couple of minor comments.

http://cr.openjdk.java.net/~mgronlun/8233197/webrev02/src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp.frames.html

  57 static bool set_event_notification_mode(jvmtiEventMode mode,
  58 jvmtiEvent event,
  59 jthread event_thread,
  60 ...) {

  You may want to align arguments.

 126 size_t length = sizeof base_error_msg ; // includes terminating null

  Unneeded space before ';'.
  Would it be better to use this form: sizeof(base_error_msg)?

No need for another webrev.

Thanks,
Serguei

On 11/20/19 12:54 PM, Markus Gronlund wrote:
>
> Hi Serguei,
>
> thanks for taking a look.
>
> "It does not look like a good idea to change the JVMTI phase like above.
>
>   If you need the ONLOAD phase just to enable capabilities then it is
> better to do it in the real ONLOAD phase.
>
>   Do I miss anything important here?
>
>   Please, ask questions if you have any problems with it."
>
> Yes, so the reason for the phase transition is not so much to do with
> capabilities, but that an agent can only register, i.e. call GetEnv(),
> in phases JVMTI_PHASE_ONLOAD and JVMTI_PHASE_LIVE.
>
> create_vm_init_agents() is where the (temporary) transition from
> JVMTI_PHASE_PRIMORDIAL to JVMTI_PHASE_ONLOAD happens during the
> callouts to Agent_OnLoad(), and then the state is returned to
> JVMTI_PHASE_PRIMORDIAL.
> It is hard to find an unconditional hook point
> there since create_vm_init_agents() is made conditional on
> Arguments::init_agents_at_startup(), with a listing populated from
> "real agents" (on command-line).
>
> The JFR JVMTI agent itself is also conditional, installed only if JFR
> is actively started (i.e. when starting a recording). Hence, the phase
> transition mechanism merely replicates the state changes in
> create_vm_init_agents() to have the agent register properly. This is a
> moot point now, however, as I have taken another pass. I now found a way
> to only have the agent register during the JVMTI_PHASE_LIVE phase, so
> the phase transition mechanism is not needed.
>
> "The Jfr::on_vm_init() is confusing as there is a mismatch with the
> JVMTI phases order.
>
>   It feels like it means JFR init event (not VM init) or something
> like this.
>
>   Or maybe it denotes the VM initialization start. :)
>
>   I'll be happy if you could explain it a little bit."
>
> Yes, this is confusing, I agree. Of course, JFR has a tight relation
> to the JVMTI phases, but only in so far as to coordinate agent
> registration. The JFR calls are not intended to reflect the JVMTI
> phases per se but a more general initialization order state
> description, like you say "VM initialization start and completion".
> However, it is very hard to encode proper semantics into the JFR calls
> in Threads::create_vm() to reflect the concepts of "stages"; they are
> simply not well-defined. In addition, there are so many of them :) For
> example, I always get confused that VM initialization is reflected in
> JVMTI by the VMStart event and the completion by the VMInit event
> (representing VM initialization complete). At the same time, the
> DTRACE macros HOTSPOT_VM_INIT_BEGIN() and HOTSPOT_VM_INIT_END() are
> both placed before both...
>
> I abandoned the attempt to encode anything meaningful into the JFR
> calls trying to represent a certain "VM initialization stage".
>
> Instead, I will just have syntactic JFR calls reflecting some relative
> order (on_create_vm_1(), on_create_vm_2(), on_create_vm_3(), etc.). Looks like
> there are precedents of this style.
>
> "Not sure, if your agent needs to enable these capabilities
> (introduced in JDK 9 with modules):
>   can_generate_early_vmstart
>   can_generate_early_class_hook_events"
>
> Thanks for the suggestion Serguei, but these capabilities are not yet
> needed.
>
> Here is the updated webrev:
> http://cr.openjdk.java.net/~mgronlun/8233197/webrev02/
>
> Thanks again
>
> Markus
>
> *From:* Serguei Spitsyn
> *Sent:* 20 November 2019 04:10
> *To:* Markus Gronlund; hotspot-jfr-dev;
> hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
> *Subject:* Re: 8233197(S): Invert JvmtiExport::post_vm_initialized()
> and Jfr:on_vm_start() start-up order for correct option parsing
>
> Hi Markus,
>
> It looks good in general.
>
> A couple of comments though.
>
> http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp.frames.html
>
> 258 class JvmtiPhaseTransition {
> 259  private:
> 260   bool _transition;
> 261  public:
> 262   JvmtiPhaseTransition() : _transition(JvmtiEnvBase::get_phase()
> == JVMTI_PHASE_PRIMORDIAL) {
> 263     if (_transition) {
> 264       JvmtiEnvBase::set_phase(JVMTI_PHASE_ONLOAD);
> 265     }
> 266   }
> 267   ~JvmtiPhaseTransition() {
> 268     if (_transition) {
> 269       assert(JvmtiEnvBase::get_phase() == JVMTI_PHASE_ONLOAD,
> "invariant");
> 270       JvmtiEnvBase::set_phase(JVMTI_PHASE_PRIMORDIAL);
> 271     }
> 272   }
> 273 };
> 274
> 275 static bool initialize() {
> 276   JavaThread* const jt = current_java_thread();
> 277   assert(jt != NULL, "invariant");
> 278   assert(jt->thread_state() == _thread_in_vm, "invariant");
> 279   DEBUG_ONLY(JfrJavaSupport::check_java_thread_in_vm(jt));
> *280   JvmtiPhaseTransition jvmti_phase_transition;*
> 281   ThreadToNativeFromVM transition(jt);
> 282   if (create_jvmti_env(jt) != JNI_OK) {
> 283     assert(jfr_jvmti_env == NULL, "invariant");
> 284     return false;
> 285   }
> 286   assert(jfr_jvmti_env != NULL, "invariant");
> 287   if (!register_capabilities(jt)) {
> 288     return false;
> 289   }
> 290   if (!register_callbacks(jt)) {
> 291     return false;
> 292   }
> 293   return update_class_file_load_hook_event(JVMTI_ENABLE);
> 294 }
>
>
> It does not look like a good idea to change the JVMTI phase like above.
> If you need the ONLOAD phase just to enable capabilities then it is
> better to do it in the real ONLOAD phase.
> Do I miss anything important here?
> Please, ask questions if you have any problems with it.
>
> The Jfr::on_vm_init() is confusing as there is a mismatch with the
> JVMTI phases order.
> It feels like it means JFR init event (not VM init) or something like
> this.
> Or maybe it denotes the VM initialization start. :)
> I'll be happy if you could explain it a little bit.
>
> Not sure, if your agent needs to enable these capabilities (introduced
> in JDK 9 with modules):
>   can_generate_early_vmstart
>   can_generate_early_class_hook_events
>
> Thanks,
> Serguei
>
>
> On 11/19/19 06:38, Markus Gronlund wrote:
>
> Greetings,
>
> (apologies for the wide distribution)
>
> Kindly asking for reviews for the following changeset:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233197
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/
>
> Testing: serviceability/jvmti, jdk_jfr, tier1-5
>
> Summary: please see bug for description.
>
> For Runtime / Serviceability folks:
>
> This change slightly modifies the relative order in Threads::create_vm(); please see threads.cpp.
>
> There is an upcall as part of Jfr::on_vm_start() that delivers global JFR command-line options to Java (only if set).
>
> The behavioral change amounts to a few classes loaded as part of establishing this upcall (all internal JFR classes and/or java.base classes, loaded by the bootloader) no longer being visible to the ClassFileLoadHooks of agents. These classes are, however, visible to agents that work with "early_start" JVMTI environments.
>
> The major part of JFR startup with associated class loading still happens as part of Jfr::on_vm_live() with no behavioral change in relation to agents.
>
> Thank you
>
> Markus
>

From chris.plummer at oracle.com Thu Nov 21 02:22:57 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 20 Nov 2019 18:22:57 -0800
Subject: RFR(S): 8169467: GetLocalInstance returns JVMTI_ERROR_TYPE_MISMATCH (rather than JVMTI_ERROR_INVALID_SLOT) on static method
In-Reply-To:
References: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>
Message-ID:

+1

On 11/20/19 3:32 PM, Alex Menkov wrote:
> Looks good.
>
> --alex
>
> On 11/19/2019 15:36, serguei.spitsyn at oracle.com wrote:
>> Please, review a fix for:
>>    https://bugs.openjdk.java.net/browse/JDK-8169467
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8169467-jvmti-local-instance.1/
>>
>>
>> Summary:
>>    The JVMTI GetLocalInstance function should return
>> JVMTI_ERROR_INVALID_SLOT for static method frames.
>>    Instead, it returns the JVMTI_ERROR_TYPE_MISMATCH error.
>>    The fix adds necessary checks into the implementation.
>>
>> Testing in progress:
>>    - Locally on Linux-x64 with: vmTestbase_nsk_jvmti,
>> vmTestbase_nsk_jdi, jdk_jdi
>>    - All mach5 hs-tier5
>>
>> Thanks,
>> Serguei

From coleen.phillimore at oracle.com Thu Nov 21 14:12:00 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 21 Nov 2019 09:12:00 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
Message-ID: <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>

Please review a new version of this change that keeps the nmethod from being unloaded, after it is added to the deferred event queue:

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html

Ran the test that failed 100 times without failure, tier1 on Oracle supported platforms, and tier2-3 including jvmti and jdi tests locally.

See bug for more details about the crash.

https://bugs.openjdk.java.net/browse/JDK-8173361

Thanks,
Coleen

On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>
> Hi Serguei,
>
> Sorry for not sending an update. I talked to Erik and am working on a
> version that keeps the nmethod from being unloaded while it's in the
> deferred event queue, with a version that the GC people will like, and
> I like. I'm testing it out now.
>
> Thanks!
> Coleen
>
>
> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> Sorry for the latency, I had to investigate it a little bit.
>> I still have some doubt your fix is right thing to do.
>>
>>
>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Hi, I've been working on answers to these questions, so I'll start
>>>>> with this one.
>>>>>
>>>>> The nmethodLocker keeps the nmethod from being reclaimed
>>>>> (made_zombie or memory released) by the sweeper, but the nmethod
>>>>> could be unloaded. Unloading the nmethod clears the Method*
>>>>> _method field.
>>>>
>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>
>>>>> The post_compiled_method_load event needs the _method field to
>>>>> look at things like inlining and ScopeDesc fields. If the
>>>>> nmethod is unloaded, some of the oops are dead. There are
>>>>> "holder" oops that correspond to the metadata in the nmethod. If
>>>>> these oops are dead, causing the nmethod to get unloaded, then the
>>>>> metadata may not be valid.
>>>>>
>>>>> So my change 02 looks for a NULL nmethod._method field to tell
>>>>> whether we can post information about the nmethod.
>>>>>
>>>>> There's code in nmethod.cpp like:
>>>>>
>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>   if (_jmethod_id == NULL) {
>>>>>     // Cache the jmethod_id since it can no longer be looked up once the
>>>>>     // method itself has been marked for unloading.
>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>   }
>>>>>   return _jmethod_id;
>>>>> }
>>>>>
>>>>> Which was added when post_method_load and unload were turned into
>>>>> deferred events.
>>>>
>>>> Could we cache the jmethodID in the
>>>> JvmtiDeferredEvent::compiled_method_load_event
>>>> similarly as we do in the
>>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>>> This would help to get rid of the dependency on the nmethod::_method.
>>>> Do we depend on any other nmethod fields?
>>>
>>> Yes, there are other nmethod metadata that we rely on to print
>>> inline information, and this function
>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses
>>> the ScopeDesc data in the nmethod.
>>
>> One possible approach is to prepare and cache all this information
>> in the nmethod::post_compiled_method_load_event() before the
>> JvmtiDeferredEvent::compiled_method_load_event() is called.
>> The event parameters are:
>> typedef struct {
>>     const void* start_address;
>>     jlocation location;
>> } jvmtiAddrLocationMap;
>> CompiledMethodLoad(jvmtiEnv *jvmti_env,
>>             jmethodID method,
>>             jint code_size,
>>             const void* code_addr,
>>             jint map_length,
>>             const jvmtiAddrLocationMap* map,
>>             const void* compile_info)
>> Some of these addresses above could be not accessible when an event
>> is posted.
>> Not sure yet if it is Okay.
>> The question is if this kind of refactoring is worth it and the right
>> thing to do.
>>
>>>
>>> We do cache the jmethodID but that's not good enough. See my last
>>> comment in the bug report. The jmethodID can point to an unloaded
>>> method.
>>
>> This looks like it is done a little bit late.
>> It'd be better to do it before the event is deferred (see above).
>>
>>> I tried a version of keeping the nmethod alive, but the GC folks
>>> will hate it. And it doesn't work and I hate it.
>>
>> From serviceability point of view this is the best and most
>> consistent approach.
>> It seems to me, it was initially designed this way.
>> The downside is it adds some extra complexity to the GC.
>>
>>> My version 01 is the best, with the caveat that maybe it should
>>> check for _method == NULL instead of nmethod->is_alive(). I have to
>>> talk to Erik to see if there's a race with concurrent class unloading.
>>>
>>> Any application that depends on a compiled method loading event on a
>>> class that could be unloaded is a buggy application. Applications
>>> should not rely on when the JIT compiler decides to compile a
>>> method!
>>> This happens to us for a stress test. Most applications
>>> will get most of their compiled method loading events as they
>>> normally do.
>>
>> It is not an application that relies on the compiled method loading
>> event.
>> It is about profiling tools to be able to get correct information
>> about what is going on with compilations.
>> My concern is that if we skip such compiled method load events then
>> profilers have no way
>> to find out that there are many unneeded compilations that are thrown away
>> without any real use.
>> Also, it is not clear what happens with the subsequent compiled
>> method unload events.
>> Are they going to be skipped as well or they can appear and confuse
>> profilers?
>>
>>
>> Thanks,
>> Serguei
>>>
>>> Thanks,
>>> Coleen
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>> I put more debugging in the bug to show this crash was from an
>>>>> unloaded nmethod.
>>>>>
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> I have some questions.
>>>>>>
>>>>>> Both the compiler method load and unload are posted as deferred
>>>>>> events.
>>>>>> Both events keep the nmethod alive until the ServiceThread
>>>>>> processes the event.
>>>>>>
>>>>>> The implementation is:
>>>>>>
>>>>>> JvmtiDeferredEvent
>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>   . . .
>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>   // this deferred event.
>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>   return event;
>>>>>> }
>>>>>>
>>>>>> JvmtiDeferredEvent
>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
>>>>>> jmethodID id, const void* code) {
>>>>>>   . . .
>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>   // generated code from being reused too early. We pass
>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>   // made into a zombie can be locked.
>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>   return event;
>>>>>> }
>>>>>>
>>>>>> void JvmtiDeferredEvent::post() {
>>>>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>          "Service thread must post enqueued events");
>>>>>>   switch(_type) {
>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>       break;
>>>>>>     }
>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>       break;
>>>>>>     }
>>>>>>     . . .
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Then I wonder how is it possible for the nmethod to be not alive
>>>>>> here?:
>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>> . . .
>>>>>> 2173   // It's not safe to look at metadata for unloaded methods.
>>>>>> 2174   if (!nm->is_alive()) {
>>>>>> 2175     return;
>>>>>> 2176   }
>>>>>> At least, it looks like something else is broken.
>>>>>> Do I miss something important here?
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: Don't post information which uses metadata from
>>>>>>> unloaded nmethods
>>>>>>>
>>>>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>>>>> failure without the fix).
>>>>>>>
>>>>>>> open webrev at
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>
>>>>>
>>>>
>>>
>>
>

From erik.osterlund at oracle.com Thu Nov 21 15:32:24 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Thu, 21 Nov 2019 16:32:24 +0100
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com> <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
Message-ID: <339020c0-eb23-1d9f-60d3-26faace925b3@oracle.com>

Hi Coleen,

Thanks for removing the nmethodLocker. I'm on a mission to remove all nmethod lockers, and this one is really nasty.

Looks good.

Thanks,
/Erik

On 11/21/19 3:12 PM, coleen.phillimore at oracle.com wrote:
>
> Please review a new version of this change that keeps the nmethod from
> being unloaded, after it is added to the deferred event queue:
>
> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>
> Ran the test that failed 100 times without failure, tier1 on Oracle
> supported platforms, and tier2-3 including jvmti and jdi tests locally.
>
> See bug for more details about the crash.
>
> https://bugs.openjdk.java.net/browse/JDK-8173361
>
> Thanks,
> Coleen
>
> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>>
>> Hi Serguei,
>>
>> Sorry for not sending an update.
>> I talked to Erik and am working on
>> a version that keeps the nmethod from being unloaded while it's in
>> the deferred event queue, with a version that the GC people will
>> like, and I like. I'm testing it out now.
>>
>> Thanks!
>> Coleen
>>
>>
>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> Sorry for the latency, I had to investigate it a little bit.
>>> I still have some doubt your fix is right thing to do.
>>>
>>>
>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Hi, I've been working on answers to these questions, so I'll
>>>>>> start with this one.
>>>>>>
>>>>>> The nmethodLocker keeps the nmethod from being reclaimed
>>>>>> (made_zombie or memory released) by the sweeper, but the nmethod
>>>>>> could be unloaded. Unloading the nmethod clears the Method*
>>>>>> _method field.
>>>>>
>>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>>
>>>>>> The post_compiled_method_load event needs the _method field to
>>>>>> look at things like inlining and ScopeDesc fields. If the
>>>>>> nmethod is unloaded, some of the oops are dead. There are
>>>>>> "holder" oops that correspond to the metadata in the nmethod. If
>>>>>> these oops are dead, causing the nmethod to get unloaded, then
>>>>>> the metadata may not be valid.
>>>>>>
>>>>>> So my change 02 looks for a NULL nmethod._method field to tell
>>>>>> whether we can post information about the nmethod.
>>>>>>
>>>>>> There's code in nmethod.cpp like:
>>>>>>
>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>>   if (_jmethod_id == NULL) {
>>>>>>     // Cache the jmethod_id since it can no longer be looked up once the
>>>>>>     // method itself has been marked for unloading.
>>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>>   }
return _jmethod_id; >>>>>> } >>>>>> >>>>>> Which was added when post_method_load and unload were turned into >>>>>> deferred events. >>>>> >>>>> Could we cache the jmethodID in the >>>>> JvmtiDeferredEvent::compiled_method_load_event >>>>> similarly as we do in the >>>>> JvmtiDeferredEvent::compiled_method_unload_event? >>>>> This would help to get rid of the dependency on the nmethod::_method. >>>>> Do we depend on any other nmethod fields? >>>> >>>> Yes, there are other nmethod metadata that we rely on to print >>>> inline information, and this function >>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses >>>> the ScopeDesc data in the nmethod. >>> >>> One possible approach is to prepare and cache all this information >>> in the nmethod::post_compiled_method_load_event() before the >>> JvmtiDeferredEvent::compiled_method_load_event() is called. >>> The event parameters are: >>> typedef struct { >>> const void* start_address; >>> jlocation location; >>> } jvmtiAddrLocationMap; >>> CompiledMethodLoad(jvmtiEnv *jvmti_env, >>> jmethodID method, >>> jint code_size, >>> const void* code_addr, >>> jint map_length, >>> const jvmtiAddrLocationMap* map, >>> const void* compile_info) >>> Some of these addresses above could be not accessible when an event >>> is posted. >>> Not sure yet if it is Okay. >>> The question is if this kind of refactoring is worth and right thing >>> to do. >>> >>>> >>>> We do cache the jmethodID but that's not good enough.? See my last >>>> comment in the bug report.? The jmethodID can point to an unloaded >>>> method. >>> >>> This looks like it is done a little bit late. >>> It'd better to do it before the event is deferred (see above). >>> >>>> I tried a version of keeping the nmethod alive, but the GC folks >>>> will hate it.? And it doesn't work and I hate it. >>> >>> From serviceability point of view this is the best and most >>> consistent approach. >>> I seems to me, it was initially designed this way. 
>>> The downside is it adds some extra complexity to the GC.
>>>
>>>> My version 01 is the best, with the caveat that maybe it should
>>>> check for _method == NULL instead of nmethod->is_alive(). I have
>>>> to talk to Erik to see if there's a race with concurrent class
>>>> unloading.
>>>>
>>>> Any application that depends on a compiled method loading event on
>>>> a class that could be unloaded is a buggy application.
>>>> Applications should not rely on when the JIT compiler decides to
>>>> compile a method! This happens to us for a stress test. Most
>>>> applications will get most of their compiled method loading events
>>>> as they normally do.
>>>
>>> It is not an application that relies on the compiled method loading
>>> event.
>>> It is about profiling tools being able to get correct information
>>> about what is going on with compilations.
>>> My concern is that if we skip such compiled method load events then
>>> profilers have no way
>>> to find out that there are many unneeded compilations that are
>>> thrown away without any real use.
>>> Also, it is not clear what happens with the subsequent compiled
>>> method unload events.
>>> Are they going to be skipped as well, or can they appear and confuse
>>> profilers?
>>>
>>>
>>> Thanks,
>>> Serguei
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> I put more debugging in the bug to show this crash was from an
>>>>>> unloaded nmethod.
>>>>>>
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> I have some questions.
>>>>>>>
>>>>>>> Both the compiled method load and unload are posted as deferred
>>>>>>> events.
>>>>>>> Both events keep the nmethod alive until the ServiceThread
>>>>>>> processes the event.
>>>>>>>
>>>>>>> The implementation is:
>>>>>>>
>>>>>>> JvmtiDeferredEvent
>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>>   . . .
>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>   // this deferred event.
>>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>>   return event;
>>>>>>> }
>>>>>>>
>>>>>>> JvmtiDeferredEvent
>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
>>>>>>>     jmethodID id, const void* code) {
>>>>>>>   . . .
>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>>   // generated code from being reused too early. We pass
>>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>>   // made into a zombie can be locked.
>>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>>   return event;
>>>>>>> }
>>>>>>>
>>>>>>> void JvmtiDeferredEvent::post() {
>>>>>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>>          "Service thread must post enqueued events");
>>>>>>>   switch(_type) {
>>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>       break;
>>>>>>>     }
>>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>       break;
>>>>>>>     }
>>>>>>>     . . .
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> Then I wonder how it is possible for the nmethod to be not alive
>>>>>>> here:
>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>>>   . . .
>>>>>>> 2173   // It's not safe to look at metadata for unloaded methods.
>>>>>>> 2174   if (!nm->is_alive()) {
>>>>>>> 2175     return;
>>>>>>> 2176   }
>>>>>>> At least, it looks like something else is broken.
>>>>>>> Am I missing something important here?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Don't post information which uses metadata from
>>>>>>>> unloaded nmethods
>>>>>>>>
>>>>>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>>>>>> failure without the fix).
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com Thu Nov 21 18:22:53 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 21 Nov 2019 13:22:53 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <339020c0-eb23-1d9f-60d3-26faace925b3@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <339020c0-eb23-1d9f-60d3-26faace925b3@oracle.com>
Message-ID: <4aeaf9f8-b9f8-0f1f-97bd-349fea17bc3d@oracle.com>

On 11/21/19 10:32 AM, erik.osterlund at oracle.com wrote:
> Hi Coleen,
>
> Thanks for removing the nmethodLocker. I'm on a mission to remove all
> nmethod lockers, and this one is really nasty.

Thanks Erik, and thank you for the help. I'm trying to help you get rid
of nmethodLockers but there's another nasty one.

Coleen

> Looks good.
>
> Thanks,
> /Erik

From serguei.spitsyn at oracle.com Fri Nov 22 01:33:08 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 21 Nov 2019 17:33:08 -0800
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
Message-ID: <65499c3c-de5e-6913-22b0-451730d3e7dd@oracle.com>

Hi Coleen,

Looks good in general.

Nice approach, thank you for working on this!
It is great to get rid of the nmethodLocker's in JvmtiDeferredEvent class.

I have some questions/comments.

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/serviceThread.cpp.frames.html
49 JvmtiDeferredEvent* ServiceThread::_jvmti_event = NULL;

 The ServiceThread processes only one JVMTI event at each iteration.
 So, having this static field is Okay.

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/mutexLocker.cpp.udiff.html
- def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_always); // used for creating jmethodIDs.
+ def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.

  This needs a good testing coverage to be safe (everything
  that is transitively using InstanceKlass::get_jmethod_id).
  It would be easier to just run all the JVMTI, JDI and j.l.instrument tests.
  The hs-tier5 should have most of them.

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html
-void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
+void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {

  What was the motivation for this change?
  Why the copy semantics is needed here?

Thanks,
Serguei

From coleen.phillimore at oracle.com Fri Nov 22 05:00:44 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 22 Nov 2019 00:00:44 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <65499c3c-de5e-6913-22b0-451730d3e7dd@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <65499c3c-de5e-6913-22b0-451730d3e7dd@oracle.com>
Message-ID:

On 11/21/19 8:33 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Looks good in general.

Serguei, Thank you for reviewing this.

>
> Nice approach, thank you for working on this!
> It is great to get rid of the nmethodLocker's in JvmtiDeferredEvent class.
>
> I have some questions/comments.
>
> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/serviceThread.cpp.frames.html
> 49 JvmtiDeferredEvent* ServiceThread::_jvmti_event = NULL;
>  The ServiceThread processes only one JVMTI event at each iteration.
>  So, having this static field is Okay.

Yes, the ServiceThread has locked these fields or is running so doesn't
modify the current jvmti_event until it's done.

>
> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/mutexLocker.cpp.udiff.html
> - def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_always); // used for creating jmethodIDs.
> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.
>   This needs a good testing coverage to be safe (everything
>   that is transitively using InstanceKlass::get_jmethod_id).
>   It would be easier to just run all the JVMTI, JDI and j.l.instrument tests.
>   The hs-tier5 should have most of them.

I ran these tests locally and mach5 1-3, but I'll run through hs-tier5
tonight.

>
> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html
> -void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
> +void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
>   What was the motivation for this change?
>   Why the copy semantics is needed here?

The event is copied to the QueueNode _event field when it is enqueued.
I took out too many consts and meant to put them back in. The only one
that couldn't be const is:

    JvmtiDeferredEvent& event() { return _event; }

because we call oops_do and nmethods_do on it.

I'm rerunning tests and will send out another patch if it differs too
much from the original.

Thanks for reviewing.
Coleen

>
> Thanks,
> Serguei
>
>
> On 11/21/19 6:12 AM, coleen.phillimore at oracle.com wrote:
>>
>> Please review a new version of this change that keeps the nmethod
>> from being unloaded, after it is added to the deferred event queue:
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>>
>> Ran the test that failed 100 times without failure, tier1 on Oracle
>> supported platforms, and tier2-3 including jvmti and jdi tests locally.
>>
>> See bug for more details about the crash.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8173361
>>
>> Thanks,
>> Coleen
>>
>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Hi Serguei,
>>>
>>> Sorry for not sending an update.
I talked to Erik and am working on >>> a version that keeps the nmethod from being unloaded while it's in >>> the deferred event queue, with a version that the GC people will >>> like, and I like.? I'm testing it out now. >>> >>> Thanks! >>> Coleen >>> >>> >>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Coleen, >>>> >>>> Sorry for the latency, I had to investigate it a little bit. >>>> I still have some doubt your fix is right thing to do. >>>> >>>> >>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Hi, I've been working on answers to these questions, so I'll >>>>>>> start with this one. >>>>>>> >>>>>>> The nmethodLocker keeps the nmethod from being reclaimed >>>>>>> (made_zombie or memory released) by the sweeper, but the nmethod >>>>>>> could be unloaded.? Unloading the nmethod clears the Method* >>>>>>> _method field. >>>>>> >>>>>> Yes, I see it is done in the nmethod::make_unloaded(). >>>>>> >>>>>>> The post_compiled_method_load event needs the _method field to >>>>>>> look at things like inlining and ScopeDesc fields.?? If the >>>>>>> nmethod is unloaded, some of the oops are dead.? There are >>>>>>> "holder" oops that correspond to the metadata in the nmethod.? >>>>>>> If these oops are dead, causing the nmethod to get unloaded, >>>>>>> then the metadata may not be valid. >>>>>>> >>>>>>> So my change 02 looks for a NULL nmethod._method field to tell >>>>>>> whether we can post information about the nmethod. >>>>>>> >>>>>>> There's code in nmethod.cpp like: >>>>>>> >>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() { >>>>>>> ? if (_jmethod_id == NULL) { >>>>>>> ??? // Cache the jmethod_id since it can no longer be looked up >>>>>>> once the >>>>>>> ??? // method itself has been marked for unloading. >>>>>>> ??? 
_jmethod_id = method()->jmethod_id(); >>>>>>> ? } >>>>>>> ? return _jmethod_id; >>>>>>> } >>>>>>> >>>>>>> Which was added when post_method_load and unload were turned >>>>>>> into deferred events. >>>>>> >>>>>> Could we cache the jmethodID in the >>>>>> JvmtiDeferredEvent::compiled_method_load_event >>>>>> similarly as we do in the >>>>>> JvmtiDeferredEvent::compiled_method_unload_event? >>>>>> This would help to get rid of the dependency on the nmethod::_method. >>>>>> Do we depend on any other nmethod fields? >>>>> >>>>> Yes, there are other nmethod metadata that we rely on to print >>>>> inline information, and this function >>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses >>>>> the ScopeDesc data in the nmethod. >>>> >>>> One possible approach is to prepare and cache all this information >>>> in the nmethod::post_compiled_method_load_event() before the >>>> JvmtiDeferredEvent::compiled_method_load_event() is called. >>>> The event parameters are: >>>> typedef struct { >>>> const void* start_address; >>>> jlocation location; >>>> } jvmtiAddrLocationMap; >>>> CompiledMethodLoad(jvmtiEnv *jvmti_env, >>>> jmethodID method, >>>> jint code_size, >>>> const void* code_addr, >>>> jint map_length, >>>> const jvmtiAddrLocationMap* map, >>>> const void* compile_info) >>>> Some of these addresses above could be not accessible when an event >>>> is posted. >>>> Not sure yet if it is Okay. >>>> The question is if this kind of refactoring is worth and right >>>> thing to do. >>>> >>>>> >>>>> We do cache the jmethodID but that's not good enough.? See my last >>>>> comment in the bug report.? The jmethodID can point to an unloaded >>>>> method. >>>> >>>> This looks like it is done a little bit late. >>>> It'd better to do it before the event is deferred (see above). >>>> >>>>> I tried a version of keeping the nmethod alive, but the GC folks >>>>> will hate it.? And it doesn't work and I hate it. 
>>>> >>>> From serviceability point of view this is the best and most >>>> consistent approach. >>>> I seems to me, it was initially designed this way. >>>> The downside is it adds some extra complexity to the GC. >>>> >>>>> My version 01 is the best, with the caveat that maybe it should >>>>> check for _method == NULL instead of nmethod->is_alive().? I have >>>>> to talk to Erik to see if there's a race with concurrent class >>>>> unloading. >>>>> >>>>> Any application that depends on a compiled method loading event on >>>>> a class that could be unloaded is a buggy application.? >>>>> Applications should not rely on when the JIT compiler decides to >>>>> compile a method!? This happens to us for a stress test.? Most >>>>> applications will get most of their compiled method loading events >>>>> as they normally do. >>>> >>>> It is not an application that relies on the compiled method loading >>>> event. >>>> It is about profiling tools to be able to get correct information >>>> about what is going on with compilations. >>>> My concern is that if we skip such compiled method load events then >>>> profilers have no way >>>> to find out there many unneeded compilations that are thrown away >>>> without any real use. >>>> Also, it is not clear what happens with the subsequent compiled >>>> method unload events. >>>> Are they going to be skipped as well or they can appear and confuse >>>> profilers? >>>> >>>> >>>> Thanks, >>>> Serguei >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> I put more debugging in the bug to show this crash was from an >>>>>>> unloaded nmethod. >>>>>>> >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> I have some questions. >>>>>>>> >>>>>>>> Both the compiler method load and unload are posted as deferred >>>>>>>> events. 
>>>>>>>> Both events keep the nmethod alive until the ServiceThread >>>>>>>> processes the event. >>>>>>>> >>>>>>>> The implementation is: >>>>>>>> >>>>>>>> JvmtiDeferredEvent >>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) { >>>>>>>> ? . . . >>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>> ? // this deferred event. >>>>>>>> ? nmethodLocker::lock_nmethod(nm); >>>>>>>> ? return event; >>>>>>>> } >>>>>>>> >>>>>>>> JvmtiDeferredEvent >>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, >>>>>>>> jmethodID id, const void* code) { >>>>>>>> ? . . . >>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>> ? // this deferred event. This will keep the memory for the >>>>>>>> ? // generated code from being reused too early. We pass >>>>>>>> ? // zombie_ok == true here so that our nmethod that was just >>>>>>>> ? // made into a zombie can be locked. >>>>>>>> ? nmethodLocker::lock_nmethod(nm, true /* zombie_ok */); >>>>>>>> ? return event; >>>>>>>> } >>>>>>>> >>>>>>>> void JvmtiDeferredEvent::post() { >>>>>>>> assert(ServiceThread::is_service_thread(Thread::current()), >>>>>>>> ???????? "Service thread must post enqueued events"); >>>>>>>> ? switch(_type) { >>>>>>>> ??? case TYPE_COMPILED_METHOD_LOAD: { >>>>>>>> ????? nmethod* nm = _event_data.compiled_method_load; >>>>>>>> ????? JvmtiExport::post_compiled_method_load(nm); >>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>> ????? break; >>>>>>>> ??? } >>>>>>>> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >>>>>>>> ????? nmethod* nm = _event_data.compiled_method_unload.nm; >>>>>>>> ????? JvmtiExport::post_compiled_method_unload( >>>>>>>> _event_data.compiled_method_unload.method_id, >>>>>>>> _event_data.compiled_method_unload.code_begin); >>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>> ????? 
nmethodLocker::unlock_nmethod(nm);
>>>>>>>>       break;
>>>>>>>>     }
>>>>>>>>     . . .
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> Then I wonder how is it possible for the nmethod to be not
>>>>>>>> alive here?:
>>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>>>> . . .
>>>>>>>> 2173 // It's not safe to look at metadata for unloaded methods.
>>>>>>>> 2174 if (!nm->is_alive()) {
>>>>>>>> 2175 return;
>>>>>>>> 2176 }
>>>>>>>> At least, it looks like something else is broken.
>>>>>>>> Do I miss something important here?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>> Summary: Don't post information which uses metadata from
>>>>>>>>> unloaded nmethods
>>>>>>>>>
>>>>>>>>> Tested tier1-3 and 100 times with test that failed (reproduced
>>>>>>>>> failure without the fix).
>>>>>>>>>
>>>>>>>>> open webrev at
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen

From suenaga at oss.nttdata.com Fri Nov 22 07:42:46 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 22 Nov 2019 16:42:46 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
Message-ID: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>

Hi all,

I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
(See JBS for details)

  https://bugs.openjdk.java.net/browse/JDK-8234624

I think it is caused by DWARF. AMD64 needs DWARF for stack unwinding, but SA does not handle it.
So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo.

  http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75

Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was not found.
I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) was succeeded.
Can you share details for this test?

  mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161


Thanks,

Yasumasa

From david.holmes at oracle.com Fri Nov 22 08:08:55 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 22 Nov 2019 18:08:55 +1000
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>
Message-ID: <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>

On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
> Hi all,
>
> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
> (See JBS for details)
>
>   https://bugs.openjdk.java.net/browse/JDK-8234624
>
> I think it is caused by DWARF. AMD64 needs DWARF for stack unwinding,
> but SA does not handle it.
> So I created a patch. It works fine on my Fedora 31 x64 box, but it
> failed on submit repo.
>
>   http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75
>
> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was
> not found.
> I wonder why it failed, and why my serviceability/sa tests (with
> fastdebug build) was succeeded.
> Can you share details for this test?
> mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161

I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM.

David
 -----

----------System.out:(10/413)----------
Starting TestUniverse
Started LingeredApp with G1GC and pid 31111
Starting clhsdb against 31111
[2019-11-22T07:03:42.836056Z] Gathering output for process 31133
[2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133
[2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133
hsdb> Command not valid until attached to a VM
hsdb>
'Heap Parameters' missing from stdout/stderr

----------System.err:(53/3915)----------
Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ]
Attaching to process 31111, please wait...
Unable to connect to process ID 31111:

Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process)
sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process)
    at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413)
    at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306)
    at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141)
    at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180)
    at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61)
    at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40)
    at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270)
    at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406)
 stdout: [ Command not valid until attached to a VM
];
 stderr: [ Command not valid until attached to a VM
]
 exitValue = -1

 LingeredApp stdout: [];
 LingeredApp stderr: []
 LingeredApp exitValue = 0
java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr

From suenaga at oss.nttdata.com Fri Nov 22 08:55:38 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 22 Nov 2019 17:55:38 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>
Message-ID: <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>

Thanks David!

Hmm... my slowdebug build works fine on my laptop.
I will investigate more.


Thanks,

Yasumasa

From
serguei.spitsyn at oracle.com Fri Nov 22 10:10:14 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 22 Nov 2019 02:10:14 -0800
Subject: RFR(S): 8169467: GetLocalInstance returns JVMTI_ERROR_TYPE_MISMATCH (rather than JVMTI_ERROR_INVALID_SLOT) on static method
In-Reply-To:
References: <2df9eec5-b695-2f4e-8c25-65c5177f2208@oracle.com>
Message-ID: <29de14f9-d278-5920-5be0-ffbf1c74e621@oracle.com>

Thank you, Chris!
Serguei

On 11/20/19 18:22, Chris Plummer wrote:
> +1
>
> On 11/20/19 3:32 PM, Alex Menkov wrote:
>> Looks good.
>>
>> --alex
>>
>> On 11/19/2019 15:36, serguei.spitsyn at oracle.com wrote:
>>> Please, review a fix for:
>>>    https://bugs.openjdk.java.net/browse/JDK-8169467
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8169467-jvmti-local-instance.1/
>>>
>>>
>>> Summary:
>>>    The JVMTI GetLocalInstance function should return
>>> JVMTI_ERROR_INVALID_SLOT for static method frames.
>>>    Instead, it returns the JVMTI_ERROR_TYPE_MISMATCH error.
>>>    The fix adds necessary checks into the implementation.
>>>
>>> Testing in progress:
>>>    - Locally on Linux-x64 with: vmTestbase_nsk_jvmti,
>>> vmTestbase_nsk_jdi, jdk_jdi
>>>    - All mach5 hs-tier5
>>>
>>> Thanks,
>>> Serguei

From chris.plummer at oracle.com Fri Nov 22 16:52:31 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 22 Nov 2019 08:52:31 -0800
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com> <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>
Message-ID: <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>

Hi Yasumasa,

Start with the following code in HotSpotAgent.java:

        catch (NoSuchSymbolException e) {
            throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" +
            e.getSymbol() + "\" in remote process)");
        }

Fix it to include "e" as the cause of the DebuggerException. Then the exception backtrace that David included below will be a bit more useful.

thanks,

Chris

On 11/22/19 12:55 AM, Yasumasa Suenaga wrote:
> Thanks David!
>
> Hmm... my slowdebug build works fine on my laptop.
> I will investigate more.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/22 17:08, David Holmes wrote:
>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
>>> (See JBS for details)
>>>
>>>    https://bugs.openjdk.java.net/browse/JDK-8234624
>>>
>>> I think it is caused by DWARF. AMD64 needs DWARF for stack
>>> unwinding, but SA does not handle it.
>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it
>>> failed on submit repo.
>>>
>>>    http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75
>>>
>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes"
>>> was not found.
>>> I wonder why it failed, and why my serviceability/sa tests (with
>>> fastdebug build) was succeeded.
>>> Can you share details for this test?
>>> mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161
>>
>> I can't really shed any light on it, there were lots of failures -
>> see below for example. The issue is with the VM that was being
>> inspected but there's no output from that VM.
>>
>> David
>>  -----
>>
>> ----------System.out:(10/413)----------
>> Starting TestUniverse
>> Started LingeredApp with G1GC and pid 31111
>> Starting clhsdb against 31111
>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133
>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133
>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133
>> hsdb> Command not valid until attached to a VM
>> hsdb>
>> 'Heap Parameters' missing from stdout/stderr
>>
>> ----------System.err:(53/3915)----------
>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ]
>> Attaching to process 31111, please wait...
>> Unable to connect to process ID 31111:
>>
>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process)
>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270)
>>     at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406)
>>  stdout: [ Command not valid until attached to a VM
>> ];
>>  stderr: [ Command not valid until attached to a VM
>> ]
>>  exitValue = -1
>>
>>  LingeredApp stdout: [];
>>  LingeredApp stderr: []
>>  LingeredApp exitValue = 0
>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr

From daniel.daugherty at oracle.com Fri Nov 22 17:49:24 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 22 Nov 2019 12:49:24 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com> <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com> <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com> <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com> <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com> <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
Message-ID:

Hi Coleen,

Sorry for the delay in getting back to this re-review.

On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote:
>
> Please review a new version of this change that keeps the nmethod from
> being unloaded, after it is added to the deferred event queue:
>
> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html

src/hotspot/share/code/nmethod.cpp
    No comments.

src/hotspot/share/oops/instanceKlass.cpp
    No comments.

src/hotspot/share/prims/jvmtiExport.cpp
    No comments.

src/hotspot/share/prims/jvmtiImpl.cpp
    Nice solution with the new oops_do() and nmethods_do() functions!

    old L988: void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
    new L998: void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
        Not sure why this was changed.
        Update: Looks like Serguei raised the issue and Coleen has already
        resolved it.

src/hotspot/share/prims/jvmtiImpl.hpp
    old L494:     QueueNode(const JvmtiDeferredEvent& event)
    new L498:     QueueNode(JvmtiDeferredEvent& event)
        Why was this changed?
        Update: Not clear if this was covered by Coleen's reply to Serguei.

    old L497:     const JvmtiDeferredEvent& event() const { return _event; }
    new L501:     JvmtiDeferredEvent& event() { return _event; }
        Why was this changed?
        Update: Coleen's reply to Serguei explained this. Perhaps add:
                  // Not const because of oops_do() and nmethods_do().

    old L509:   static void enqueue(const JvmtiDeferredEvent& event) NOT_JVMTI_RETURN;
    new L513:   static void enqueue(JvmtiDeferredEvent event) NOT_JVMTI_RETURN;
        Why was this changed?
        Update: Looks like Serguei raised the issue and Coleen has already
        resolved it.

src/hotspot/share/runtime/mutexLocker.cpp
    This change is going to require some testing to make sure we don't
    have any new deadlock scenarios.

src/hotspot/share/runtime/serviceThread.cpp
    L50 - nit - why the extra blank line?

src/hotspot/share/runtime/serviceThread.hpp
    Thanks for cleaning up the static:
      ServiceThread::is_service_thread(Thread* thread)
    stuff. Having it be different than the other threads was
    a bit jarring.

src/hotspot/share/runtime/thread.hpp
    No comments.

Thumbs up. My only comments are nits so I don't need to see a new webrev if you decide to fix them.

Dan

>
> Ran the test that failed 100 times without failure, tier1 on Oracle
> supported platforms, and tier2-3 including jvmti and jdi tests locally.
>
> See bug for more details about the crash.
>
> https://bugs.openjdk.java.net/browse/JDK-8173361
>
> Thanks,
> Coleen
URL:
From serguei.spitsyn at oracle.com  Fri Nov 22 18:30:20 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 22 Nov 2019 10:30:20 -0800
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To:
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <65499c3c-de5e-6913-22b0-451730d3e7dd@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:
From coleen.phillimore at oracle.com  Fri Nov 22 19:15:50 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 22 Nov 2019 14:15:50 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To:
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
Message-ID: <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>

Dan, Thank you for reviewing this!

On 11/22/19 12:49 PM, Daniel D. Daugherty wrote:
> Hi Coleen,
>
> Sorry for the delay in getting back to this re-review.
>
>
> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote:
>>
>> Please review a new version of this change that keeps the nmethod
>> from being unloaded, after it is added to the deferred event queue:
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>
> src/hotspot/share/code/nmethod.cpp
>     No comments.
>
> src/hotspot/share/oops/instanceKlass.cpp
>     No comments.
>
> src/hotspot/share/prims/jvmtiExport.cpp
>     No comments.
>
> src/hotspot/share/prims/jvmtiImpl.cpp
>     Nice solution with the new oops_do() and nmethods_do() functions!

Erik's insistence!

>
>     old L988: void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
>     new L998: void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
>         Not sure why this was changed.
>
>         Update: Looks like Serguei raised the issue and Coleen has already
>         resolved it.

Yes.

>
> src/hotspot/share/prims/jvmtiImpl.hpp
>     old L494:     QueueNode(const JvmtiDeferredEvent& event)
>     new L498:     QueueNode(JvmtiDeferredEvent& event)
>         Why was this changed?
>
>         Update: Not clear if this was covered by Coleen's reply to Serguei.
>
>     old L497:     const JvmtiDeferredEvent& event() const { return _event; }
>     new L501:     JvmtiDeferredEvent& event() { return _event; }
>         Why was this changed?
>
>         Update: Coleen's reply to Serguei explained this. Perhaps add:
>                   // Not const because of oops_do() and nmethods_do().
>
>     old L509:   static void enqueue(const JvmtiDeferredEvent& event) NOT_JVMTI_RETURN;
>     new L513:   static void enqueue(JvmtiDeferredEvent event) NOT_JVMTI_RETURN;
>         Why was this changed?
>
>         Update: Looks like Serguei raised the issue and Coleen has already
>         resolved it.

Yes, I fixed these.

>
> src/hotspot/share/runtime/mutexLocker.cpp
>     This change is going to require some testing to make sure we don't
>     have any new deadlock scenarios.

Luckily, I've previously added an implicit NoSafepointVerifier to
locks that are _allow_vm_block = true, like this one.

+ def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.

which prevents one class of deadlock. If we take out another lock with
a higher rank, we'll get the ranking assert.

This lock prevents insertion into an array, and has few outside calls.
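
For illustration only, the ranking check mentioned above can be sketched
as follows. This is not HotSpot's Mutex code: RankedLock, held_ranks,
violates_rank_order and the rank values are hypothetical names for this
example, no real mutex is taken, and rejecting equal ranks is a
simplifying assumption. The idea is that each thread tracks the ranks it
holds, and taking a lock whose rank is not lower than the lowest one
held is reported instead of risking a deadlock.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical rank values for this sketch; lower ranks are taken last.
enum Rank { kLeaf = 1, kNonLeaf = 5 };

// Ranks of the locks the current thread holds, in acquisition order.
// Because of the check in acquire(), the last entry is always the lowest.
thread_local std::vector<int> held_ranks;

// Only the ordering bookkeeping is modelled; no real mutex is locked.
class RankedLock {
 public:
  explicit RankedLock(int rank) : rank_(rank) {}

  // Throws (rather than asserting) so the violation can be observed.
  void acquire() {
    if (!held_ranks.empty() && rank_ >= held_ranks.back()) {
      throw std::logic_error("lock rank order violated");
    }
    held_ranks.push_back(rank_);
  }

  // Locks are expected to be released in reverse acquisition order.
  void release() {
    assert(!held_ranks.empty() && held_ranks.back() == rank_);
    held_ranks.pop_back();
  }

 private:
  int rank_;
};

// Returns true if taking `second` while holding `first` trips the check.
bool violates_rank_order(RankedLock& first, RankedLock& second) {
  first.acquire();
  bool violated = false;
  try {
    second.acquire();
    second.release();
  } catch (const std::logic_error&) {
    violated = true;
  }
  first.release();
  return violated;
}
```

With a leaf-ranked lock held, taking a higher-ranked lock is flagged,
while the reverse order is accepted.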
I'm running tests in tier 1-6 but any code that travels through this
should get these assertion checks, rather than deadlocking.

>
> src/hotspot/share/runtime/serviceThread.cpp
>     L50 - nit - why the extra blank line?

To separate static data member definitions from functions. I removed it.

>
> src/hotspot/share/runtime/serviceThread.hpp
>     Thanks for cleaning up the static:
>
>       ServiceThread::is_service_thread(Thread* thread)
>
>     stuff. Having it be different than the other threads was
>     a bit jarring.
>
> src/hotspot/share/runtime/thread.hpp
>     No comments.
>
> Thumbs up. My only comments are nits so I don't need to see a
> new webrev if you decide to fix them.

So it turns out that in stress testing my fix for
https://bugs.openjdk.java.net/browse/JDK-8212160 (because I was in the
area and thought this was a duplicate of that bug; it is not), I found
that calling oops_do and nmethods_do on the ServiceThread needs to hold
the Service_lock, because other threads can be adding things to the
global queue while the sweeper thread is calling this in a handshake.

I am now retesting this change with the changes above, and with the
Service_lock. So far my stress tests for JDK-8212160 and the stress
test for this bug pass, but I'm going to run through all the tiers 1-6
over the weekend.

Please have a look at the changes in the meantime.

http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev

Thanks,
Coleen

>
> Dan
>
>>
>> Ran the test that failed 100 times without failure, tier1 on Oracle
>> supported platforms, and tier2-3 including jvmti and jdi tests locally.
>>
>> See bug for more details about the crash.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8173361
>>
>> Thanks,
>> Coleen
>>
>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Hi Serguei,
>>>
>>> Sorry for not sending an update. I talked to Erik and am working on
>>> a version that keeps the nmethod from being unloaded while it's in
>>> the deferred event queue, with a version that the GC people will
>>> like, and I like. I'm testing it out now.
>>>
>>> Thanks!
>>> Coleen
>>>
>>>
>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> Sorry for the latency, I had to investigate it a little bit.
>>>> I still have some doubt that your fix is the right thing to do.
>>>>
>>>>
>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Hi, I've been working on answers to these questions, so I'll
>>>>>>> start with this one.
>>>>>>>
>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed
>>>>>>> (made_zombie or memory released) by the sweeper, but the nmethod
>>>>>>> could be unloaded. Unloading the nmethod clears the Method*
>>>>>>> _method field.
>>>>>>
>>>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>>>
>>>>>>> The post_compiled_method_load event needs the _method field to
>>>>>>> look at things like inlining and ScopeDesc fields. If the
>>>>>>> nmethod is unloaded, some of the oops are dead. There are
>>>>>>> "holder" oops that correspond to the metadata in the nmethod.
>>>>>>> If these oops are dead, causing the nmethod to get unloaded,
>>>>>>> then the metadata may not be valid.
>>>>>>>
>>>>>>> So my change 02 looks for a NULL nmethod._method field to tell
>>>>>>> whether we can post information about the nmethod.
>>>>>>>
>>>>>>> There's code in nmethod.cpp like:
>>>>>>>
>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>>>   if (_jmethod_id == NULL) {
>>>>>>>     // Cache the jmethod_id since it can no longer be looked up once the
>>>>>>>     // method itself has been marked for unloading.
>>>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>>>   }
>>>>>>>   return _jmethod_id;
>>>>>>> }
>>>>>>>
>>>>>>> Which was added when post_method_load and unload were turned
>>>>>>> into deferred events.
>>>>>>
>>>>>> Could we cache the jmethodID in the
>>>>>> JvmtiDeferredEvent::compiled_method_load_event
>>>>>> similarly as we do in the
>>>>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>>>>> This would help to get rid of the dependency on the nmethod::_method.
>>>>>> Do we depend on any other nmethod fields?
>>>>>
>>>>> Yes, there are other nmethod metadata that we rely on to print
>>>>> inline information, and this function
>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses
>>>>> the ScopeDesc data in the nmethod.
>>>>
>>>> One possible approach is to prepare and cache all this information
>>>> in the nmethod::post_compiled_method_load_event() before the
>>>> JvmtiDeferredEvent::compiled_method_load_event() is called.
>>>> The event parameters are:
>>>>   typedef struct {
>>>>     const void* start_address;
>>>>     jlocation location;
>>>>   } jvmtiAddrLocationMap;
>>>>   CompiledMethodLoad(jvmtiEnv *jvmti_env,
>>>>     jmethodID method,
>>>>     jint code_size,
>>>>     const void* code_addr,
>>>>     jint map_length,
>>>>     const jvmtiAddrLocationMap* map,
>>>>     const void* compile_info)
>>>> Some of these addresses above might not be accessible when an event
>>>> is posted.
>>>> Not sure yet if it is okay.
>>>> The question is if this kind of refactoring is worthwhile and the
>>>> right thing to do.
>>>>
>>>>>
>>>>> We do cache the jmethodID but that's not good enough. See my last
>>>>> comment in the bug report. The jmethodID can point to an unloaded
>>>>> method.
>>>>
>>>> This looks like it is done a little bit late.
>>>> It'd be better to do it before the event is deferred (see above).
>>>>
>>>>> I tried a version of keeping the nmethod alive, but the GC folks
>>>>> will hate it. And it doesn't work and I hate it.
>>>>
>>>> From a serviceability point of view this is the best and most
>>>> consistent approach.
>>>> It seems to me it was initially designed this way.
>>>> The downside is it adds some extra complexity to the GC.
>>>>
>>>>> My version 01 is the best, with the caveat that maybe it should
>>>>> check for _method == NULL instead of nmethod->is_alive(). I have
>>>>> to talk to Erik to see if there's a race with concurrent class
>>>>> unloading.
>>>>>
>>>>> Any application that depends on a compiled method loading event on
>>>>> a class that could be unloaded is a buggy application.
>>>>> Applications should not rely on when the JIT compiler decides to
>>>>> compile a method! This happens to us for a stress test. Most
>>>>> applications will get most of their compiled method loading events
>>>>> as they normally do.
>>>>
>>>> It is not an application that relies on the compiled method loading
>>>> event.
>>>> It is about profiling tools to be able to get correct information
>>>> about what is going on with compilations.
>>>> My concern is that if we skip such compiled method load events then
>>>> profilers have no way
>>>> to find out that there are many unneeded compilations that are
>>>> thrown away without any real use.
>>>> Also, it is not clear what happens with the subsequent compiled
>>>> method unload events.
>>>> Are they going to be skipped as well or can they appear and confuse
>>>> profilers?
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>> I put more debugging in the bug to show this crash was from an
>>>>>>> unloaded nmethod.
>>>>>>>
>>>>>>> Coleen
>>>>>>>
>>>>>>>
>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Coleen,
>>>>>>>>
>>>>>>>> I have some questions.
>>>>>>>>
>>>>>>>> Both the compiled method load and unload are posted as deferred
>>>>>>>> events.
>>>>>>>> Both events keep the nmethod alive until the ServiceThread
>>>>>>>> processes the event.
>>>>>>>>
>>>>>>>> The implementation is:
>>>>>>>>
>>>>>>>> JvmtiDeferredEvent
>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>>>   . . .
>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>   // this deferred event.
>>>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>>>   return event;
>>>>>>>> }
>>>>>>>>
>>>>>>>> JvmtiDeferredEvent
>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm,
>>>>>>>> jmethodID id, const void* code) {
>>>>>>>>   . . .
>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>>>   // generated code from being reused too early. We pass
>>>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>>>   // made into a zombie can be locked.
>>>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>>>   return event;
>>>>>>>> }
>>>>>>>>
>>>>>>>> void JvmtiDeferredEvent::post() {
>>>>>>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>>>          "Service thread must post enqueued events");
>>>>>>>>   switch(_type) {
>>>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>       break;
>>>>>>>>     }
>>>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>       break;
>>>>>>>>     }
>>>>>>>>     . . .
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> Then I wonder how it is possible for the nmethod to be not
>>>>>>>> alive here:
>>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>>>>        . . .
>>>>>>>> 2173   // It's not safe to look at metadata for unloaded methods.
>>>>>>>> 2174   if (!nm->is_alive()) {
>>>>>>>> 2175     return;
>>>>>>>> 2176   }
>>>>>>>> At least, it looks like something else is broken.
>>>>>>>> Am I missing something important here?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>> Summary: Don't post information which uses metadata from
>>>>>>>>> unloaded nmethods
>>>>>>>>>
>>>>>>>>> Tested tier1-3 and 100 times with the test that failed
>>>>>>>>> (reproduced failure without the fix).
>>>>>>>>>
>>>>>>>>> open webrev at
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From coleen.phillimore at oracle.com  Fri Nov 22 19:42:46 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 22 Nov 2019 14:42:46 -0500
Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators
Message-ID: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>

Summary: call extension ClassUnload event as a deferred event from the
ServiceThread and remove unsafe arguments

I'm still waiting for the CSR request to get approved but this change
fixes the broken class unload events. It's been tested with the
existing test case, and hs-tier1 for all platforms and tier2-6 on
linux-x64-debug.
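
For illustration only, the "deferred event posted from the service
thread" pattern named in the summary can be sketched as below. This is
not the code in the webrev: ClassUnloadEvent, DeferredEventQueue,
enqueue and post_all are hypothetical names, and carrying only a copied
class name in the event is an assumption made for this example, not
something stated in this message.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical event record: it holds a copy of the class name rather
// than pointers into structures that may be gone when it is posted.
struct ClassUnloadEvent {
  std::string class_name;
};

class DeferredEventQueue {
 public:
  // May be called by any thread, for example one doing class unloading.
  void enqueue(const ClassUnloadEvent& event) {
    std::lock_guard<std::mutex> guard(lock_);
    pending_.push_back(event);
  }

  // Intended to be called only by the service thread. The pending events
  // are moved out under the lock and posted without holding it.
  std::vector<std::string> post_all() {
    std::deque<ClassUnloadEvent> batch;
    {
      std::lock_guard<std::mutex> guard(lock_);
      batch.swap(pending_);
    }
    std::vector<std::string> posted;
    for (const ClassUnloadEvent& event : batch) {
      posted.push_back(event.class_name);  // stand-in for the agent callback
    }
    return posted;
  }

 private:
  std::mutex lock_;
  std::deque<ClassUnloadEvent> pending_;
};
```

The thread that notices the unload only enqueues; the callback work
happens later on the one thread that drains the queue.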
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8173658

Thanks,
Coleen

From daniel.daugherty at oracle.com  Fri Nov 22 22:53:12 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 22 Nov 2019 17:53:12 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
Message-ID: <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>

> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev

src/hotspot/share/prims/jvmtiImpl.cpp
    No comments.

src/hotspot/share/prims/jvmtiImpl.hpp
    No comments.

src/hotspot/share/runtime/serviceThread.cpp
    No comments.

Thumbs up.

Dan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From coleen.phillimore at oracle.com Fri Nov 22 23:16:13 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 22 Nov 2019 18:16:13 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
 <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
Message-ID: <6160906e-1e47-6dd1-b55a-2b48e6097f68@oracle.com>

Thanks for the re-review, Dan!
Coleen

On 11/22/19 5:53 PM, Daniel D. Daugherty wrote:
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>
> src/hotspot/share/prims/jvmtiImpl.cpp
>     No comments.
>
> src/hotspot/share/prims/jvmtiImpl.hpp
>     No comments.
>
> src/hotspot/share/runtime/serviceThread.cpp
>     No comments.
>
> Thumbs up.
>
> Dan
>
>
> On 11/22/19 2:15 PM, coleen.phillimore at oracle.com wrote:
>>
>> Dan, Thank you for reviewing this!
>>
>> On 11/22/19 12:49 PM, Daniel D. Daugherty wrote:
>>> Hi Coleen,
>>>
>>> Sorry for the delay in getting back to this re-review.
>>>
>>>
>>> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Please review a new version of this change that keeps the nmethod
>>>> from being unloaded, after it is added to the deferred event queue:
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>>>
>>> src/hotspot/share/code/nmethod.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/oops/instanceKlass.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/prims/jvmtiExport.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>     Nice solution with the new oops_do() and nmethods_do() functions!
>> Erik's insistence!
>>>
>>>     old L988: void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
>>>     new L998: void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
>>>         Not sure why this was changed.
>>>
>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>         resolved it.
>>
>> Yes.
>>>
>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>     old L494:     QueueNode(const JvmtiDeferredEvent& event)
>>>     new L498:     QueueNode(JvmtiDeferredEvent& event)
>>>         Why was this changed?
>>>
>>>         Update: Not clear if this was covered by Coleen's reply to Serguei.
>>>
>>>     old L497:     const JvmtiDeferredEvent& event() const { return _event; }
>>>     new L501:     JvmtiDeferredEvent& event() { return _event; }
>>>         Why was this changed?
>>>
>>>         Update: Coleen's reply to Serguei explained this. Perhaps add:
>>>                   // Not const because of oops_do() and nmethods_do().
>>>
>>>     old L509:   static void enqueue(const JvmtiDeferredEvent& event) NOT_JVMTI_RETURN;
>>>     new L513:   static void enqueue(JvmtiDeferredEvent event) NOT_JVMTI_RETURN;
>>>         Why was this changed?
>>>
>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>         resolved it.
>>
>> Yes, I fixed these.
>>>
>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>     This change is going to require some testing to make sure we don't
>>>     have any new deadlock scenarios.
>>
>> Luckily, I've previously added an implicit NoSafepointVerifier to
>> locks that are _allow_vm_block = true, like this one.
>> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true,
>> _safepoint_check_never); // used for creating jmethodIDs.
>> which prevents one class of deadlock. If we take out another lock
>> with a higher rank, we'll get the ranking assert.
>>
>> This lock prevents insertion into an array, and has few outside calls.
>>
>> I'm running tests in tier 1-6 but any code that travels through this
>> should get these assertion checks, rather than deadlocking.
>>
>>>
>>> src/hotspot/share/runtime/serviceThread.cpp
>>>     L50 - nit - why the extra blank line?
>>
>> To separate static data member definitions from functions.  I removed it.
>>>
>>> src/hotspot/share/runtime/serviceThread.hpp
>>>     Thanks for cleaning up the static:
>>>
>>>       ServiceThread::is_service_thread(Thread* thread)
>>>
>>>     stuff. Having it be different than the other threads was
>>>     a bit jarring.
>>>
>>> src/hotspot/share/runtime/thread.hpp
>>>     No comments.
>>>
>>> Thumbs up. My only comments are nits so I don't need to see a
>>> new webrev if you decide to fix them.
>>
>> So it turns out that in stress testing my fix
>> for https://bugs.openjdk.java.net/browse/JDK-8212160
>>
>> Because I was in the area and thought this was a duplicate of that
>> bug (it is not).  I found that calling oops_do and nmethods_do the
>> ServiceThread needs to hold the Service_lock, because other threads
>> can be adding things to the global queue while the sweeper thread is
>> calling this in a handshake.
>>
>> I am now retesting this change with the changes above, and with the
>> Service_lock.  So far my stress tests for JDK-8212160 and the
>> stress test for this bug pass, but I'm going to run through all the
>> tiers 1-6 over the weekend.
>>
>> Please have a look at the changes in the meantime.
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev
>>
>> Thanks,
>> Coleen
>>>
>>> Dan
>>>
>>>>
>>>> Ran the test that failed 100 times without failure, tier1 on Oracle
>>>> supported platforms, and tier2-3 including jvmti and jdi tests locally.
>>>>
>>>> See bug for more details about the crash.
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8173361 >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Hi Serguei, >>>>> >>>>> Sorry for not sending an update.? I talked to Erik and am working >>>>> on a version that keeps the nmethod from being unloaded while it's >>>>> in the deferred event queue, with a version that the GC people >>>>> will like, and I like.? I'm testing it out now. >>>>> >>>>> Thanks! >>>>> Coleen >>>>> >>>>> >>>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Sorry for the latency, I had to investigate it a little bit. >>>>>> I still have some doubt your fix is right thing to do. >>>>>> >>>>>> >>>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> Hi, I've been working on answers to these questions, so I'll >>>>>>>>> start with this one. >>>>>>>>> >>>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed >>>>>>>>> (made_zombie or memory released) by the sweeper, but the >>>>>>>>> nmethod could be unloaded. Unloading the nmethod clears the >>>>>>>>> Method* _method field. >>>>>>>> >>>>>>>> Yes, I see it is done in the nmethod::make_unloaded(). >>>>>>>> >>>>>>>>> The post_compiled_method_load event needs the _method field to >>>>>>>>> look at things like inlining and ScopeDesc fields.?? If the >>>>>>>>> nmethod is unloaded, some of the oops are dead.? There are >>>>>>>>> "holder" oops that correspond to the metadata in the nmethod.? >>>>>>>>> If these oops are dead, causing the nmethod to get unloaded, >>>>>>>>> then the metadata may not be valid. >>>>>>>>> >>>>>>>>> So my change 02 looks for a NULL nmethod._method field to tell >>>>>>>>> whether we can post information about the nmethod. 
>>>>>>>>> >>>>>>>>> There's code in nmethod.cpp like: >>>>>>>>> >>>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() { >>>>>>>>> ? if (_jmethod_id == NULL) { >>>>>>>>> ??? // Cache the jmethod_id since it can no longer be looked >>>>>>>>> up once the >>>>>>>>> ??? // method itself has been marked for unloading. >>>>>>>>> ??? _jmethod_id = method()->jmethod_id(); >>>>>>>>> ? } >>>>>>>>> ? return _jmethod_id; >>>>>>>>> } >>>>>>>>> >>>>>>>>> Which was added when post_method_load and unload were turned >>>>>>>>> into deferred events. >>>>>>>> >>>>>>>> Could we cache the jmethodID in the >>>>>>>> JvmtiDeferredEvent::compiled_method_load_event >>>>>>>> similarly as we do in the >>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event? >>>>>>>> This would help to get rid of the dependency on the >>>>>>>> nmethod::_method. >>>>>>>> Do we depend on any other nmethod fields? >>>>>>> >>>>>>> Yes, there are other nmethod metadata that we rely on to print >>>>>>> inline information, and this function >>>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it >>>>>>> uses the ScopeDesc data in the nmethod. >>>>>> >>>>>> One possible approach is to prepare and cache all this information >>>>>> in the nmethod::post_compiled_method_load_event() before the >>>>>> JvmtiDeferredEvent::compiled_method_load_event() is called. >>>>>> The event parameters are: >>>>>> typedef struct { >>>>>> const void* start_address; >>>>>> jlocation location; >>>>>> } jvmtiAddrLocationMap; >>>>>> CompiledMethodLoad(jvmtiEnv *jvmti_env, >>>>>> jmethodID method, >>>>>> jint code_size, >>>>>> const void* code_addr, >>>>>> jint map_length, >>>>>> const jvmtiAddrLocationMap* map, >>>>>> const void* compile_info) >>>>>> Some of these addresses above could be not accessible when an >>>>>> event is posted. >>>>>> Not sure yet if it is Okay. >>>>>> The question is if this kind of refactoring is worth and right >>>>>> thing to do. 
>>>>>> >>>>>>> >>>>>>> We do cache the jmethodID but that's not good enough. See my >>>>>>> last comment in the bug report.? The jmethodID can point to an >>>>>>> unloaded method. >>>>>> >>>>>> This looks like it is done a little bit late. >>>>>> It'd better to do it before the event is deferred (see above). >>>>>> >>>>>>> I tried a version of keeping the nmethod alive, but the GC folks >>>>>>> will hate it.? And it doesn't work and I hate it. >>>>>> >>>>>> From serviceability point of view this is the best and most >>>>>> consistent approach. >>>>>> I seems to me, it was initially designed this way. >>>>>> The downside is it adds some extra complexity to the GC. >>>>>> >>>>>>> My version 01 is the best, with the caveat that maybe it should >>>>>>> check for _method == NULL instead of nmethod->is_alive().? I >>>>>>> have to talk to Erik to see if there's a race with concurrent >>>>>>> class unloading. >>>>>>> >>>>>>> Any application that depends on a compiled method loading event >>>>>>> on a class that could be unloaded is a buggy application.? >>>>>>> Applications should not rely on when the JIT compiler decides to >>>>>>> compile a method! This happens to us for a stress test.? Most >>>>>>> applications will get most of their compiled method loading >>>>>>> events as they normally do. >>>>>> >>>>>> It is not an application that relies on the compiled method >>>>>> loading event. >>>>>> It is about profiling tools to be able to get correct information >>>>>> about what is going on with compilations. >>>>>> My concern is that if we skip such compiled method load events >>>>>> then profilers have no way >>>>>> to find out there many unneeded compilations that are thrown away >>>>>> without any real use. >>>>>> Also, it is not clear what happens with the subsequent compiled >>>>>> method unload events. >>>>>> Are they going to be skipped as well or they can appear and >>>>>> confuse profilers? 
>>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>>> I put more debugging in the bug to show this crash was from an >>>>>>>>> unloaded nmethod. >>>>>>>>> >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>> >>>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> I have some questions. >>>>>>>>>> >>>>>>>>>> Both the compiler method load and unload are posted as >>>>>>>>>> deferred events. >>>>>>>>>> Both events keep the nmethod alive until the ServiceThread >>>>>>>>>> processes the event. >>>>>>>>>> >>>>>>>>>> The implementation is: >>>>>>>>>> >>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) { >>>>>>>>>> ? . . . >>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>> ? // this deferred event. >>>>>>>>>> ? nmethodLocker::lock_nmethod(nm); >>>>>>>>>> ? return event; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, >>>>>>>>>> jmethodID id, const void* code) { >>>>>>>>>> ? . . . >>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>> ? // this deferred event. This will keep the memory for the >>>>>>>>>> ? // generated code from being reused too early. We pass >>>>>>>>>> ? // zombie_ok == true here so that our nmethod that was just >>>>>>>>>> ? // made into a zombie can be locked. >>>>>>>>>> ? nmethodLocker::lock_nmethod(nm, true /* zombie_ok */); >>>>>>>>>> ? return event; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> void JvmtiDeferredEvent::post() { >>>>>>>>>> assert(ServiceThread::is_service_thread(Thread::current()), >>>>>>>>>> ???????? "Service thread must post enqueued events"); >>>>>>>>>> ? switch(_type) { >>>>>>>>>> ??? case TYPE_COMPILED_METHOD_LOAD: { >>>>>>>>>> ????? 
nmethod* nm = _event_data.compiled_method_load; >>>>>>>>>> JvmtiExport::post_compiled_method_load(nm); >>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>>>> ????? break; >>>>>>>>>> ??? } >>>>>>>>>> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >>>>>>>>>> ????? nmethod* nm = _event_data.compiled_method_unload.nm; >>>>>>>>>> ????? JvmtiExport::post_compiled_method_unload( >>>>>>>>>> _event_data.compiled_method_unload.method_id, >>>>>>>>>> _event_data.compiled_method_unload.code_begin); >>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>>>> ????? break; >>>>>>>>>> ??? } >>>>>>>>>> ??? . . . >>>>>>>>>> ? } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Then I wonder how is it possible for the nmethod to be not >>>>>>>>>> alive here?: >>>>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { >>>>>>>>>> . . . >>>>>>>>>> 2173 // It's not safe to look at metadata for unloaded methods. >>>>>>>>>> 2174 if (!nm->is_alive()) { >>>>>>>>>> 2175 return; >>>>>>>>>> 2176 } >>>>>>>>>> At least, it lokks like something else is broken. >>>>>>>>>> Do I miss something important here? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: Don't post information which uses metadata from >>>>>>>>>>> unloaded nmethods >>>>>>>>>>> >>>>>>>>>>> Tested tier1-3 and 100 times with test that failed >>>>>>>>>>> (reproduced failure without the fix). >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From serguei.spitsyn at oracle.com Fri Nov 22 23:34:07 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 22 Nov 2019 15:34:07 -0800
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
 <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
Message-ID: <568c2562-0a56-73ac-c0af-43339d701b19@oracle.com>

An HTML attachment was scrubbed...
URL:
From serguei.spitsyn at oracle.com Fri Nov 22 23:45:09 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 22 Nov 2019 15:45:09 -0800
Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators
In-Reply-To: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
References: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
Message-ID: <51328d80-ef58-fc86-f359-2c57edc17e32@oracle.com>

An HTML attachment was scrubbed...
URL:
From coleen.phillimore at oracle.com Sat Nov 23 01:35:18 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 22 Nov 2019 20:35:18 -0500
Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators
In-Reply-To: <51328d80-ef58-fc86-f359-2c57edc17e32@oracle.com>
References: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
 <51328d80-ef58-fc86-f359-2c57edc17e32@oracle.com>
Message-ID:

On 11/22/19 6:45 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> It looks good in general.
> Just one minor request:
>
> http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.frames.html
> http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>
> Could you, please, rename:
> TYPE_CLASS_UNLOADED => TYPE_CLASS_UNLOAD
>   and class_unloaded => class_unload
>
> to keep it consistent with TYPE_COMPILED_METHOD_UNLOAD and
> compiled_method_unload?
>
> Thank you for taking care of this!

Thanks for reviewing this.  I fixed them both.
Coleen
>
> Thanks,
> Serguei
>
>
> On 11/22/19 11:42, coleen.phillimore at oracle.com wrote:
>> Summary: call extension ClassUnload event as a deferred event from
>> the ServiceThread and remove unsafe arguments
>>
>> I'm still waiting for the CSR request to get approved but this change
>> fixes the broken class unload events.  It's been tested with the
>> existing test case, and hs-tier1 for all platforms and tier2-6 on
>> linux-x64-debug.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8173658
>>
>> Thanks,
>> Coleen
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From suenaga at oss.nttdata.com Sat Nov 23 01:39:59 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 23 Nov 2019 10:39:59 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>
 <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>
 <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>
 <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
Message-ID:

On 2019/11/23 1:52, Chris Plummer wrote:
> Hi Yasumasa,
>
> Start with the following code in HotSpotAgent.java:
>
>         catch (NoSuchSymbolException e) {
>             throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" +
>             e.getSymbol() + "\" in remote process)");
>         }
>
> Fix it to include "e" as the cause of the DebuggerException. Then the exception backtrace that David included below will be a bit more useful.

Thank you for the advice, Chris!
But I cannot access the Mach 5 results because I'm not an Oracle employee...

I'm not sure I can get the root cause from the email sent by the submit repo.


yasumasa


> thanks,
>
> Chris
>
>
> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote:
>> Thanks David!
>>
>> Hmm... my slowdebug build works fine on my laptop.
>> I will investigate more.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/22 17:08, David Holmes wrote:
>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
>>>> (See JBS for details)
>>>>
>>>>    https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>
>>>> I think it is caused by DWARF. AMD64 needs DWARF for stack unwinding, but SA does not handle it.
>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo.
>>>>
>>>>    http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75
>>>>
>>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was not found.
>>>> I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) was succeeded.
>>>> Can you share details for this test? mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161
>>>
>>> I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM.
>>> >>> David >>> ??----- >>> >>> ----------System.out:(10/413)---------- >>> Starting TestUniverse >>> Started LingeredApp with G1GC and pid 31111 >>> Starting clhsdb against 31111 >>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 >>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 >>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133 >>> hsdb> Command not valid until attached to a VM >>> hsdb> >>> 'Heap Parameters' missing from stdout/stderr >>> >>> ----------System.err:(53/3915)---------- >>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] >>> Attaching to process 31111, please wait... 
>>> Unable to connect to process ID 31111: >>> >>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in >>> remote process) >>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270) >>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406) >>> ??stdout: [ Command not valid until attached to a VM >>> ]; >>> ??stderr: [ Command not valid until attached to a VM >>> ] >>> ??exitValue = -1 >>> >>> ??LingeredApp stdout: []; >>> ??LingeredApp stderr: [] >>> ??LingeredApp exitValue = 0 >>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr > > From serguei.spitsyn at oracle.com Sat Nov 23 01:40:55 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 22 Nov 2019 17:40:55 -0800 Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators In-Reply-To: References: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com> <51328d80-ef58-fc86-f359-2c57edc17e32@oracle.com> Message-ID: An HTML attachment was scrubbed... 
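Chris's suggestion in the 8234624 thread above, to pass the caught exception "e" as the cause of the DebuggerException, could look roughly like the sketch below. This is only an illustration under stated assumptions: the two exception classes are simplified stand-ins written for this example, not the real sun.jvm.hotspot.debugger classes, and lookupSymbol() is a hypothetical helper that always fails so the catch block is exercised.

```java
public class CauseChainingSketch {
    // Simplified stand-in for the SA's NoSuchSymbolException (assumption, not the real class).
    static class NoSuchSymbolException extends RuntimeException {
        private final String symbol;

        NoSuchSymbolException(String symbol) {
            super("symbol not found: " + symbol);
            this.symbol = symbol;
        }

        String getSymbol() {
            return symbol;
        }
    }

    // Simplified stand-in for the SA's DebuggerException (assumption, not the real class).
    static class DebuggerException extends RuntimeException {
        DebuggerException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical lookup that always fails, to exercise the catch block below.
    static void lookupSymbol(String name) {
        throw new NoSuchSymbolException(name);
    }

    static void setupVM() {
        try {
            lookupSymbol("gHotSpotVMTypes");
        } catch (NoSuchSymbolException e) {
            // Passing "e" as the cause keeps the original stack trace
            // attached to the exception that is rethrown.
            throw new DebuggerException(
                "Doesn't appear to be a HotSpot VM (could not find symbol \""
                + e.getSymbol() + "\" in remote process)", e);
        }
    }

    public static void main(String[] args) {
        try {
            setupVM();
        } catch (DebuggerException ex) {
            // The printed trace now includes a "Caused by:" section.
            ex.printStackTrace();
        }
    }
}
```

With the cause attached, a backtrace like the one David posted would also show where the symbol lookup itself failed, which is the extra information Chris was asking for.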
URL:
From suenaga at oss.nttdata.com Sat Nov 23 04:24:26 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 23 Nov 2019 13:24:26 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>
 <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>
 <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>
 <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
Message-ID:

David, Chris,

Can you share the result of this test?

mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325

It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java .


Thanks,

Yasumasa


On 2019/11/23 10:39, Yasumasa Suenaga wrote:
> On 2019/11/23 1:52, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> Start with the following code in HotSpotAgent.java:
>>
>>         catch (NoSuchSymbolException e) {
>>             throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" +
>>             e.getSymbol() + "\" in remote process)");
>>         }
>>
>> Fix it to include "e" as the cause of the DebuggerException. Then the exception backtrace that David included below will be a bit more useful.
>
> Thank you for the advise, Chris!
> But I cannot access Mach 5 result because I'm not an Oracle employee...
>
> I'm not sure I can get root cause from the email from submit repo.
>
>
> yasumasa
>
>
>> thanks,
>>
>> Chris
>>
>>
>> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote:
>>> Thanks David!
>>>
>>> Hmm... my slowdebug build works fine on my laptop.
>>> I will investigate more.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/11/22 17:08, David Holmes wrote:
>>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
>>>>> (See JBS for details)
>>>>>
>>>>>    https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>
>>>>> I think it is caused by DWARF.
AMD64 needs DWARF for stack unwinding, but SA does not handle it. >>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo. >>>>> >>>>> ?? http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75 >>>>> >>>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was not found. >>>>> I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) was succeeded. >>>>> Can you share details for this test? mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161 >>>> >>>> I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM. >>>> >>>> David >>>> ??----- >>>> >>>> ----------System.out:(10/413)---------- >>>> Starting TestUniverse >>>> Started LingeredApp with G1GC and pid 31111 >>>> Starting clhsdb against 31111 >>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 >>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 >>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133 >>>> hsdb> Command not valid until attached to a VM >>>> hsdb> >>>> 'Heap Parameters' missing from stdout/stderr >>>> >>>> ----------System.err:(53/3915)---------- >>>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' 
'/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] >>>> Attaching to process 31111, please wait... >>>> Unable to connect to process ID 31111: >>>> >>>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in >>>> remote process) >>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270) >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406) >>>> ??stdout: [ Command not valid until attached to a VM >>>> ]; >>>> ??stderr: [ Command not valid until attached to a VM >>>> ] >>>> ??exitValue = -1 >>>> >>>> ??LingeredApp stdout: []; >>>> ??LingeredApp stderr: [] >>>> ??LingeredApp exitValue = 0 
>>>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr
>>
>>

From volker.simonis at gmail.com Sat Nov 23 08:14:36 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Sat, 23 Nov 2019 09:14:36 +0100
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com>
 <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com>
 <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com>
 <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
Message-ID:

Just a wild guess, but maybe after your changes the debug symbols are
required and can no longer be found? This could be because Mach5 builds
with different settings than you do. See
https://hg.openjdk.java.net/jdk-updates/jdk9u/raw-file/tip/common/doc/building.html#native-debug-symbols
for the different variants. You could try to reproduce the failure locally
by building with --with-native-debug-symbols set to none, external, or zipped.

Yasumasa Suenaga wrote on Sat, 23 Nov 2019, 05:24:

> David, Chris,
>
> Can you share the result of this test?
>
> mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325
>
> It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java .
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/23 10:39, Yasumasa Suenaga wrote:
> > On 2019/11/23 1:52, Chris Plummer wrote:
> >> Hi Yasumasa,
> >>
> >> Start with the following code in HotSpotAgent.java:
> >>
> >> catch (NoSuchSymbolException e) {
> >> throw new DebuggerException("Doesn't appear to be a
> HotSpot VM (could not find symbol \"" +
> >> e.getSymbol() + "\" in remote process)");
> >> }
> >>
> >> Fix it to include "e" as the cause of the DebuggerException. Then the
> exception backtrace that David included below will be a bit more useful.
> >
> > Thank you for the advise, Chris!
> > But I cannot access Mach 5 result because I'm not an Oracle employee...
> >
> > I'm not sure I can get root cause from the email from submit repo.
> > > > > > yasumasa > > > > > >> thanks, > >> > >> Chris > >> > >> > >> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote: > >>> Thanks David! > >>> > >>> Hmm... my slowdebug build works fine on my laptop. > >>> I will investigate more. > >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>> On 2019/11/22 17:08, David Holmes wrote: > >>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote: > >>>>> Hi all, > >>>>> > >>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I > couldn't. > >>>>> (See JBS for details) > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8234624 > >>>>> > >>>>> I think it is caused by DWARF. AMD64 needs DWARF for stack > unwinding, but SA does not handle it. > >>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it > failed on submit repo. > >>>>> > >>>>> http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75 > >>>>> > >>>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" > was not found. > >>>>> I wonder why it failed, and why my serviceability/sa tests (with > fastdebug build) was succeeded. > >>>>> Can you share details for this test? > mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161 > >>>> > >>>> I can't really shed any light on it, there were lots of failures - > see below for example. The issue is with the VM that was being inspected > but there's no output from that VM. 
> >>>> > >>>> David > >>>> ----- > >>>> > >>>> ----------System.out:(10/413)---------- > >>>> Starting TestUniverse > >>>> Started LingeredApp with G1GC and pid 31111 > >>>> Starting clhsdb against 31111 > >>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 > >>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 > >>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for > process 31133 > >>>> hsdb> Command not valid until attached to a VM > >>>> hsdb> > >>>> 'Heap Parameters' missing from stdout/stderr > >>>> > >>>> ----------System.err:(53/3915)---------- > >>>> Command line: > ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' > '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' > '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' > 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] > >>>> Attaching to process 31111, please wait... 
> >>>> Unable to connect to process ID 31111: > >>>> > >>>> Doesn't appear to be a HotSpot VM (could not find symbol > "gHotSpotVMTypes" in > >>>> remote process) > >>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a > HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180) > >>>> at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61) > >>>> at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270) > >>>> at > jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406) > >>>> stdout: [ Command not valid until attached to a VM > >>>> ]; > >>>> stderr: [ Command not valid until attached to a VM > >>>> ] > >>>> exitValue = -1 > >>>> > >>>> LingeredApp stdout: []; > >>>> LingeredApp stderr: [] > >>>> LingeredApp exitValue = 0 > >>>> java.lang.RuntimeException: 'Heap Parameters' missing from > stdout/stderr > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From suenaga at oss.nttdata.com Sat Nov 23 08:59:35 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 23 Nov 2019 17:59:35 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com> <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com> <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
Message-ID: <288b6292-f785-0cee-18b5-65f64ac24e87@oss.nttdata.com>

Hi Volker,

I pushed a new patch to the submit repo [1] which includes fallback code for the case where .eh_frame does not exist.
So your guess might be correct.

However, according to "2.7 Stack Unwind Algorithm" in the System V Application Binary Interface AMD64 Architecture Processor Supplement [2], we need to use DWARF in .eh_frame or .debug_frame for stack unwinding.
So I think the .eh_frame section should be included in ELF binaries for AMD64.
In fact, libjvm.so in the JDKs from jdk.java.net includes it.

The patch [1] passed the serviceability/sa tests except TestJhsdbJstackMixed.java and ClhsdbPstack.java.
They seem to have timed out. They might need more time to complete with my patch.
If we can extend the timeout value, I want to try it.

I want to know which configure options are passed in Mach5 on the submit repo, and whether all binaries in the submit repo have an .eh_frame section.
(I know it would be configured to slowdebug, but that's all...)

Of course, I also want to know the stdout of `jhsdb jstack` on the failing tests :)

Thanks,

Yasumasa

[1] https://hg.openjdk.java.net/jdk/submit/rev/c3334c661fdf
[2] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

On 2019/11/23 17:14, Volker Simonis wrote:
> Just a wild guess, but maybe after your changes, the debug symbols are required and not found any more? This could be caused by the fact that Mach5 builds with different settings compared to you?
See https://hg.openjdk.java.net/jdk-updates/jdk9u/raw-file/tip/common/doc/building.html#native-debug-symbols for the different variants. > > You could try to see if you can reproduce the failure locally by building --with-native-debug-symbols=none, external,zipped > > Yasumasa Suenaga > schrieb am Sa., 23. Nov. 2019, 05:24: > > David, Chris, > > Can you share the result of this test? > > ? ?mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325 > > It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java . > > > Thanks, > > Yasumasa > > > On 2019/11/23 10:39, Yasumasa Suenaga wrote: > > On 2019/11/23 1:52, Chris Plummer wrote: > >> Hi Yasumasa, > >> > >> Start with the following code in HotSpotAgent.java: > >> > >> ???????? catch (NoSuchSymbolException e) { > >> ???????????? throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" + > >> ???????????? e.getSymbol() + "\" in remote process)"); > >> ???????? } > >> > >> Fix it to include "e" as the cause of the DebuggerException. Then the exception backtrace that David included below will be a bit more useful. > > > > Thank you for the advise, Chris! > > But I cannot access Mach 5 result because I'm not an Oracle employee... > > > > I'm not sure I can get root cause from the email from submit repo. > > > > > > yasumasa > > > > > >> thanks, > >> > >> Chris > >> > >> > >> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote: > >>> Thanks David! > >>> > >>> Hmm... my slowdebug build works fine on my laptop. > >>> I will investigate more. > >>> > >>> > >>> Thanks, > >>> > >>> Yasumasa > >>> > >>> > >>> On 2019/11/22 17:08, David Holmes wrote: > >>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote: > >>>>> Hi all, > >>>>> > >>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't. > >>>>> (See JBS for details) > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8234624 > >>>>> > >>>>> I think it is caused by DWARF. 
AMD64 needs DWARF for stack unwinding, but SA does not handle it. > >>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo. > >>>>> > >>>>> http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75 > >>>>> > >>>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was not found. > >>>>> I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) was succeeded. > >>>>> Can you share details for this test? mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161 > >>>> > >>>> I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM. > >>>> > >>>> David > >>>> ??----- > >>>> > >>>> ----------System.out:(10/413)---------- > >>>> Starting TestUniverse > >>>> Started LingeredApp with G1GC and pid 31111 > >>>> Starting clhsdb against 31111 > >>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 > >>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 > >>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133 > >>>> hsdb> Command not valid until attached to a VM > >>>> hsdb> > >>>> 'Heap Parameters' missing from stdout/stderr > >>>> > >>>> ----------System.err:(53/3915)---------- > >>>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' 
'/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] > >>>> Attaching to process 31111, please wait... > >>>> Unable to connect to process ID 31111: > >>>> > >>>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in > >>>> remote process) > >>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270) > >>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406) > >>>> ??stdout: [ Command not valid until attached to a VM > >>>> ]; > >>>> ??stderr: [ Command not valid until attached to a VM > >>>> ] > >>>> ??exitValue = -1 > >>>> > >>>> ??LingeredApp stdout: []; > >>>> ??LingeredApp 
stderr: []
> >>>>   LingeredApp exitValue = 0
> >>>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr
> >>
> >>
>

From suenaga at oss.nttdata.com Sun Nov 24 02:41:12 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 24 Nov 2019 11:41:12 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <288b6292-f785-0cee-18b5-65f64ac24e87@oss.nttdata.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com> <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com> <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com> <288b6292-f785-0cee-18b5-65f64ac24e87@oss.nttdata.com>
Message-ID: <1ffa31ff-7046-b160-0046-e391eec147b6@oss.nttdata.com>

>> You could try to see if you can reproduce the failure locally by building --with-native-debug-symbols=none, external,zipped

I tested TestJhsdbJstackMixed and ClhsdbPstack on my laptop with a slowdebug VM configured with --with-native-debug-symbols=none, and they passed.
(Fedora 31 x64 on Hyper-V, Core i7-8665U x 4vcpu)

I am waiting for someone to share the result.

BTW, I tried to extend the timeout value [1], but the tests still failed with a timeout.
I'm still looking for the cause of this failure.

Yasumasa

[1] http://hg.openjdk.java.net/jdk/submit/rev/308e214cc03a

On 2019/11/23 17:59, Yasumasa Suenaga wrote:
> Hi Volker,
>
> I pushed new patch to submit repo [1] which includes fallback code if .eh_frame does not exist.
> So your guess might correct.
>
> However, according to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
> Architecture Processor Supplement [2], we need to use DWARF in .eh_frame or .debug_frame for
> stack unwinding.
> So I think .eh_frame section should be included in ELF binaries for AMD64.
> In fact, libjvm.so in JDKs from jdk.java.net includes it.
>
>
> The patch [1] passed serviceability/sa tests except TestJhsdbJstackMixed.java and ClhsdbPstack.java .
> They seem to be timeout.
They might need more time to complete with my patch. > If we can extend timeout value, I want to try it. > > > I want to know which configure options are passed in Mach5 on submit repo, and all binaries in > submit repo have .eh_frame section. > (I know it would be configured to slowdebug, but that's all...) > > Of course, I also want to know stdout of `jhsdb jstack` on the failure tests :) > > > Thanks, > > Yasumasa > > > [1] https://hg.openjdk.java.net/jdk/submit/rev/c3334c661fdf > [2] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf > > > On 2019/11/23 17:14, Volker Simonis wrote: >> Just a wild guess, but maybe after your changes, the debug symbols are required and not found any more? This could be caused by the fact that Mach5 builds with different settings compared to you? See https://hg.openjdk.java.net/jdk-updates/jdk9u/raw-file/tip/common/doc/building.html#native-debug-symbols for the different variants. >> >> You could try to see if you can reproduce the failure locally by building --with-native-debug-symbols=none, external,zipped >> >> Yasumasa Suenaga > schrieb am Sa., 23. Nov. 2019, 05:24: >> >> ??? David, Chris, >> >> ??? Can you share the result of this test? >> >> ???? ? ?mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325 >> >> ??? It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java . >> >> >> ??? Thanks, >> >> ??? Yasumasa >> >> >> ??? On 2019/11/23 10:39, Yasumasa Suenaga wrote: >> ???? > On 2019/11/23 1:52, Chris Plummer wrote: >> ???? >> Hi Yasumasa, >> ???? >> >> ???? >> Start with the following code in HotSpotAgent.java: >> ???? >> >> ???? >> ???????? catch (NoSuchSymbolException e) { >> ???? >> ???????????? throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" + >> ???? >> ???????????? e.getSymbol() + "\" in remote process)"); >> ???? >> ???????? } >> ???? >> >> ???? >> Fix it to include "e" as the cause of the DebuggerException. 
Then the exception backtrace that David included below will be a bit more useful. >> ???? > >> ???? > Thank you for the advise, Chris! >> ???? > But I cannot access Mach 5 result because I'm not an Oracle employee... >> ???? > >> ???? > I'm not sure I can get root cause from the email from submit repo. >> ???? > >> ???? > >> ???? > yasumasa >> ???? > >> ???? > >> ???? >> thanks, >> ???? >> >> ???? >> Chris >> ???? >> >> ???? >> >> ???? >> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote: >> ???? >>> Thanks David! >> ???? >>> >> ???? >>> Hmm... my slowdebug build works fine on my laptop. >> ???? >>> I will investigate more. >> ???? >>> >> ???? >>> >> ???? >>> Thanks, >> ???? >>> >> ???? >>> Yasumasa >> ???? >>> >> ???? >>> >> ???? >>> On 2019/11/22 17:08, David Holmes wrote: >> ???? >>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote: >> ???? >>>>> Hi all, >> ???? >>>>> >> ???? >>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't. >> ???? >>>>> (See JBS for details) >> ???? >>>>> >> ???? >>>>> https://bugs.openjdk.java.net/browse/JDK-8234624 >> ???? >>>>> >> ???? >>>>> I think it is caused by DWARF. AMD64 needs DWARF for stack unwinding, but SA does not handle it. >> ???? >>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo. >> ???? >>>>> >> ???? >>>>> http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75 >> ???? >>>>> >> ???? >>>>> Failed test was linux-x64-debug, and it is due to "gHotSpotVMTypes" was not found. >> ???? >>>>> I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) was succeeded. >> ???? >>>>> Can you share details for this test? mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161 >> ???? >>>> >> ???? >>>> I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM. >> ???? >>>> >> ???? >>>> David >> ???? >>>> ??----- >> ???? >>>> >> ???? 
>>>> ----------System.out:(10/413)---------- >> ???? >>>> Starting TestUniverse >> ???? >>>> Started LingeredApp with G1GC and pid 31111 >> ???? >>>> Starting clhsdb against 31111 >> ???? >>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 >> ???? >>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 >> ???? >>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133 >> ???? >>>> hsdb> Command not valid until attached to a VM >> ???? >>>> hsdb> >> ???? >>>> 'Heap Parameters' missing from stdout/stderr >> ???? >>>> >> ???? >>>> ----------System.err:(53/3915)---------- >> ???? >>>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] >> ???? >>>> Attaching to process 31111, please wait... >> ???? >>>> Unable to connect to process ID 31111: >> ???? >>>> >> ???? >>>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in >> ???? >>>> remote process) >> ???? >>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) >> ???? 
>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270)
>>      >>>>      at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406)
>>      >>>>   stdout: [ Command not valid until attached to a VM
>>      >>>> ];
>>      >>>>   stderr: [ Command not valid until attached to a VM
>>      >>>> ]
>>      >>>>   exitValue = -1
>>      >>>>
>>      >>>>   LingeredApp stdout: [];
>>      >>>>   LingeredApp stderr: []
>>      >>>>   LingeredApp exitValue = 0
>>      >>>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr
>>      >>
>>      >>
>>

From david.holmes at oracle.com Sun Nov 24 23:12:09 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 25 Nov 2019 09:12:09 +1000
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com> <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com> <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com>
Message-ID: <739d788f-ea18-be7b-23d6-63ef6bb064ba@oracle.com>

On 23/11/2019 2:24 pm, Yasumasa Suenaga wrote:
> David, Chris,
>
> Can you share the result of this test?
>
>   mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325
>
> It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java .

I don't know what to make of the result. For TestJhsdbJstackMixed it timed out.
There is a stack dump in the log which for the most part is quite normal e.g.

----------------- 8449 -----------------
"NoFramePointerJNIFib" #13 prio=5 tid=0x00007f7224674000 nid=0x2101 runnable [0x00007f71f54d9000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_in_native
0x00007f722d2d26c0    fib + 0x40
----------------- 8438 -----------------
"Common-Cleaner" #12 daemon prio=8 tid=0x00007f722461a800 nid=0x20f6 in Object.wait() [0x00007f71f5af8000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
   JavaThread state: _thread_blocked
0x00007f722ce4cde2    __pthread_cond_timedwait + 0x132
0x00007f722c030fa4    ObjectSynchronizer::wait(Handle, long, Thread*) + 0x84
0x00007f722b9097fd    JVM_MonitorWait + 0x11d
0x00007f720c927dbe    method entry point (kind = native)
0x00007f720c91f0b3    * java.lang.Object.wait(long) bci:0 (Interpreted frame)
0x00007f720c91ee00    * java.lang.ref.ReferenceQueue.remove(long) bci:59 line:155 (Interpreted frame)
0x00007f720c91f0f8    * jdk.internal.ref.CleanerImpl.run() bci:65 line:148 (Interpreted frame)
0x00007f720c91f0b3    * java.lang.Thread.run() bci:11 line:832 (Interpreted frame)
0x00007f720c9159ca
0x00007f722b7af34c    JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac
0x00007f722b7ac3ce    JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33e
0x00007f722b7ac5ea    JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*) + 0xca
0x00007f722b905f07    thread_entry(JavaThread*, Thread*) + 0x127
0x00007f722c09dde6    JavaThread::thread_main_inner() + 0x226
0x00007f722c0a34c6    Thread::call_run() + 0xf6
0x00007f722bdd57be    thread_native_entry(Thread*) + 0x10e
0x00007f722ce48ea5    start_thread + 0xc5

but after normal thread output we get to

----------------- 8406 -----------------
0x00007f722ce4a017    pthread_join + 0xa7
0x00007f722d474050    ????????
0x00007f722d474050    ????????
< repeats unknown number of times due to output log overflow>
0x00007f722d474050    ????????
0x00007f722d474050    ????????

DEBUG: [0x00007f722d2d26c0    fib + 0x40]
DEBUG: Test triggered interesting condition.
DEBUG: Test PASSED.

---

The ClhsdbPstack.java generated a core dump but no hs_err file. It also had the ????? stack dump

0x00007f882d400050    ????????
0x00007f882d400050    ????????
0x00007f882d400050    ????????
0x00007f882d400050    ????????
];
 stderr: []
 exitValue = 134

 LingeredApp stdout: [];
 LingeredApp stderr: []
 LingeredApp exitValue = 0
java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]

David
-----

>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/23 10:39, Yasumasa Suenaga wrote:
>> On 2019/11/23 1:52, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> Start with the following code in HotSpotAgent.java:
>>>
>>>          catch (NoSuchSymbolException e) {
>>>              throw new DebuggerException("Doesn't appear to be a
>>> HotSpot VM (could not find symbol \"" +
>>>              e.getSymbol() + "\" in remote process)");
>>>          }
>>>
>>> Fix it to include "e" as the cause of the DebuggerException. Then the
>>> exception backtrace that David included below will be a bit more useful.
>>
>> Thank you for the advise, Chris!
>> But I cannot access Mach 5 result because I'm not an Oracle employee...
>>
>> I'm not sure I can get root cause from the email from submit repo.
>>
>>
>> yasumasa
>>
>>
>>> thanks,
>>>
>>> Chris
>>>
>>>
>>> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote:
>>>> Thanks David!
>>>>
>>>> Hmm... my slowdebug build works fine on my laptop.
>>>> I will investigate more.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/11/22 17:08, David Holmes wrote:
>>>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I
>>>>>> couldn't.
>>>>>> (See JBS for details)
>>>>>>
>>>>>>
https://bugs.openjdk.java.net/browse/JDK-8234624 >>>>>> >>>>>> I think it is caused by DWARF. AMD64 needs DWARF for stack >>>>>> unwinding, but SA does not handle it. >>>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but >>>>>> it failed on submit repo. >>>>>> >>>>>> ?? http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75 >>>>>> >>>>>> Failed test was linux-x64-debug, and it is due to >>>>>> "gHotSpotVMTypes" was not found. >>>>>> I wonder why it failed, and why my serviceability/sa tests (with >>>>>> fastdebug build) was succeeded. >>>>>> Can you share details for this test? >>>>>> mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161 >>>>> >>>>> I can't really shed any light on it, there were lots of failures - >>>>> see below for example. The issue is with the VM that was being >>>>> inspected but there's no output from that VM. >>>>> >>>>> David >>>>> ??----- >>>>> >>>>> ----------System.out:(10/413)---------- >>>>> Starting TestUniverse >>>>> Started LingeredApp with G1GC and pid 31111 >>>>> Starting clhsdb against 31111 >>>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133 >>>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133 >>>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for >>>>> process 31133 >>>>> hsdb> Command not valid until attached to a VM >>>>> hsdb> >>>>> 'Heap Parameters' missing from stdout/stderr >>>>> >>>>> ----------System.err:(53/3915)---------- >>>>> Command line: >>>>> ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' >>>>> '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' >>>>> 
'/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' >>>>> 'jdk.test.lib.apps.LingeredApp' >>>>> '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ] >>>>> Attaching to process 31111, please wait... >>>>> Unable to connect to process ID 31111: >>>>> >>>>> Doesn't appear to be a HotSpot VM (could not find symbol >>>>> "gHotSpotVMTypes" in >>>>> remote process) >>>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a >>>>> HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process) >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413) >>>>> >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306) >>>>> >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141) >>>>> >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180) >>>>> >>>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61) >>>>> ?????at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40) >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270) >>>>> >>>>> ?????at >>>>> jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406) >>>>> ??stdout: [ Command not valid until attached to a VM >>>>> ]; >>>>> ??stderr: [ Command not valid until attached to a VM >>>>> ] >>>>> 
exitValue = -1
>>>>>
>>>>>   LingeredApp stdout: [];
>>>>>   LingeredApp stderr: []
>>>>>   LingeredApp exitValue = 0
>>>>> java.lang.RuntimeException: 'Heap Parameters' missing from
>>>>> stdout/stderr
>>>
>>>

From suenaga at oss.nttdata.com Mon Nov 25 04:54:22 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 25 Nov 2019 13:54:22 +0900
Subject: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <739d788f-ea18-be7b-23d6-63ef6bb064ba@oracle.com>
References: <119bdc9d-f764-c25f-3930-df03983f115f@oss.nttdata.com> <02f62c3a-c22b-1cf9-2f9b-c9ea997d4d55@oracle.com> <0e68d6ac-7a70-152b-867f-a70f672d95ce@oss.nttdata.com> <1d87e3d1-2ccb-852b-41de-f2d8116f46f9@oracle.com> <739d788f-ea18-be7b-23d6-63ef6bb064ba@oracle.com>
Message-ID:

Thanks David!

I tweaked my patch, and it passed all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191125-0350-6967731).
I will send a review request.

Yasumasa

On 2019/11/25 8:12, David Holmes wrote:
> On 23/11/2019 2:24 pm, Yasumasa Suenaga wrote:
>> David, Chris,
>>
>> Can you share the result of this test?
>>
>>    mach5-one-ysuenaga-JDK-8234624-1-20191123-0234-6938325
>>
>> It failed on TestJhsdbJstackMixed.java and ClhsdbPstack.java .
>
> I don't know what to make of the result. For TestJhsdbJstackMixed it timed out. There is a stack dump in the log which for the most part is quite normal e.g.
>
> ----------------- 8449 -----------------
> "NoFramePointerJNIFib" #13 prio=5 tid=0x00007f7224674000 nid=0x2101 runnable [0x00007f71f54d9000]
>    java.lang.Thread.State: RUNNABLE
>    JavaThread state: _thread_in_native
> 0x00007f722d2d26c0    fib + 0x40
> ----------------- 8438 -----------------
> "Common-Cleaner" #12 daemon prio=8 tid=0x00007f722461a800 nid=0x20f6 in Object.wait() [0x00007f71f5af8000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>    JavaThread state: _thread_blocked
> 0x00007f722ce4cde2    __pthread_cond_timedwait + 0x132
> 0x00007f722c030fa4
ObjectSynchronizer::wait(Handle, long, Thread*) + 0x84 > 0x00007f722b9097fd??? JVM_MonitorWait + 0x11d > 0x00007f720c927dbe??? method entry point (kind = native) > 0x00007f720c91f0b3??? * java.lang.Object.wait(long) bci:0 (Interpreted frame) > 0x00007f720c91ee00??? * java.lang.ref.ReferenceQueue.remove(long) bci:59 line:155 (Interpreted frame) > 0x00007f720c91f0f8??? * jdk.internal.ref.CleanerImpl.run() bci:65 line:148 (Interpreted frame) > 0x00007f720c91f0b3??? * java.lang.Thread.run() bci:11 line:832 (Interpreted frame) > 0x00007f720c9159ca??? > 0x00007f722b7af34c??? JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 0x6ac > 0x00007f722b7ac3ce??? JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*) + 0x33e > 0x00007f722b7ac5ea??? JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*) + 0xca > 0x00007f722b905f07??? thread_entry(JavaThread*, Thread*) + 0x127 > 0x00007f722c09dde6??? JavaThread::thread_main_inner() + 0x226 > 0x00007f722c0a34c6??? Thread::call_run() + 0xf6 > 0x00007f722bdd57be??? thread_native_entry(Thread*) + 0x10e > 0x00007f722ce48ea5??? start_thread + 0xc5 > > but after normal thread output we get to > > ----------------- 8406 ----------------- > 0x00007f722ce4a017??? pthread_join + 0xa7 > 0x00007f722d474050??????? ???????? > 0x00007f722d474050??????? ???????? > < repeats unknown number of times due to output log overflow> > 0x00007f722d474050??????? ???????? > 0x00007f722d474050??????? ???????? > > DEBUG: [0x00007f722d2d26c0??? fib + 0x40] > DEBUG: Test triggered interesting condition. > DEBUG: Test PASSED. > > --- > > The ClhsdbPstack.java generated a core dump but no hs_err file. It also had the ????? stack dump > > 0x00007f882d400050??????? ???????? > 0x00007f882d400050??????? ???????? > 0x00007f882d400050??????? ???????? > 0x00007f882d400050??????? ???????? 
> ];
>  stderr: []
>  exitValue = 134
>
>  LingeredApp stdout: [];
>  LingeredApp stderr: []
>  LingeredApp exitValue = 0
> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]
>
> David
> -----
>
>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/23 10:39, Yasumasa Suenaga wrote:
>>> On 2019/11/23 1:52, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Start with the following code in HotSpotAgent.java:
>>>>
>>>>          catch (NoSuchSymbolException e) {
>>>>              throw new DebuggerException("Doesn't appear to be a HotSpot VM (could not find symbol \"" +
>>>>              e.getSymbol() + "\" in remote process)");
>>>>          }
>>>>
>>>> Fix it to include "e" as the cause of the DebuggerException. Then the exception backtrace that David included below will be a bit more useful.
>>>
>>> Thank you for the advice, Chris!
>>> But I cannot access Mach 5 result because I'm not an Oracle employee...
>>>
>>> I'm not sure I can get root cause from the email from submit repo.
>>>
>>>
>>> yasumasa
>>>
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 11/22/19 12:55 AM, Yasumasa Suenaga wrote:
>>>>> Thanks David!
>>>>>
>>>>> Hmm... my slowdebug build works fine on my laptop.
>>>>> I will investigate more.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/11/22 17:08, David Holmes wrote:
>>>>>> On 22/11/2019 5:42 pm, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I tried to get mixed stack via `jhsdb jstack --mixed`, but I couldn't.
>>>>>>> (See JBS for details)
>>>>>>>
>>>>>>>    https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>
>>>>>>> I think it is caused by DWARF. AMD64 needs DWARF for stack unwinding, but SA does not handle it.
>>>>>>> So I created a patch. It works fine on my Fedora 31 x64 box, but it failed on submit repo.
>>>>>>>
>>>>>>>    http://hg.openjdk.java.net/jdk/submit/rev/f97745e0af75
>>>>>>>
>>>>>>> The failed test was linux-x64-debug; it failed because "gHotSpotVMTypes" was not found.
>>>>>>> I wonder why it failed, and why my serviceability/sa tests (with fastdebug build) succeeded.
>>>>>>> Can you share details for this test? mach5-one-ysuenaga-JDK-8234624-20191122-0630-6909161
>>>>>>
>>>>>> I can't really shed any light on it, there were lots of failures - see below for example. The issue is with the VM that was being inspected but there's no output from that VM.
>>>>>>
>>>>>> David
>>>>>>   -----
>>>>>>
>>>>>> ----------System.out:(10/413)----------
>>>>>> Starting TestUniverse
>>>>>> Started LingeredApp with G1GC and pid 31111
>>>>>> Starting clhsdb against 31111
>>>>>> [2019-11-22T07:03:42.836056Z] Gathering output for process 31133
>>>>>> [2019-11-22T07:03:44.395452Z] Waiting for completion for process 31133
>>>>>> [2019-11-22T07:03:44.395815Z] Waiting for completion finished for process 31133
>>>>>> hsdb> Command not valid until attached to a VM
>>>>>> hsdb>
>>>>>> 'Heap Parameters' missing from stdout/stderr
>>>>>>
>>>>>> ----------System.err:(53/3915)----------
>>>>>> Command line: ['/opt/mach5/mesos/work_dir/jib-master/install/2019-11-22-0629473.suenaga.source/linux-x64-debug.jdk/jdk-14/fastdebug/bin/java' '-XX:+UnlockExperimentalVMOptions' '-XX:+UseG1GC' '-cp' '/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/serviceability/sa/TestUniverse.d:/opt/mach5/mesos/work_dir/slaves/6e54f4af-e606-43b0-80ce-0a482a5988b6-S156/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/0454c404-a309-4896-bf31-90b9636056fa/runs/eed41e19-a725-491b-9ddd-c380024cedc9/testoutput/test-support/jtreg_open_test_hotspot_jtreg_tier1_serviceability/classes/2/test/lib' 'jdk.test.lib.apps.LingeredApp' '918bf6a8-d3df-4fd1-bdca-13fc399c67f3.lck' ]
>>>>>> Attaching to process 31111, please wait...
>>>>>> Unable to connect to process ID 31111:
>>>>>>
>>>>>> Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in
>>>>>> remote process)
>>>>>> sun.jvm.hotspot.debugger.DebuggerException: Doesn't appear to be a HotSpot VM (could not find symbol "gHotSpotVMTypes" in remote process)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.setupVM(HotSpotAgent.java:413)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:306)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:141)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.attachDebugger(CLHSDB.java:180)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:61)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:40)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:270)
>>>>>>      at jdk.hotspot.agent/sun.jvm.hotspot.SALauncher.main(SALauncher.java:406)
>>>>>>   stdout: [ Command not valid until attached to a VM
>>>>>> ];
>>>>>>   stderr: [ Command not valid until attached to a VM
>>>>>> ]
>>>>>>   exitValue = -1
>>>>>>
>>>>>>   LingeredApp stdout: [];
>>>>>>   LingeredApp
stderr: []
>>>>>>   LingeredApp exitValue = 0
>>>>>> java.lang.RuntimeException: 'Heap Parameters' missing from stdout/stderr
>>>>
>>>>

From suenaga at oss.nttdata.com Mon Nov 25 05:08:41 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 25 Nov 2019 14:08:41 +0900
Subject: RFR: 8234624: jstack mixed mode should refer DWARF
Message-ID:

Hi all,

Please review this change:

  JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
  webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/

According to section 2.7, "Stack Unwind Algorithm", in the System V Application Binary Interface AMD64 Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame for stack unwinding.
As JDK-8022183 notes, omit-frame-pointer has been enabled by default since GCC 4.6, so system libraries (e.g. libc) might be compiled with this feature.

However, `jhsdb jstack --mixed` does not use DWARF; it relies on the base pointer register (RBP), so some stack frames might be missing.
I guess JDK-8219201 is caused by the same issue.


Thanks,

Yasumasa


[1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

From david.holmes at oracle.com Mon Nov 25 05:52:53 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 25 Nov 2019 15:52:53 +1000
Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators
In-Reply-To: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
References: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
Message-ID:

Hi Coleen,

On 23/11/2019 5:42 am, coleen.phillimore at oracle.com wrote:
> Summary: call extension ClassUnload event as a deferred event from the
> ServiceThread and remove unsafe arguments

Looks good.

Minor nit:

src/hotspot/share/prims/jvmtiExport.cpp

assert(Thread::current()->is_Java_thread(), "must be called from ServiceThread");

we have an explicit check for is_service_thread available :)

Thanks,
David
-----

> I'm still waiting for the CSR request to get approved but this change
> fixes the broken class unload events. It's been tested with the
> existing test case, and hs-tier1 for all platforms and tier2-6 on
> linux-x64-debug.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8173658
>
> Thanks,
> Coleen

From serguei.spitsyn at oracle.com Mon Nov 25 09:38:21 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 25 Nov 2019 01:38:21 -0800
Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out
Message-ID:

An HTML attachment was scrubbed...
URL:

From coleen.phillimore at oracle.com Mon Nov 25 13:47:48 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 25 Nov 2019 08:47:48 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <568c2562-0a56-73ac-c0af-43339d701b19@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
 <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
 <568c2562-0a56-73ac-c0af-43339d701b19@oracle.com>
Message-ID: <200cb839-9019-58f1-17e5-7a0426a6035b@oracle.com>

Thanks for the code review, Serguei!
Coleen

On 11/22/19 6:34 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> +1
>
> Thanks,
> Serguei
>
>
> On 11/22/19 14:53, Daniel D.
Daugherty wrote: >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev >> >> src/hotspot/share/prims/jvmtiImpl.cpp >> ??? No comments. >> >> src/hotspot/share/prims/jvmtiImpl.hpp >> ??? No comments. >> >> src/hotspot/share/runtime/serviceThread.cpp >> ??? No comments. >> >> Thumbs up. >> >> Dan >> >> >> On 11/22/19 2:15 PM, coleen.phillimore at oracle.com wrote: >>> >>> Dan, Thank you for reviewing this! >>> >>> On 11/22/19 12:49 PM, Daniel D. Daugherty wrote: >>>> Hi Coleen, >>>> >>>> Sorry for the delay in getting back to this re-review. >>>> >>>> >>>> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Please review a new version of this change that keeps the nmethod >>>>> from being unloaded, after it is added to the deferred event queue: >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html >>>> >>>> src/hotspot/share/code/nmethod.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/share/oops/instanceKlass.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/share/prims/jvmtiExport.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>> ??? Nice solution with the new oops_do() and nmethods_do() functions! >>> Erik's insistance! >>>> >>>> ??? old L988: void JvmtiDeferredEventQueue::enqueue(const >>>> JvmtiDeferredEvent& event) { >>>> ??? new L998: void >>>> JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) { >>>> ??????? Not sure why this was changed. >>>> >>>> ??????? Update: Looks like Serguei raised the issue and Coleen has >>>> already >>>> ??????? resolved it. >>> >>> Yes. >>>> >>>> src/hotspot/share/prims/jvmtiImpl.hpp >>>> ??? old L494: ??? QueueNode(const JvmtiDeferredEvent& event) >>>> ??? new L498: ??? QueueNode(JvmtiDeferredEvent& event) >>>> ??????? Why was this changed? >>>> >>>> ??????? Update: Not clear if this was covered by Coleen's reply to >>>> Serguei. >>>> >>>> ??? old L497: ??? const JvmtiDeferredEvent& event() const { return >>>> _event; } >>>> ??? 
new L501: ??? JvmtiDeferredEvent& event() { return _event; } >>>> ??????? Why was this changed? >>>> >>>> ??????? Update: Coleen's reply to Serguei explained this. Perhaps add: >>>> ????????????????? // Not const because of oops_do() and nmethods_do(). >>>> >>>> ??? old L509: ? static void enqueue(const JvmtiDeferredEvent& >>>> event) NOT_JVMTI_RETURN; >>>> ??? new L513: ? static void enqueue(JvmtiDeferredEvent event) >>>> NOT_JVMTI_RETURN; >>>> ??????? Why was this changed? >>>> >>>> ??????? Update: Looks like Serguei raised the issue and Coleen has >>>> already >>>> ??????? resolved it. >>> >>> Yes, I fixed these. >>>> >>>> src/hotspot/share/runtime/mutexLocker.cpp >>>> ??? This change is going to require some testing to make sure we don't >>>> ??? have any new deadlock scenarios. >>> >>> Luckily, I've previously added an implicit NoSafepointVerifier to >>> locks that are _allow_vm_block = true, like this one. >>> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true, >>> _safepoint_check_never); // used for creating jmethodIDs. >>> which prevents one class of deadlock. If we take out another lock >>> with a higher rank, we'll get the ranking assert. >>> >>> This lock prevents insertion into an array, and has little outside >>> calls. >>> >>> I'm running tests in tier 1-6 but any code that travels through this >>> should get these assertion checks, rather than deadlocking. >>> >>>> >>>> src/hotspot/share/runtime/serviceThread.cpp >>>> ??? L50 - nit - why the extra blank line? >>> >>> To separate static data member definitions from functions.? I >>> removed it. >>>> >>>> src/hotspot/share/runtime/serviceThread.hpp >>>> ??? Thanks for cleaning up the static: >>>> >>>> ????? ServiceThread::is_service_thread(Thread* thread) >>>> >>>> ??? stuff. Having it be different than the other threads was >>>> ??? a bit jarring. >>>> >>>> src/hotspot/share/runtime/thread.hpp >>>> ??? No comments. >>>> >>>> Thumbs up. 
My only comments are nits so I don't need to see a >>>> new webrev if you decide to fix them. >>> >>> So it turns out that in stress testing my fix >>> forhttps://bugs.openjdk.java.net/browse/JDK-8212160 >>> >>> Because I was in the area and thought this was a duplicate of that >>> bug (it is not).?? I found that calling oops_do and nmethods_do the >>> ServiceThread? needs to hold the Service_lock, because other threads >>> can be adding things to the global queue while the sweeper thread is >>> calling this in a handshake. >>> >>> I am now retesting this change with the changes above, and with the >>> Service_lock.?? So far my stress tests for JDK-81212160 and the >>> stress test for this bug pass, but I'm going to run through all the >>> tiers 1-6 over the weekend. >>> >>> Please have a look at the changes in the meantime. >>> >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev >>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev >>> >>> Thanks, >>> Coleen >>>> >>>> Dan >>>> >>>>> >>>>> Ran the test that failed 100 times without failure, tier1 on >>>>> Oracle supported platforms, and tier2-3 including jvmti and jdi >>>>> tests locally. >>>>> >>>>> See bug for more details about the crash. >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Hi Serguei, >>>>>> >>>>>> Sorry for not sending an update.? I talked to Erik and am working >>>>>> on a version that keeps the nmethod from being unloaded while >>>>>> it's in the deferred event queue, with a version that the GC >>>>>> people will like, and I like.? I'm testing it out now. >>>>>> >>>>>> Thanks! >>>>>> Coleen >>>>>> >>>>>> >>>>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> Sorry for the latency, I had to investigate it a little bit. >>>>>>> I still have some doubt your fix is right thing to do. 
>>>>>>> >>>>>>> >>>>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> Hi, I've been working on answers to these questions, so I'll >>>>>>>>>> start with this one. >>>>>>>>>> >>>>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed >>>>>>>>>> (made_zombie or memory released) by the sweeper, but the >>>>>>>>>> nmethod could be unloaded. Unloading the nmethod clears the >>>>>>>>>> Method* _method field. >>>>>>>>> >>>>>>>>> Yes, I see it is done in the nmethod::make_unloaded(). >>>>>>>>> >>>>>>>>>> The post_compiled_method_load event needs the _method field >>>>>>>>>> to look at things like inlining and ScopeDesc fields.?? If >>>>>>>>>> the nmethod is unloaded, some of the oops are dead.? There >>>>>>>>>> are "holder" oops that correspond to the metadata in the >>>>>>>>>> nmethod.? If these oops are dead, causing the nmethod to get >>>>>>>>>> unloaded, then the metadata may not be valid. >>>>>>>>>> >>>>>>>>>> So my change 02 looks for a NULL nmethod._method field to >>>>>>>>>> tell whether we can post information about the nmethod. >>>>>>>>>> >>>>>>>>>> There's code in nmethod.cpp like: >>>>>>>>>> >>>>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() { >>>>>>>>>> ? if (_jmethod_id == NULL) { >>>>>>>>>> ??? // Cache the jmethod_id since it can no longer be looked >>>>>>>>>> up once the >>>>>>>>>> ??? // method itself has been marked for unloading. >>>>>>>>>> ??? _jmethod_id = method()->jmethod_id(); >>>>>>>>>> ? } >>>>>>>>>> ? return _jmethod_id; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Which was added when post_method_load and unload were turned >>>>>>>>>> into deferred events. 
>>>>>>>>> >>>>>>>>> Could we cache the jmethodID in the >>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event >>>>>>>>> similarly as we do in the >>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event? >>>>>>>>> This would help to get rid of the dependency on the >>>>>>>>> nmethod::_method. >>>>>>>>> Do we depend on any other nmethod fields? >>>>>>>> >>>>>>>> Yes, there are other nmethod metadata that we rely on to print >>>>>>>> inline information, and this function >>>>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it >>>>>>>> uses the ScopeDesc data in the nmethod. >>>>>>> >>>>>>> One possible approach is to prepare and cache all this information >>>>>>> in the nmethod::post_compiled_method_load_event() before the >>>>>>> JvmtiDeferredEvent::compiled_method_load_event() is called. >>>>>>> The event parameters are: >>>>>>> typedef struct { >>>>>>> const void* start_address; >>>>>>> jlocation location; >>>>>>> } jvmtiAddrLocationMap; >>>>>>> CompiledMethodLoad(jvmtiEnv *jvmti_env, >>>>>>> jmethodID method, >>>>>>> jint code_size, >>>>>>> const void* code_addr, >>>>>>> jint map_length, >>>>>>> const jvmtiAddrLocationMap* map, >>>>>>> const void* compile_info) >>>>>>> Some of these addresses above could be not accessible when an >>>>>>> event is posted. >>>>>>> Not sure yet if it is Okay. >>>>>>> The question is if this kind of refactoring is worth and right >>>>>>> thing to do. >>>>>>> >>>>>>>> >>>>>>>> We do cache the jmethodID but that's not good enough.? See my >>>>>>>> last comment in the bug report.? The jmethodID can point to an >>>>>>>> unloaded method. >>>>>>> >>>>>>> This looks like it is done a little bit late. >>>>>>> It'd better to do it before the event is deferred (see above). >>>>>>> >>>>>>>> I tried a version of keeping the nmethod alive, but the GC >>>>>>>> folks will hate it.? And it doesn't work and I hate it. >>>>>>> >>>>>>> From serviceability point of view this is the best and most >>>>>>> consistent approach. 
>>>>>>> I seems to me, it was initially designed this way. >>>>>>> The downside is it adds some extra complexity to the GC. >>>>>>> >>>>>>>> My version 01 is the best, with the caveat that maybe it should >>>>>>>> check for _method == NULL instead of nmethod->is_alive().? I >>>>>>>> have to talk to Erik to see if there's a race with concurrent >>>>>>>> class unloading. >>>>>>>> >>>>>>>> Any application that depends on a compiled method loading event >>>>>>>> on a class that could be unloaded is a buggy application.? >>>>>>>> Applications should not rely on when the JIT compiler decides >>>>>>>> to compile a method! This happens to us for a stress test.? >>>>>>>> Most applications will get most of their compiled method >>>>>>>> loading events as they normally do. >>>>>>> >>>>>>> It is not an application that relies on the compiled method >>>>>>> loading event. >>>>>>> It is about profiling tools to be able to get correct >>>>>>> information about what is going on with compilations. >>>>>>> My concern is that if we skip such compiled method load events >>>>>>> then profilers have no way >>>>>>> to find out there many unneeded compilations that are thrown >>>>>>> away without any real use. >>>>>>> Also, it is not clear what happens with the subsequent compiled >>>>>>> method unload events. >>>>>>> Are they going to be skipped as well or they can appear and >>>>>>> confuse profilers? >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>>> I put more debugging in the bug to show this crash was from >>>>>>>>>> an unloaded nmethod. >>>>>>>>>> >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> I have some questions. >>>>>>>>>>> >>>>>>>>>>> Both the compiler method load and unload are posted as >>>>>>>>>>> deferred events. 
>>>>>>>>>>> Both events keep the nmethod alive until the ServiceThread >>>>>>>>>>> processes the event. >>>>>>>>>>> >>>>>>>>>>> The implementation is: >>>>>>>>>>> >>>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) { >>>>>>>>>>> ? . . . >>>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>>> ? // this deferred event. >>>>>>>>>>> ? nmethodLocker::lock_nmethod(nm); >>>>>>>>>>> ? return event; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* >>>>>>>>>>> nm, jmethodID id, const void* code) { >>>>>>>>>>> ? . . . >>>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>>> ? // this deferred event. This will keep the memory for the >>>>>>>>>>> ? // generated code from being reused too early. We pass >>>>>>>>>>> ? // zombie_ok == true here so that our nmethod that was just >>>>>>>>>>> ? // made into a zombie can be locked. >>>>>>>>>>> ? nmethodLocker::lock_nmethod(nm, true /* zombie_ok */); >>>>>>>>>>> ? return event; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> void JvmtiDeferredEvent::post() { >>>>>>>>>>> assert(ServiceThread::is_service_thread(Thread::current()), >>>>>>>>>>> ???????? "Service thread must post enqueued events"); >>>>>>>>>>> ? switch(_type) { >>>>>>>>>>> ??? case TYPE_COMPILED_METHOD_LOAD: { >>>>>>>>>>> ????? nmethod* nm = _event_data.compiled_method_load; >>>>>>>>>>> JvmtiExport::post_compiled_method_load(nm); >>>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>>>>> ????? break; >>>>>>>>>>> ??? } >>>>>>>>>>> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >>>>>>>>>>> ????? 
nmethod* nm = _event_data.compiled_method_unload.nm; >>>>>>>>>>> JvmtiExport::post_compiled_method_unload( >>>>>>>>>>> _event_data.compiled_method_unload.method_id, >>>>>>>>>>> _event_data.compiled_method_unload.code_begin); >>>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>>>>> ????? break; >>>>>>>>>>> ??? } >>>>>>>>>>> ??? . . . >>>>>>>>>>> ? } >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Then I wonder how is it possible for the nmethod to be not >>>>>>>>>>> alive here?: >>>>>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { >>>>>>>>>>> . . . >>>>>>>>>>> 2173 // It's not safe to look at metadata for unloaded methods. >>>>>>>>>>> 2174 if (!nm->is_alive()) { >>>>>>>>>>> 2175 return; >>>>>>>>>>> 2176 } >>>>>>>>>>> At least, it lokks like something else is broken. >>>>>>>>>>> Do I miss something important here? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Don't post information which uses metadata from >>>>>>>>>>>> unloaded nmethods >>>>>>>>>>>> >>>>>>>>>>>> Tested tier1-3 and 100 times with test that failed >>>>>>>>>>>> (reproduced failure without the fix). >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From coleen.phillimore at oracle.com Mon Nov 25 13:52:36 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 25 Nov 2019 08:52:36 -0500
Subject: RFR (S) 8173658: JvmtiExport::post_class_unload() is broken for non-JavaThread initiators
In-Reply-To:
References: <8fe01d64-813f-195f-a11b-6137c0625f62@oracle.com>
Message-ID: <58811209-8eda-ab6f-19b4-08a98e01409e@oracle.com>

On 11/25/19 12:52 AM, David Holmes wrote:
> Hi Coleen,
>
> On 23/11/2019 5:42 am, coleen.phillimore at oracle.com wrote:
>> Summary: call extension ClassUnload event as a deferred event from
>> the ServiceThread and remove unsafe arguments
>
> Looks good.
>
> Minor nit:
>
> src/hotspot/share/prims/jvmtiExport.cpp
>
> assert(Thread::current()->is_Java_thread(), "must be called from
> ServiceThread");
>
> we have an explicit check for is_service_thread available :)

Yes, thanks now I do. Thanks for reviewing and your comments in the CSR.
Coleen

>
> Thanks,
> David
> -----
>
>> I'm still waiting for the CSR request to get approved but this change
>> fixes the broken class unload events. It's been tested with the
>> existing test case, and hs-tier1 for all platforms and tier2-6 on
>> linux-x64-debug.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8173658.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8173658
>>
>> Thanks,
>> Coleen

From erik.osterlund at oracle.com Mon Nov 25 14:37:45 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 25 Nov 2019 15:37:45 +0100
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load
In-Reply-To: <200cb839-9019-58f1-17e5-7a0426a6035b@oracle.com>
References: <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
 <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
 <568c2562-0a56-73ac-c0af-43339d701b19@oracle.com>
 <200cb839-9019-58f1-17e5-7a0426a6035b@oracle.com>
Message-ID:

Hi Coleen,

Still good BTW!

Thanks,
/Erik

On 2019-11-25 14:47, coleen.phillimore at oracle.com wrote:
> Thanks for the code review, Serguei!
> Coleen
>
> On 11/22/19 6:34 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> +1
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/22/19 14:53, Daniel D. Daugherty wrote:
>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>>>
>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>     No comments.
>>>
>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>     No comments.
>>>
>>> src/hotspot/share/runtime/serviceThread.cpp
>>>     No comments.
>>>
>>> Thumbs up.
>>>
>>> Dan
>>>
>>>
>>> On 11/22/19 2:15 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Dan, Thank you for reviewing this!
>>>>
>>>> On 11/22/19 12:49 PM, Daniel D. Daugherty wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Sorry for the delay in getting back to this re-review.
>>>>> >>>>> >>>>> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Please review a new version of this change that keeps the nmethod >>>>>> from being unloaded, after it is added to the deferred event queue: >>>>>> >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html >>>>> >>>>> src/hotspot/share/code/nmethod.cpp >>>>> ??? No comments. >>>>> >>>>> src/hotspot/share/oops/instanceKlass.cpp >>>>> ??? No comments. >>>>> >>>>> src/hotspot/share/prims/jvmtiExport.cpp >>>>> ??? No comments. >>>>> >>>>> src/hotspot/share/prims/jvmtiImpl.cpp >>>>> ??? Nice solution with the new oops_do() and nmethods_do() functions! >>>> Erik's insistance! >>>>> >>>>> ??? old L988: void JvmtiDeferredEventQueue::enqueue(const >>>>> JvmtiDeferredEvent& event) { >>>>> ??? new L998: void >>>>> JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) { >>>>> ??????? Not sure why this was changed. >>>>> >>>>> ??????? Update: Looks like Serguei raised the issue and Coleen has >>>>> already >>>>> ??????? resolved it. >>>> >>>> Yes. >>>>> >>>>> src/hotspot/share/prims/jvmtiImpl.hpp >>>>> ??? old L494: ??? QueueNode(const JvmtiDeferredEvent& event) >>>>> ??? new L498: ??? QueueNode(JvmtiDeferredEvent& event) >>>>> ??????? Why was this changed? >>>>> >>>>> ??????? Update: Not clear if this was covered by Coleen's reply to >>>>> Serguei. >>>>> >>>>> ??? old L497: ??? const JvmtiDeferredEvent& event() const { return >>>>> _event; } >>>>> ??? new L501: ??? JvmtiDeferredEvent& event() { return _event; } >>>>> ??????? Why was this changed? >>>>> >>>>> ??????? Update: Coleen's reply to Serguei explained this. Perhaps add: >>>>> ????????????????? // Not const because of oops_do() and nmethods_do(). >>>>> >>>>> ??? old L509: ? static void enqueue(const JvmtiDeferredEvent& >>>>> event) NOT_JVMTI_RETURN; >>>>> ??? new L513: ? static void enqueue(JvmtiDeferredEvent event) >>>>> NOT_JVMTI_RETURN; >>>>> ??????? Why was this changed? >>>>> >>>>> ??????? 
Update: Looks like Serguei raised the issue and Coleen has >>>>> already >>>>> ??????? resolved it. >>>> >>>> Yes, I fixed these. >>>>> >>>>> src/hotspot/share/runtime/mutexLocker.cpp >>>>> ??? This change is going to require some testing to make sure we don't >>>>> ??? have any new deadlock scenarios. >>>> >>>> Luckily, I've previously added an implicit NoSafepointVerifier to >>>> locks that are _allow_vm_block = true, like this one. >>>> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true, >>>> _safepoint_check_never); // used for creating jmethodIDs. >>>> which prevents one class of deadlock. If we take out another lock >>>> with a higher rank, we'll get the ranking assert. >>>> >>>> This lock prevents insertion into an array, and has little outside >>>> calls. >>>> >>>> I'm running tests in tier 1-6 but any code that travels through >>>> this should get these assertion checks, rather than deadlocking. >>>> >>>>> >>>>> src/hotspot/share/runtime/serviceThread.cpp >>>>> ??? L50 - nit - why the extra blank line? >>>> >>>> To separate static data member definitions from functions. I >>>> removed it. >>>>> >>>>> src/hotspot/share/runtime/serviceThread.hpp >>>>> ??? Thanks for cleaning up the static: >>>>> >>>>> ????? ServiceThread::is_service_thread(Thread* thread) >>>>> >>>>> ??? stuff. Having it be different than the other threads was >>>>> ??? a bit jarring. >>>>> >>>>> src/hotspot/share/runtime/thread.hpp >>>>> ??? No comments. >>>>> >>>>> Thumbs up. My only comments are nits so I don't need to see a >>>>> new webrev if you decide to fix them. >>>> >>>> So it turns out that in stress testing my fix >>>> forhttps://bugs.openjdk.java.net/browse/JDK-8212160 >>>> >>>> Because I was in the area and thought this was a duplicate of that >>>> bug (it is not).?? I found that calling oops_do and nmethods_do the >>>> ServiceThread? 
needs to hold the Service_lock, because other >>>> threads can be adding things to the global queue while the sweeper >>>> thread is calling this in a handshake. >>>> >>>> I am now retesting this change with the changes above, and with the >>>> Service_lock.?? So far my stress tests for JDK-81212160 and the >>>> stress test for this bug pass, but I'm going to run through all the >>>> tiers 1-6 over the weekend. >>>> >>>> Please have a look at the changes in the meantime. >>>> >>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev >>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev >>>> >>>> Thanks, >>>> Coleen >>>>> >>>>> Dan >>>>> >>>>>> >>>>>> Ran the test that failed 100 times without failure, tier1 on >>>>>> Oracle supported platforms, and tier2-3 including jvmti and jdi >>>>>> tests locally. >>>>>> >>>>>> See bug for more details about the crash. >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Hi Serguei, >>>>>>> >>>>>>> Sorry for not sending an update.? I talked to Erik and am >>>>>>> working on a version that keeps the nmethod from being unloaded >>>>>>> while it's in the deferred event queue, with a version that the >>>>>>> GC people will like, and I like.? I'm testing it out now. >>>>>>> >>>>>>> Thanks! >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> Sorry for the latency, I had to investigate it a little bit. >>>>>>>> I still have some doubt your fix is right thing to do. 
>>>>>>>> >>>>>>>> >>>>>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, I've been working on answers to these questions, so I'll >>>>>>>>>>> start with this one. >>>>>>>>>>> >>>>>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed >>>>>>>>>>> (made_zombie or memory released) by the sweeper, but the >>>>>>>>>>> nmethod could be unloaded.? Unloading the nmethod clears the >>>>>>>>>>> Method* _method field. >>>>>>>>>> >>>>>>>>>> Yes, I see it is done in the nmethod::make_unloaded(). >>>>>>>>>> >>>>>>>>>>> The post_compiled_method_load event needs the _method field >>>>>>>>>>> to look at things like inlining and ScopeDesc fields.?? If >>>>>>>>>>> the nmethod is unloaded, some of the oops are dead.? There >>>>>>>>>>> are "holder" oops that correspond to the metadata in the >>>>>>>>>>> nmethod.? If these oops are dead, causing the nmethod to get >>>>>>>>>>> unloaded, then the metadata may not be valid. >>>>>>>>>>> >>>>>>>>>>> So my change 02 looks for a NULL nmethod._method field to >>>>>>>>>>> tell whether we can post information about the nmethod. >>>>>>>>>>> >>>>>>>>>>> There's code in nmethod.cpp like: >>>>>>>>>>> >>>>>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() { >>>>>>>>>>> ? if (_jmethod_id == NULL) { >>>>>>>>>>> ??? // Cache the jmethod_id since it can no longer be looked >>>>>>>>>>> up once the >>>>>>>>>>> ??? // method itself has been marked for unloading. >>>>>>>>>>> ??? _jmethod_id = method()->jmethod_id(); >>>>>>>>>>> ? } >>>>>>>>>>> ? return _jmethod_id; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Which was added when post_method_load and unload were turned >>>>>>>>>>> into deferred events. 
>>>>>>>>>> >>>>>>>>>> Could we cache the jmethodID in the >>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event >>>>>>>>>> similarly as we do in the >>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event? >>>>>>>>>> This would help to get rid of the dependency on the >>>>>>>>>> nmethod::_method. >>>>>>>>>> Do we depend on any other nmethod fields? >>>>>>>>> >>>>>>>>> Yes, there are other nmethod metadata that we rely on to print >>>>>>>>> inline information, and this function >>>>>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it >>>>>>>>> uses the ScopeDesc data in the nmethod. >>>>>>>> >>>>>>>> One possible approach is to prepare and cache all this information >>>>>>>> in the nmethod::post_compiled_method_load_event() before the >>>>>>>> JvmtiDeferredEvent::compiled_method_load_event() is called. >>>>>>>> The event parameters are: >>>>>>>> typedef struct { >>>>>>>> const void* start_address; >>>>>>>> jlocation location; >>>>>>>> } jvmtiAddrLocationMap; >>>>>>>> CompiledMethodLoad(jvmtiEnv *jvmti_env, >>>>>>>> jmethodID method, >>>>>>>> jint code_size, >>>>>>>> const void* code_addr, >>>>>>>> jint map_length, >>>>>>>> const jvmtiAddrLocationMap* map, >>>>>>>> const void* compile_info) >>>>>>>> Some of these addresses above could be not accessible when an >>>>>>>> event is posted. >>>>>>>> Not sure yet if it is Okay. >>>>>>>> The question is if this kind of refactoring is worth and right >>>>>>>> thing to do. >>>>>>>> >>>>>>>>> >>>>>>>>> We do cache the jmethodID but that's not good enough.? See my >>>>>>>>> last comment in the bug report. The jmethodID can point to an >>>>>>>>> unloaded method. >>>>>>>> >>>>>>>> This looks like it is done a little bit late. >>>>>>>> It'd better to do it before the event is deferred (see above). >>>>>>>> >>>>>>>>> I tried a version of keeping the nmethod alive, but the GC >>>>>>>>> folks will hate it.? And it doesn't work and I hate it. 
>>>>>>>> >>>>>>>> From serviceability point of view this is the best and most >>>>>>>> consistent approach. >>>>>>>> I seems to me, it was initially designed this way. >>>>>>>> The downside is it adds some extra complexity to the GC. >>>>>>>> >>>>>>>>> My version 01 is the best, with the caveat that maybe it >>>>>>>>> should check for _method == NULL instead of >>>>>>>>> nmethod->is_alive().? I have to talk to Erik to see if there's >>>>>>>>> a race with concurrent class unloading. >>>>>>>>> >>>>>>>>> Any application that depends on a compiled method loading >>>>>>>>> event on a class that could be unloaded is a buggy >>>>>>>>> application.? Applications should not rely on when the JIT >>>>>>>>> compiler decides to compile a method!? This happens to us for >>>>>>>>> a stress test. Most applications will get most of their >>>>>>>>> compiled method loading events as they normally do. >>>>>>>> >>>>>>>> It is not an application that relies on the compiled method >>>>>>>> loading event. >>>>>>>> It is about profiling tools to be able to get correct >>>>>>>> information about what is going on with compilations. >>>>>>>> My concern is that if we skip such compiled method load events >>>>>>>> then profilers have no way >>>>>>>> to find out there many unneeded compilations that are thrown >>>>>>>> away without any real use. >>>>>>>> Also, it is not clear what happens with the subsequent compiled >>>>>>>> method unload events. >>>>>>>> Are they going to be skipped as well or they can appear and >>>>>>>> confuse profilers? >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Serguei >>>>>>>>>> >>>>>>>>>>> I put more debugging in the bug to show this crash was from >>>>>>>>>>> an unloaded nmethod. 
>>>>>>>>>>> >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> I have some questions. >>>>>>>>>>>> >>>>>>>>>>>> Both the compiler method load and unload are posted as >>>>>>>>>>>> deferred events. >>>>>>>>>>>> Both events keep the nmethod alive until the ServiceThread >>>>>>>>>>>> processes the event. >>>>>>>>>>>> >>>>>>>>>>>> The implementation is: >>>>>>>>>>>> >>>>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) { >>>>>>>>>>>> ? . . . >>>>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>>>> ? // this deferred event. >>>>>>>>>>>> ? nmethodLocker::lock_nmethod(nm); >>>>>>>>>>>> ? return event; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> JvmtiDeferredEvent >>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* >>>>>>>>>>>> nm, jmethodID id, const void* code) { >>>>>>>>>>>> ? . . . >>>>>>>>>>>> ? // Keep the nmethod alive until the ServiceThread can process >>>>>>>>>>>> ? // this deferred event. This will keep the memory for the >>>>>>>>>>>> ? // generated code from being reused too early. We pass >>>>>>>>>>>> ? // zombie_ok == true here so that our nmethod that was just >>>>>>>>>>>> ? // made into a zombie can be locked. >>>>>>>>>>>> ? nmethodLocker::lock_nmethod(nm, true /* zombie_ok */); >>>>>>>>>>>> ? return event; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> void JvmtiDeferredEvent::post() { >>>>>>>>>>>> assert(ServiceThread::is_service_thread(Thread::current()), >>>>>>>>>>>> ???????? "Service thread must post enqueued events"); >>>>>>>>>>>> ? switch(_type) { >>>>>>>>>>>> ??? case TYPE_COMPILED_METHOD_LOAD: { >>>>>>>>>>>> ????? nmethod* nm = _event_data.compiled_method_load; >>>>>>>>>>>> JvmtiExport::post_compiled_method_load(nm); >>>>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>>>> ????? 
nmethodLocker::unlock_nmethod(nm); >>>>>>>>>>>> ????? break; >>>>>>>>>>>> ??? } >>>>>>>>>>>> ??? case TYPE_COMPILED_METHOD_UNLOAD: { >>>>>>>>>>>> ????? nmethod* nm = _event_data.compiled_method_unload.nm; >>>>>>>>>>>> JvmtiExport::post_compiled_method_unload( >>>>>>>>>>>> _event_data.compiled_method_unload.method_id, >>>>>>>>>>>> _event_data.compiled_method_unload.code_begin); >>>>>>>>>>>> ????? // done with the deferred event so unlock the nmethod >>>>>>>>>>>> ????? nmethodLocker::unlock_nmethod(nm); >>>>>>>>>>>> ????? break; >>>>>>>>>>>> ??? } >>>>>>>>>>>> ??? . . . >>>>>>>>>>>> ? } >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> Then I wonder how is it possible for the nmethod to be not >>>>>>>>>>>> alive here?: >>>>>>>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) { >>>>>>>>>>>> . . . >>>>>>>>>>>> 2173 // It's not safe to look at metadata for unloaded methods. >>>>>>>>>>>> 2174 if (!nm->is_alive()) { >>>>>>>>>>>> 2175 return; >>>>>>>>>>>> 2176 } >>>>>>>>>>>> At least, it lokks like something else is broken. >>>>>>>>>>>> Do I miss something important here? >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Serguei >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: Don't post information which uses metadata from >>>>>>>>>>>>> unloaded nmethods >>>>>>>>>>>>> >>>>>>>>>>>>> Tested tier1-3 and 100 times with test that failed >>>>>>>>>>>>> (reproduced failure without the fix). >>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361 >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ralf.schmelter at sap.com Mon Nov 25 14:41:03 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 25 Nov 2019 14:41:03 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
Message-ID:

Hello,

this change removes the need to seek in the hprof file when creating a heap dump, which makes it possible to stream the dump. This enables us to dump to a socket or to gzip the dump directly.

Instead of fixing up the size of a heap dump segment in the written file, the size is now either fixed up in the buffer or, for entries too big to fit into the buffer fully, the entry gets its own segment with no need to fix up the segment size later.

To do this, we now need to know how large a heap dump segment entry is when starting to write the entry. This is either trivial (for the roots) or already known (for the instance and array dump entries). Only the class entry needed a little more code to track the size.

The change results in more heap dump segments in the written heap dump. But since the overhead per segment is 9 bytes, even for the smallest buffer used (64K) the overhead is less than 0.02%.

Additionally, the heap dump now expects to be able to allocate at least 64K for the buffer. The old code tried to run even with a buffer of 1 byte or no buffer at all.
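The buffering scheme described above can be sketched roughly as follows. This is only an illustration, not the code in the webrev: the class name `SegmentWriter`, the `std::vector` standing in for a socket or gzip stream, the tag value 0x1C and the zeroed timestamp are all assumptions. It shows the two cases, a segment whose length is patched inside the buffer before it is written out, and an oversized entry that gets its own segment with the length known up front, so that the output is append-only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; names and layout are illustrative, not HotSpot's.
constexpr std::size_t kHeaderSize = 9;      // u1 tag + u4 timestamp + u4 length
constexpr std::uint8_t kSegmentTag = 0x1C;  // assumed heap dump segment tag

class SegmentWriter {
 public:
  SegmentWriter(std::size_t capacity, std::vector<std::uint8_t>& sink)
      : capacity_(capacity), sink_(sink) {
    assert(capacity_ > kHeaderSize);
  }

  // Append one entry whose size is known before writing starts.
  void write_entry(const std::vector<std::uint8_t>& entry) {
    if (!buffer_.empty() && buffer_.size() + entry.size() > capacity_) {
      flush();  // current segment is full: fix up its length and emit it
    }
    if (kHeaderSize + entry.size() > capacity_) {
      // Too big to buffer: the entry gets its own segment, and the length
      // is written up front, so nothing needs fixing up later.
      put_header(sink_, static_cast<std::uint32_t>(entry.size()));
      sink_.insert(sink_.end(), entry.begin(), entry.end());
      ++segments_;
      return;
    }
    if (buffer_.empty()) {
      put_header(buffer_, 0);  // placeholder length, patched in flush()
    }
    buffer_.insert(buffer_.end(), entry.begin(), entry.end());
  }

  // Patch the segment length inside the buffer, then append it to the sink.
  void flush() {
    if (buffer_.empty()) return;
    put_u4(buffer_, 5, static_cast<std::uint32_t>(buffer_.size() - kHeaderSize));
    sink_.insert(sink_.end(), buffer_.begin(), buffer_.end());
    buffer_.clear();
    ++segments_;
  }

  std::size_t segments() const { return segments_; }

 private:
  // Store a 32-bit value in big-endian byte order at the given offset.
  static void put_u4(std::vector<std::uint8_t>& out, std::size_t at, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
      out[at + i] = static_cast<std::uint8_t>(v >> (24 - 8 * i));
    }
  }
  // Append a 9-byte segment header; the timestamp is left as zero.
  static void put_header(std::vector<std::uint8_t>& out, std::uint32_t length) {
    const std::size_t start = out.size();
    out.resize(start + kHeaderSize, 0);
    out[start] = kSegmentTag;
    put_u4(out, start + 5, length);
  }

  std::size_t capacity_;
  std::vector<std::uint8_t>& sink_;
  std::vector<std::uint8_t> buffer_;
  std::size_t segments_ = 0;
};

// Header overhead in percent when every segment fills the buffer exactly.
double header_overhead_percent(std::size_t buffer_size) {
  return 100.0 * static_cast<double>(kHeaderSize) / static_cast<double>(buffer_size);
}
```

With a 64K buffer, `header_overhead_percent(64 * 1024)` comes to 9 / 65536, about 0.014%, which is consistent with the "less than 0.02%" figure quoted above.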
Bugreport: https://bugs.openjdk.java.net/browse/JDK-8234510 Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.0/ Best regards, Ralf From erik.gahlin at oracle.com Mon Nov 25 16:27:46 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Mon, 25 Nov 2019 17:27:46 +0100 Subject: 8233197(S): Invert JvmtiExport::post_vm_initialized() and Jfr:on_vm_start() start-up order for correct option parsing In-Reply-To: <62407a3d-f6a2-400b-9311-9ab7e32d85f7@default> References: <1ca7ae34-41fe-fad1-4bd2-57cdf9667bd9@oracle.com> <62407a3d-f6a2-400b-9311-9ab7e32d85f7@default> Message-ID: <8da1145a-5e65-8db8-8a11-9bce1af22233@oracle.com> Looks good. Erik On 2019-11-20 21:54, Markus Gronlund wrote: > > "It does not look as a good idea to change the JVMTI phase like above. > > ? If you need the ONLOAD phase just to enable capabilities then it is > better to do it in the real ONLOAD phase. > > ? Do I miss anything important here? > > ? Please, ask questions if you have any problems with it." > > Yes, so the reason for the phase transition is not so much to do with > capabilities, but that an agent can only register, i.e. call GetEnv(), > in phases JVMTI_PHASE_ONLOAD and JVMTI_PHASE_LIVE. > > create_vm_init_agents() is where the (temporary) > JVMTI_PHASE_PRIMORDIAL to JVMTI_PHASE_ONLOAD happens during the > callouts to Agent_OnLoad(), and then the state is returned to > JVMTI_PHASE_PRIMORDIAL. It is hard to find an unconditional hook point > there since create_vm_init_agents() is made conditional on > Arguments::init_agents_at_startup(), with a listing populated from > "real agents" (on command-line). > > The JFR JVMTI agent itself is also conditional, installed only if JFR > is actively started (i.e. a starting a recording). Hence, the phase > transition mechanism merely replicates the state changes in > create_vm_init_agents() to have the agent register properly. This is a > moot point now however as I have taken another pass. 
I now found a way > to only have the agent register during the JVMTI_PHASE_LIVE phase, so > the phase transition mechanism is not needed. > > "The Jfr::on_vm_init() is confusing as there is a mismatch with the > JVMTI phases order. > > ? It fills like it means JFR init event (not VM init) or something > like this. > > ? Or maybe it denotes the VM initialization start. :) > > ? I'll be happy if you could explain it a little bit." > > Yes, this is confusing, I agree. Of course, JFR has a tight relation > to the JVMTI phases, but only in so far as to coordinate agent > registration. The JFR calls are not intended to reflect the JVMTI > phases per se but a more general initialization order state > description, like you say "VM initialization start and completion". > However, it is very hard to encode proper semantics into the JFR calls > in Threads::create_vm() to reflect the concepts of "stages"; they are > simply not well-defined. In addition, there are so many of them J. For > example, I always get confused that VM initialization is reflected in > JVMTI by the VMStart event and the completion by the VMInit event > (representing VM initialization complete). At the same time, the > DTRACE macros have both HOTSPOT_VM_INIT_BEGIN() HOTSPOT_VM_INIT_END() > placed before both... > > I abandoned the attempt to encode anything meaningful into the JFR > calls trying to represent a certain "VM initialization stage". > > Instead, I will just have syntactic JFR calls reflecting some relative > order (on_create_vm_1(), on_create_vm_2(),.. _3()) etc. Looks like > there are precedents of this style. > > ?Not sure, if your agent needs to enable these capabilities > (introduced in JDK 9 with modules): > ? can_generate_early_vmstart > ? can_generate_early_class_hook_events? > > Thanks for the suggestion Serguei, but these capabilities are not yet > needed. 
> > Here is the updated webrev: > http://cr.openjdk.java.net/~mgronlun/8233197/webrev02/ > > Thanks again > > Markus > > *From:*Serguei Spitsyn > *Sent:* den 20 november 2019 04:10 > *To:* Markus Gronlund ; hotspot-jfr-dev > ; > hotspot-runtime-dev at openjdk.java.net; serviceability-dev at openjdk.java.net > *Subject:* Re: 8233197(S): Invert JvmtiExport::post_vm_initialized() > and Jfr:on_vm_start() start-up order for correct option parsing > > Hi Marcus, > > It looks good in general. > > A couple of comments though. > > http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp.frames.html > > 258 class JvmtiPhaseTransition { > 259? private: > 260?? bool _transition; > 261? public: > 262?? JvmtiPhaseTransition() : _transition(JvmtiEnvBase::get_phase() > == JVMTI_PHASE_PRIMORDIAL) { > 263???? if (_transition) { > 264?????? JvmtiEnvBase::set_phase(JVMTI_PHASE_ONLOAD); > 265???? } > 266?? } > 267?? ~JvmtiPhaseTransition() { > 268???? if (_transition) { > 269?????? assert(JvmtiEnvBase::get_phase() == JVMTI_PHASE_ONLOAD, > "invariant"); > 270?????? JvmtiEnvBase::set_phase(JVMTI_PHASE_PRIMORDIAL); > 271???? } > 272?? } > 273 }; > 274 > ?275 static bool initialize() { > 276?? JavaThread* const jt = current_java_thread(); > 277?? assert(jt != NULL, "invariant"); > 278?? assert(jt->thread_state() == _thread_in_vm, "invariant"); > 279?? DEBUG_ONLY(JfrJavaSupport::check_java_thread_in_vm(jt)); > *280?? JvmtiPhaseTransition jvmti_phase_transition;* > 281?? ThreadToNativeFromVM transition(jt); > 282?? if (create_jvmti_env(jt) != JNI_OK) { > 283???? assert(jfr_jvmti_env == NULL, "invariant"); > 284???? return false; > 285?? } > 286?? assert(jfr_jvmti_env != NULL, "invariant"); > 287?? if (!register_capabilities(jt)) { > 288???? return false; > 289?? } > 290?? if (!register_callbacks(jt)) { > 291???? return false; > 292?? } > 293?? 
return update_class_file_load_hook_event(JVMTI_ENABLE); > 294 } > > > It does not look as a good idea to change the JVMTI phase like above. > If you need the ONLOAD phase just to enable capabilities then it is > better to do it in the real ONLOAD phase. > Do I miss anything important here? > Please, ask questions if you have any problems with it. > > The Jfr::on_vm_init() is confusing as there is a mismatch with the > JVMTI phases order. > It fills like it means JFR init event (not VM init) or something like > this. > Or maybe it denotes the VM initialization start. :) > I'll be happy if you could explain it a little bit. > > Not sure, if your agent needs to enable these capabilities (introduced > in JDK 9 with modules): > ? can_generate_early_vmstart > ? can_generate_early_class_hook_events > > Thanks, > Serguei > > > On 11/19/19 06:38, Markus Gronlund wrote: > > Greetings, > > (apologies for the wide distribution) > > Kindly asking for reviews for the following changeset: > > Bug:https://bugs.openjdk.java.net/browse/JDK-8233197 > > Webrev:http://cr.openjdk.java.net/~mgronlun/8233197/webrev01/ > > Testing: serviceability/jvmti, jdk_jfr, tier1-5 > > Summary: please see bug for description. > > For Runtime / Serviceability folks: > > This change slightly modifies the relative order in Threads::create_vm(); please see threads.cpp. > > There is an upcall as part of Jfr::on_vm_start() that delivers global JFR command-line options to Java (only if set). > > The behavioral change amounts to a few classes loaded as part of establishing this upcall (all internal JFR classes and/or java.base classes, loaded by the bootloader) no longer being visible to the ClassFileLoadHook's of agents. These classes are visible to agents that work with "early_start" JVMTI environments however. > > The major part of JFR startup with associated class loading still happens as part of Jfr::on_vm_live() with no behavioral change in relation to agents. 
> > Thank you > > Markus > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry.cable at oracle.com Mon Nov 25 17:10:37 2019 From: larry.cable at oracle.com (Laurence Cable) Date: Mon, 25 Nov 2019 09:10:37 -0800 (PST) Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump In-Reply-To: References: Message-ID: <866ba7da-c16f-223d-0fc4-64b7ab69f831@oracle.com> What (if any) is the compatibility impact of this change on tools consuming the heap dump format? Thanks - Larry On 11/25/19 6:41 AM, Schmelter, Ralf wrote: > Hello, > > this change removes the need to use seek on the hprof file when creating a heap dump, thus making it possible to stream the dump. This enables us to dump to a socket or directly gzip the dump. > > Instead of fixing the heap dump segments size on the written file, the size of the heap dump segments is either fixed up in the buffer instead or, for entries to big to fit into the buffer fully, the entry get its own segment with no need to fix up the segment size later. > > To do this, we now need to know how large an heap dump segment entry is when starting to write the entry. This is either trivial (for the roots) or already known (for the instance and array dump entries). Just the class entry needed a little more code to track the size. > > The change results in more heap dump segments in the written heap dump. But since the overhead per segment is 9 bytes, even for the smallest used buffer (64K) the overhead is less than 0.02%. Additionally the heap dump now expects to be able to allocate at least 64k for the buffer. The old code tried to run even with a buffer of 1 byte or no buffer at all. 
> > Bugreport: https://bugs.openjdk.java.net/browse/JDK-8234510 > Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.0/ > > Best regards, > Ralf From chris.plummer at oracle.com Mon Nov 25 20:41:17 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 25 Nov 2019 12:41:17 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: References: Message-ID: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Mon Nov 25 22:12:07 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 25 Nov 2019 14:12:07 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> References: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> Message-ID: +1 The only nit: 87 jvmtiError err = jvmti->SetEventNotificationMode(mode, event_type, event_thread); 88 if (err != JVMTI_ERROR_NONE) { 89 printf("Failed to disable % event: %s (%d)\n", 90 event_name, TranslateError(err), err); 91 result = STATUS_FAILED; 92 } This func is used to both enable/disable, but logging always report "Failed to disable". --alex On 11/25/2019 12:41, Chris Plummer wrote: > Hi Serguei, > > It looks like before your fix, runs were normally just a few seconds, > but there occasionally took 1 to 15 minutes, some of which result in a > timeout. I looked at some of your recent results and it looks like now > they are always just a few seconds, so that's a good sign that you > addressed the timeout issue. > > Changes look good. > > thanks, > > Chris > > On 11/25/19 1:38 AM, serguei.spitsyn at oracle.com wrote: >> Please, review a fix for test bug: >> https://bugs.openjdk.java.net/browse/JDK-8221372 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8221372-jvmti-thread-state.1/ >> >> >> Summary: >> ? 
The test timeouts always happen with the JFR recording and mostly on >> windows. >> ? I was not able to reproduce this with mach5 100 runs though. >> ? However, I think the issue is with the MethodEnter/MethodExit events >> that are set globally. >> ? It is not only ~20 times slower but also impacts all JFR methods >> working in background. >> >> ? The fix includes the following changes: >> ?? - the MethodEnter/MethodExit events are removed >> ?? - the code is refactored to implement repeating fragments as functions >> ?? - minimal tracing is added to help with analysis of timeouts if >> they remain >> >> >> Testing: >> ? Tested the? vmTestbase/nsk/jvmti/GetThreadState/thrstat001 test with >> mach5 100 runs. >> >> >> Thanks, >> Serguei > From serguei.spitsyn at oracle.com Mon Nov 25 23:44:33 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 25 Nov 2019 15:44:33 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: References: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> Message-ID: Hi Alex, Thank you for review and the comment! I saw this bug forgot to fix. Will fix now. Thanks, Serguei On 11/25/19 2:12 PM, Alex Menkov wrote: > +1 > > The only nit: > > 87???? jvmtiError err = jvmti->SetEventNotificationMode(mode, > event_type, event_thread); > ? 88???? if (err != JVMTI_ERROR_NONE) { > ? 89???????? printf("Failed to disable % event: %s (%d)\n", > ? 90??????????????? event_name, TranslateError(err), err); > ? 91???????? result = STATUS_FAILED; > ? 92???? } > > This func is used to both enable/disable, but logging always report > "Failed to disable". > > --alex > > On 11/25/2019 12:41, Chris Plummer wrote: >> Hi Serguei, >> >> It looks like before your fix, runs were normally just a few seconds, >> but there occasionally took 1 to 15 minutes, some of which result in >> a timeout. 
I looked at some of your recent results and it looks like >> now they are always just a few seconds, so that's a good sign that >> you addressed the timeout issue. >> >> Changes look good. >> >> thanks, >> >> Chris >> >> On 11/25/19 1:38 AM, serguei.spitsyn at oracle.com wrote: >>> Please, review a fix for test bug: >>> https://bugs.openjdk.java.net/browse/JDK-8221372 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8221372-jvmti-thread-state.1/ >>> >>> >>> >>> Summary: >>> ? The test timeouts always happen with the JFR recording and mostly >>> on windows. >>> ? I was not able to reproduce this with mach5 100 runs though. >>> ? However, I think the issue is with the MethodEnter/MethodExit >>> events that are set globally. >>> ? It is not only ~20 times slower but also impacts all JFR methods >>> working in background. >>> >>> ? The fix includes the following changes: >>> ?? - the MethodEnter/MethodExit events are removed >>> ?? - the code is refactored to implement repeating fragments as >>> functions >>> ?? - minimal tracing is added to help with analysis of timeouts if >>> they remain >>> >>> >>> Testing: >>> ? Tested the? vmTestbase/nsk/jvmti/GetThreadState/thrstat001 test >>> with mach5 100 runs. >>> >>> >>> Thanks, >>> Serguei >> From serguei.spitsyn at oracle.com Mon Nov 25 23:47:18 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 25 Nov 2019 15:47:18 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> References: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> Message-ID: Hi Chris, Thank you for looking at it! May I count you as a reviewer? My plan is to submit another mach5 100-times run before the push. 
Thanks, Serguei On 11/25/19 12:41 PM, Chris Plummer wrote: > Hi Serguei, > > It looks like before your fix, runs were normally just a few seconds, > but there occasionally took 1 to 15 minutes, some of which result in a > timeout. I looked at some of your recent results and it looks like now > they are always just a few seconds, so that's a good sign that you > addressed the timeout issue. > > Changes look good. > > thanks, > > Chris > > On 11/25/19 1:38 AM, serguei.spitsyn at oracle.com wrote: >> Please, review a fix for test bug: >> https://bugs.openjdk.java.net/browse/JDK-8221372 >> >> Webrev: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8221372-jvmti-thread-state.1/ >> >> >> Summary: >> ? The test timeouts always happen with the JFR recording and mostly >> on windows. >> ? I was not able to reproduce this with mach5 100 runs though. >> ? However, I think the issue is with the MethodEnter/MethodExit >> events that are set globally. >> ? It is not only ~20 times slower but also impacts all JFR methods >> working in background. >> >> ? The fix includes the following changes: >> ?? - the MethodEnter/MethodExit events are removed >> ?? - the code is refactored to implement repeating fragments as functions >> ?? - minimal tracing is added to help with analysis of timeouts if >> they remain >> >> >> Testing: >> ? Tested the? vmTestbase/nsk/jvmti/GetThreadState/thrstat001 test >> with mach5 100 runs. >> >> >> Thanks, >> Serguei > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Nov 26 00:47:56 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 25 Nov 2019 16:47:56 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: References: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> Message-ID: <5cec9b49-3fa3-410d-a86e-4427cb99867e@oracle.com> An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Tue Nov 26 01:47:55 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 25 Nov 2019 17:47:55 -0800 Subject: RFR(S): 8221372: Test vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java times out In-Reply-To: <5cec9b49-3fa3-410d-a86e-4427cb99867e@oracle.com> References: <7508e51e-3dab-337e-0f92-13a4676810e8@oracle.com> <5cec9b49-3fa3-410d-a86e-4427cb99867e@oracle.com> Message-ID: <44c1f819-6b64-6dd8-033b-7f15413e4d82@oracle.com> Thanks, Chris! Serguei On 11/25/19 4:47 PM, Chris Plummer wrote: > Yes > > On 11/25/19 3:47 PM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> Thank you for looking at it! >> May I count you as a reviewer? >> My plan is to submit another mach5 100-times run before the push. >> >> Thanks, >> Serguei >> >> On 11/25/19 12:41 PM, Chris Plummer wrote: >>> Hi Serguei, >>> >>> It looks like before your fix, runs were normally just a few >>> seconds, but there occasionally took 1 to 15 minutes, some of which >>> result in a timeout. I looked at some of your recent results and it >>> looks like now they are always just a few seconds, so that's a good >>> sign that you addressed the timeout issue. >>> >>> Changes look good. >>> >>> thanks, >>> >>> Chris >>> >>> On 11/25/19 1:38 AM, serguei.spitsyn at oracle.com wrote: >>>> Please, review a fix for test bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8221372 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2019/8221372-jvmti-thread-state.1/ >>>> >>>> >>>> Summary: >>>> ? The test timeouts always happen with the JFR recording and mostly >>>> on windows. >>>> ? I was not able to reproduce this with mach5 100 runs though. >>>> ? However, I think the issue is with the MethodEnter/MethodExit >>>> events that are set globally. >>>> ? It is not only ~20 times slower but also impacts all JFR methods >>>> working in background. >>>> >>>> ? The fix includes the following changes: >>>> ?? 
- the MethodEnter/MethodExit events are removed
>>>>    - the code is refactored to implement repeating fragments as
>>>> functions
>>>>    - minimal tracing is added to help with analysis of timeouts if
>>>> they remain
>>>>
>>>>
>>>> Testing:
>>>>   Tested the vmTestbase/nsk/jvmti/GetThreadState/thrstat001 test
>>>> with mach5 100 runs.
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>
>>
>

From ralf.schmelter at sap.com Tue Nov 26 09:30:37 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 26 Nov 2019 09:30:37 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
In-Reply-To: <866ba7da-c16f-223d-0fc4-64b7ab69f831@oracle.com>
References: <866ba7da-c16f-223d-0fc4-64b7ab69f831@oracle.com>
Message-ID:

Hi Larry,

there should be no compatibility impact. The hprof format stayed the same; the heap dump segments we write are just smaller on average and more frequent.

I tested the created heap dumps with the jtreg test (the former jhat code), the Eclipse Memory Analyzer, HeapHero (an online heap analyzer) and VisualVM. All without problems.

Best regards,
Ralf

-----Original Message-----
From: Laurence Cable
Sent: Monday, 25 November 2019 18:11
To: Schmelter, Ralf ; OpenJDK Serviceability ; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump

What (if any) is the compatibility impact of this change on tools consuming the heap dump format?

Thanks

- Larry

From coleen.phillimore at oracle.com Tue Nov 26 14:22:07 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 26 Nov 2019 09:22:07 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
Message-ID: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>

Summary: Add local deferred event list to thread to post events outside CodeCache_lock.
This patch builds on the patch for JDK-8173361. With this patch, I made the JvmtiDeferredEventQueue an instance class (not AllStatic) and have one per thread. The CodeBlob event code, which used to drop the CodeCache_lock and race with the sweeper thread, now adds the events it wants to post to its thread-local list and processes the list outside the lock. The list is walked in GC and by the sweeper to keep the nmethods from being unloaded and zombied, respectively.

Also, the jmethod_id field in nmethod was only used as a boolean, so the patch doesn't create a jmethod_id until it is needed for post_compiled_method_unload.

Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in the original bug report.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8212160

Thanks,
Coleen

From larry.cable at oracle.com Tue Nov 26 16:39:05 2019
From: larry.cable at oracle.com (Laurence Cable)
Date: Tue, 26 Nov 2019 08:39:05 -0800
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
In-Reply-To:
References: <866ba7da-c16f-223d-0fc4-64b7ab69f831@oracle.com>
Message-ID: <7b5ee1a9-6ed0-f897-9646-a2f6ee5e2742@oracle.com>

COOL! thx

- Larry

On 11/26/19 1:30 AM, Schmelter, Ralf wrote:
> Hi Larry,
>
> there should be no compatibility impact. The hprof format stayed the same, just the heap dump segments we write are smaller on average and more frequent.
>
> I tested the created heap dumps with the jtreg test (the former jhat code), memory analyzer from eclipse, heap hero (an online heap analyzer) and visual VM. All without problems.
>
> Best regards,
> Ralf
>
> -----Original Message-----
> From: Laurence Cable
> Sent: Monday, 25.
November 2019 18:11
> To: Schmelter, Ralf ; OpenJDK Serviceability ; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
>
> What (if any) is the compatibility impact of this change on tools
> consuming the heap dump format?
>
> Thanks
>
> - Larry

From david.holmes at oracle.com  Wed Nov 27 00:03:56 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 27 Nov 2019 10:03:56 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
Message-ID: <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>

(adding runtime as well)

Hi Coleen,

On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
> Summary: Add local deferred event list to thread to post events outside
> CodeCache_lock.
>
> This patch builds on the patch for JDK-8173361. With this patch, I made
> the JvmtiDeferredEventQueue an instance class (not AllStatic) and have
> one per thread. The CodeBlob event that used to drop the CodeCache_lock
> and raced with the sweeper thread, adds the events it wants to post to
> its thread local list, and processes it outside the lock. The list is
> walked in GC and by the sweeper to keep the nmethods from being unloaded
> and zombied, respectively.

Sorry I don't understand why we would want/need a deferred event queue for every JavaThread? Isn't this only relevant for non-JavaThreads that need to have the ServiceThread process the deferred event?

David

> Also, the jmethod_id field in nmethod was only used as a boolean so
> don't create a jmethod_id until needed for post_compiled_method_unload.
>
> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in
> the original bug report.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>
> Thanks,
> Coleen

From suenaga at oss.nttdata.com  Thu Nov 28 12:39:27 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 28 Nov 2019 21:39:27 +0900
Subject: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References:
Message-ID: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>

Hi,

I refactored LinuxAMD64CFrame.java. It works fine in the serviceability/sa tests and in all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
Could you review the new webrev?

  http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/

The diff from the previous webrev is here:

  http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b

Thanks,

Yasumasa

On 2019/11/25 14:08, Yasumasa Suenaga wrote:
> Hi all,
>
> Please review this change:
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>
>
> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
> for stack unwinding.
>
> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
> library (e.g. libc) might be compiled with this feature.
>
> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
> So it might be lack of stack frames.
>
> I guess JDK-8219201 is caused by same issue.
>
>
> Thanks,
>
> Yasumasa
>
>
> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

From matthias.baesken at sap.com  Thu Nov 28 16:21:21 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 28 Nov 2019 16:21:21 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
Message-ID:

Hello, please review this small patch.
It adds return value checking for calloc at the one place where it is missing.

Thanks, Matthias

Bug/webrev :

https://bugs.openjdk.java.net/browse/JDK-8234968

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/

From christoph.langer at sap.com  Fri Nov 29 05:28:01 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Fri, 29 Nov 2019 05:28:01 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID:

Hi Matthias,

you'll have to initialize *decodedLen = 0; before you return, otherwise there's a build error on Linux:

.../jdk/src/java.instrument/share/native/libinstrument/InvocationAdapter.c:866:17: error: 'len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
            int new_len = convertUft8ToPlatformString(path, len, platform, MAXPATHLEN);
                ^~~~~~~

Apart from that, the fix looks good and trivial.

Cheers
Christoph

From: serviceability-dev On Behalf Of Baesken, Matthias
Sent: Thursday, 28 November 2019 17:21
To: serviceability-dev at openjdk.java.net
Subject: [CAUTION] RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hello, please review this small patch .
It adds return value checking for calloc at one place where it is missing .

Thanks, Matthias

Bug/webrev :

https://bugs.openjdk.java.net/browse/JDK-8234968

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/

From thomas.stuefe at gmail.com  Fri Nov 29 06:29:52 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 29 Nov 2019 07:29:52 +0100
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID:

Hi Matthias,

I am not certain the callers are prepared to handle NULL.

This is used in a chain of TRANSFORM macro calls which AFAICS do not handle NULL; e.g.
, at 872, we pass the returned pointer to convertUft8ToPlatformString, which passes it on (on Windows) to MultiByteToWideChar, which does not handle NULL input.

So I wonder whether a clear error message with an exit would be better in this case. Otherwise we may get a crash just a few instructions later.

Cheers, Thomas

On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias wrote:

> Hello, please review this small patch .
>
> It adds return value checking for calloc at one place where it is missing .
>
> Thanks, Matthias
>
> Bug/webrev :
>
> https://bugs.openjdk.java.net/browse/JDK-8234968
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/
>

From matthias.baesken at sap.com  Fri Nov 29 07:17:55 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 29 Nov 2019 07:17:55 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID:

Hi Thomas, Christoph, thanks for the comments.
Of course the initialization of *decodedLen must be added.

If decodePath returns NULL, we would have tmp == NULL (in char* tmp = func;), assign tmp to res, and then hit jplis_assert, see:

#define TRANSFORM(res,func) {                   \
    char* tmp = func;                           \
    if (tmp != res) {                           \
        free(res);                              \
        res = tmp;                              \
    }                                           \
    jplis_assert((void*)res != (void*)NULL);    \
}
...
        TRANSFORM(path, decodePath(path,&len));

New webrev :

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.2/

Best regards, Matthias

From: Thomas Stüfe
Sent: Friday, 29 November 2019 07:30
To: Baesken, Matthias
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hi Matthias,

I am not certain the callers are prepared to handle NULL.

This is used in a chain of TRANSFORM macro calls which AFAICS do not handle NULL; e.g.
, at 872, we pass the returned pointer to convertUft8ToPlatformString which passes it on (on Windows) to MultiByteToWideChar, which does not handle NULL input.

So I wonder whether a clear error message with an exit would be better in this case. Otherwise we may get a crash just some instructions later.

Cheers, Thomas

On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias wrote:

Hello, please review this small patch .
It adds return value checking for calloc at one place where it is missing .

Thanks, Matthias

Bug/webrev :

https://bugs.openjdk.java.net/browse/JDK-8234968

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/

From Alan.Bateman at oracle.com  Fri Nov 29 07:19:58 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 29 Nov 2019 07:19:58 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>

On 29/11/2019 06:29, Thomas Stüfe wrote:
> Hi Matthias,
>
> I am not certain the callers are prepared to handle NULL.
>
> This is used in a chain of TRANSFORM macro calls which AFAICS do not
> handle NULL; e.g. , at 872, we pass the returned pointer to
> convertUft8ToPlatformString which passes it on (on Windows) to
> MultiByteToWideChar, which does not handle NULL input.
>
> So I wonder whether a clear error message with an exit would be better
> in this case. Otherwise we may get a crash just some instructions later.
>
Right, this needs a lot more analysis to see if it's even possible to continue. The main usage is VM startup where the -javaagent option specifies agents that have the Boot-Class-Path attribute. In that case it would not be unreasonable to abort the process; it's unlikely to get much further in startup if memory is exhausted.
The other possible context is where a tool agent is loaded into a running VM, in which case having the attach thread return with a pending exception might be okay (although the VM is likely to shut down anyway as the memory exhaustion will be detected/handled elsewhere).

-Alan

From thomas.stuefe at gmail.com  Fri Nov 29 07:32:56 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 29 Nov 2019 08:32:56 +0100
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>
References: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>
Message-ID:

Just read Matthias' reply:

We call jplis_assert() if allocation fails. Looking at

src/java.instrument/share/native/libinstrument/JPLISAssert.h

I see that these assertions seem to be turned on all the time:

#define JPLISASSERT_ENABLEASSERTIONS    (1)

which lands us in JPLISAssertCondition() (a possible improvement here is to evaluate the condition before the call):

#define jplis_assert(x)    JPLISAssertCondition((jboolean)(x), #x, THIS_FILE, __LINE__)

However, JPLISAssertCondition() is not an assert (the name is misleading) but just a printf():

void
JPLISAssertCondition(   jboolean        condition,
                        const char *    assertionText,
                        const char *    file,
                        int             line) {
    if ( !condition ) {
        fprintf(stderr, "*** java.lang.instrument ASSERTION FAILED ***: \"%s\" at %s line: %d\n",
                        assertionText,
                        file,
                        line);
    }
}

Maybe I am missing something, but I do not see an abort.

----

I think we should add an exit(2) or abort(2) to the assertion.

But I also think this is a different issue from what Matthias is trying to fix, so I am fine with Matthias' change.

Cheers, Thomas

On Fri, Nov 29, 2019 at 8:20 AM Alan Bateman wrote:

> On 29/11/2019 06:29, Thomas Stüfe wrote:
> > Hi Matthias,
> >
> > I am not certain the callers are prepared to handle NULL.
> >
> > This is used in a chain of TRANSFORM macro calls which AFAICS do not
> > handle NULL; e.g.
, at 872, we pass the returned pointer to
> > convertUft8ToPlatformString which passes it on (on Windows) to
> > MultiByteToWideChar, which does not handle NULL input.
> >
> > So I wonder whether a clear error message with an exit would be better
> > in this case. Otherwise we may get a crash just some instructions later.
> >
> Right, this needs a lot more analysis to see if it's even possible to
> continue. The main usage is VM startup where the -javaagent option
> specifies agents that have the Boot-Class-Path attribute. In that case
> it would not be unreasonable to abort the process, it's unlikely to get
> much startup in the startup if memory is exhausted. The other possible
> context is where a tool agent is loaded into a running VM, in which case
> have the attach thread return with a pending exception might be okay
> (although the VM is likely to shutdown anyway as the memory exhaustion
> will be detected/handled elsewhere).
>
> -Alan
>

From matthias.baesken at sap.com  Fri Nov 29 07:59:50 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 29 Nov 2019 07:59:50 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>
Message-ID:

Hi Thomas, jplis_assert(x) should probably be named jplis_warn(x) instead.
Additionally, the TRANSFORM macro could be enhanced with an abort() call (or abortJVM ?) or something similar.

Best regards, Matthias

Just read Matthias reply:

We call jplis_assert() if allocation fails.
Looking at

src/java.instrument/share/native/libinstrument/JPLISAssert.h

I see that these assertions seem to be turned on all the time:

#define JPLISASSERT_ENABLEASSERTIONS    (1)

and lands us in JPLISAssertCondition() (possible improvement here is to evaluate the condition before the call):

#define jplis_assert(x)    JPLISAssertCondition((jboolean)(x), #x, THIS_FILE, __LINE__)

However, JPLISAssertCondition() is not an assert - name is misleading - but just a printf():

void
JPLISAssertCondition(   jboolean        condition,
                        const char *    assertionText,
                        const char *    file,
                        int             line) {
    if ( !condition ) {
        fprintf(stderr, "*** java.lang.instrument ASSERTION FAILED ***: \"%s\" at %s line: %d\n",
                        assertionText,
                        file,
                        line);
    }
}

Maybe I miss something but I do not see an abort.

----

I think we should add an exit(2) or abort(2) to the assertion.

But I also think this is a different issue from what Matthias tries to fix, so I am fine with Matthias change.

Cheers, Thomas

On Fri, Nov 29, 2019 at 8:20 AM Alan Bateman wrote:

On 29/11/2019 06:29, Thomas Stüfe wrote:
> Hi Matthias,
>
> I am not certain the callers are prepared to handle NULL.
>
> This is used in a chain of TRANSFORM macro calls which AFAICS do not
> handle NULL; e.g. , at 872, we pass the returned pointer to
> convertUft8ToPlatformString which passes it on (on Windows) to
> MultiByteToWideChar, which does not handle NULL input.
>
> So I wonder whether a clear error message with an exit would be better
> in this case. Otherwise we may get a crash just some instructions later.
>
Right, this needs a lot more analysis to see if it's even possible to continue. The main usage is VM startup where the -javaagent option specifies agents that have the Boot-Class-Path attribute. In that case it would not be unreasonable to abort the process, it's unlikely to get much startup in the startup if memory is exhausted.
The other possible context is where a tool agent is loaded into a running VM, in which case have the attach thread return with a pending exception might be okay (although the VM is likely to shutdown anyway as the memory exhaustion will be detected/handled elsewhere).

-Alan

From thomas.stuefe at gmail.com  Fri Nov 29 08:08:35 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 29 Nov 2019 09:08:35 +0100
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>
Message-ID:

On Fri, Nov 29, 2019 at 8:59 AM Baesken, Matthias wrote:

> Hi Thomas, probably jplis_assert(x) should better be named
> jplis_warn(x) .
>

Yes :-)

> Additionally the TRANSFORM macro could be enhanced by an abort() call
> (or abortJVM ?) or something similar .
>

I think that makes sense, or one could even make jplis_assert abort() the program. But I leave this up to you; it could also be done in a different issue. The patch you posted looks fine to me as it is.

Cheers, Thomas

> Best regards, Matthias
>
> Just read Matthias reply:
>
> We call jplis_assert() if allocation fails.
Looking at
>
> src/java.instrument/share/native/libinstrument/JPLISAssert.h
>
> I see that these assertions seem to be turned on all the time:
>
> #define JPLISASSERT_ENABLEASSERTIONS    (1)
>
> and lands us in JPLISAssertCondition() (possible improvement here is to
> evaluate the condition before the call):
>
> #define jplis_assert(x)    JPLISAssertCondition((jboolean)(x), #x, THIS_FILE, __LINE__)
>
> However, JPLISAssertCondition() is not an assert - name is misleading -
> but just a printf():
>
> void
> JPLISAssertCondition(   jboolean        condition,
>                         const char *    assertionText,
>                         const char *    file,
>                         int             line) {
>     if ( !condition ) {
>         fprintf(stderr, "*** java.lang.instrument ASSERTION FAILED ***: \"%s\" at %s line: %d\n",
>                         assertionText,
>                         file,
>                         line);
>     }
> }
>
> Maybe I miss something but I do not see an abort.
>
> ----
>
> I think we should add an exit(2) or abort(2) to the assertion.
>
> But I also think this is a different issue from what Matthias tries to
> fix, so I am fine with Matthias change.
>
> Cheers, Thomas
>
> On Fri, Nov 29, 2019 at 8:20 AM Alan Bateman wrote:
>
> On 29/11/2019 06:29, Thomas Stüfe wrote:
> > Hi Matthias,
> >
> > I am not certain the callers are prepared to handle NULL.
> >
> > This is used in a chain of TRANSFORM macro calls which AFAICS do not
> > handle NULL; e.g. , at 872, we pass the returned pointer to
> > convertUft8ToPlatformString which passes it on (on Windows) to
> > MultiByteToWideChar, which does not handle NULL input.
> >
> > So I wonder whether a clear error message with an exit would be better
> > in this case. Otherwise we may get a crash just some instructions later.
> >
> Right, this needs a lot more analysis to see if it's even possible to
> continue. The main usage is VM startup where the -javaagent option
> specifies agents that have the Boot-Class-Path attribute.
In that case
> it would not be unreasonable to abort the process, it's unlikely to get
> much startup in the startup if memory is exhausted. The other possible
> context is where a tool agent is loaded into a running VM, in which case
> have the attach thread return with a pending exception might be okay
> (although the VM is likely to shutdown anyway as the memory exhaustion
> will be detected/handled elsewhere).
>
> -Alan
>

From Alan.Bateman at oracle.com  Fri Nov 29 10:41:08 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 29 Nov 2019 10:41:08 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References: <458cf032-4eb8-b804-62f2-b46e2487859c@oracle.com>
Message-ID: <5ca42824-37d8-44d5-7d49-8b270852d2e6@oracle.com>

On 29/11/2019 07:32, Thomas Stüfe wrote:
> Just read Matthias reply:
>
> We call jplis_assert() if allocation fails. Looking at
>
> src/java.instrument/share/native/libinstrument/JPLISAssert.h
>
> I see that these assertions seem to be turned on all the time:
>
> #define JPLISASSERT_ENABLEASSERTIONS    (1)
>
> and lands us in JPLISAssertCondition() (possible improvement here is
> to evaluate the condition before the call):
>
> #define jplis_assert(x)    JPLISAssertCondition((jboolean)(x), #x, THIS_FILE, __LINE__)
>
> However, JPLISAssertCondition() is not an assert - name is misleading
> - but just a printf():
>
> void
> JPLISAssertCondition(   jboolean        condition,
>                         const char *    assertionText,
>                         const char *    file,
>                         int             line) {
>     if ( !condition ) {
>         fprintf(stderr, "*** java.lang.instrument ASSERTION FAILED ***: \"%s\" at %s line: %d\n",
>                         assertionText,
>                         file,
>                         line);
>     }
> }
>
> Maybe I miss something but I do not see an abort.
>
There is technical debt that dates back to the original development of the JPLIS agent in JDK 5. If we run out of memory during VM startup, then a graceful abort seems right; it is pointless to try to continue. I don't know how far you want to go with the current patch, but it is a bit icky to continue (even with a warning) with the wrong configuration. The late-binding agent case is different, of course: I think the native method has to complete with a pending exception rather than aborting the VM.

-Alan
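
The two points raised in this thread (initialize the out-parameter before returning NULL from a failed allocation, and make the check that follows actually stop the process instead of only printing) can be sketched in C roughly as below. This is only an illustration under stated assumptions, not the libinstrument code: decode_copy, check_or_abort and CHECK_OR_ABORT are hypothetical names standing in for decodePath, JPLISAssertCondition and jplis_assert, and whether aborting is acceptable depends on the context discussed above (VM startup versus a late-binding agent).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for decodePath(): copies s into calloc'ed memory.
 * On allocation failure it sets *decodedLen to 0 before returning NULL,
 * so the caller never reads an uninitialized length. */
char *decode_copy(const char *s, int *decodedLen) {
    assert(s != NULL && decodedLen != NULL);
    size_t n = strlen(s);
    char *result = calloc(n + 1, 1);
    if (result == NULL) {
        *decodedLen = 0;   /* initialize the out-parameter on the error path */
        return NULL;
    }
    memcpy(result, s, n);  /* calloc already zero-terminated the buffer */
    *decodedLen = (int)n;
    return result;
}

/* Hypothetical fatal check: unlike a print-only assertion,
 * it reports the failed condition and then stops the process. */
void check_or_abort(int condition, const char *text,
                    const char *file, int line) {
    if (!condition) {
        fprintf(stderr, "*** fatal check failed: \"%s\" at %s line: %d\n",
                text, file, line);
        abort();
    }
}

/* Evaluates the condition once and forwards its text and location. */
#define CHECK_OR_ABORT(x) check_or_abort((x) ? 1 : 0, #x, __FILE__, __LINE__)
```

A caller on the startup path might write CHECK_OR_ABORT(path != NULL) right after the decode step; a caller on the attach path would instead test for NULL and raise a pending exception, which this sketch does not show.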