From ioi.lam at oracle.com Tue Sep 1 01:17:07 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 31 Aug 2020 18:17:07 -0700
Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp
In-Reply-To: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com>
References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com>
Message-ID: <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com>

On 8/31/20 4:05 PM, David Holmes wrote:
> Hi Ioi,
>
> I haven't looked at the code changes ...
>
> On 1/09/2020 4:13 am, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8252526
>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/
>>
>> (I marked this RFR as "M" because 63 files have changed, but most of
>> them are just adding a missing #include "prims/jvmtiExport.hpp").
>>
>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 times
>> (by 971 hotspot .o files). Most of these are unnecessarily included by
>> the following 3 popular header files:
>>
>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be moved
>>     to javaThreadStatus.hpp. I also converted the enum to a C++11
>>     enum class for better type safety. (see also JDK-8247938).
>>
>> [2] os.hpp: No need to include jvm.h. Use forward declaration
>>     "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead.
>
> That does not seem reasonable to me. It is one thing to do a simple
> forward declaration of a class but this is an internal detail of JVMTI
> which os.hpp has no business knowing about.
>

How about changing jvmti.h from:

    struct _jvmtiTimerInfo;
    typedef struct _jvmtiTimerInfo jvmtiTimerInfo;

to

    struct jvmtiTimerInfo;
    typedef struct jvmtiTimerInfo jvmtiTimerInfo;

Then os.hpp can declare:

    struct jvmtiTimerInfo;

I wonder why we use the _ prefix. It seems like an anachronism to me.
Do we still support C compilers that cannot handle
"typedef struct Foo Foo;"?

>> [3] thread.hpp: No need to include jvmtiExport.hpp. Use forward
>>     declaration for JvmtiSampledObjectAllocEventCollector and
>>     JvmtiVMObjectAllocEventCollector instead.
>>
>>
>> The total number of includes has been reduced from 252033 to 250001.
>> Build time of slow-debug libjvm.so is reduced from 2:07 to 2:04 on
>> my machine.
>
> I'm not sure why we really care. The build times are not that
> problematic that it warrants some of these header file hacks IMO.

My main goal is to speed up incremental builds. Currently we have more
than 200 header files that are included by more than half of the .o
files of hotspot. I would often get massive rebuilds after touching a
header file. Removing the unnecessary dependencies will help developer
productivity.

Also, currently we do a lot of build validations in tier5 and we had
consistently missed build problems with the minimal build. I suggested
moving those builds to tier1, but that was turned down because the
builds are not fast enough.

> Are those figures with or without PCH?
>

I tested on Linux without PCH.

Thanks
- Ioi

> David
> -----
>
>>
>> Thanks
>> - Ioi
>>

From david.holmes at oracle.com Tue Sep 1 01:18:36 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 1 Sep 2020 11:18:36 +1000
Subject: 8248337: sparc related code clean up after solaris removal
In-Reply-To:
References:
Message-ID:

Hi Yumin,

On 1/09/2020 7:32 am, Yumin Qi wrote:
> HI,
>
>   Please review for
>
>   bug: https://bugs.openjdk.java.net/browse/JDK-8248337
>
>   webrev: http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/
>
>
>   Summary: After Solaris supported files removed from repo, there are
> some remnants which needs cleaning up. Some comments are not correct,
> and some refer to wrong files.

Those changes are mostly okay but I have a few minor issues/suggestions
below.

> There is a flag seems only useful for
> Sparc: UseRDPCForConstantTableBase, which got removed in this patch .

Despite the description of the flag it is far from clear that the use
of the flag affects sparc only. It affects the pinned() function so
seems somewhat platform agnostic in that sense - which is why this was
not dealt with in the SPARC removal process. I think this needs closer
examination by the compiler folk, with a recommendation on whether it
can/should be changed or not. Regardless, as this is a product flag, I
think this change should be factored out and we go through the
appropriate deprecate/obsolete/expire process.

> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I
> am not sure about that. Most of the patch are in comment section.

It refers to spill slot not delay slot. I don't see anything obviously
sparc specific about that block of code.

Specific comments:

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp

-// 64 bits items (sparc abi) even though java would only store
+// 64 bits items even though java would only store

Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for
other platforms?

---

src/hotspot/cpu/arm/frame_arm.hpp (and other files)

   // The interpreter and adapters will extend the frame of the caller.
   // Since oopMaps are based on the sp of the caller before extension
-  // we need to know that value. However in order to compute the address
-  // of the return address we need the real "raw" sp. Since sparc already
-  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
-  // original sp we use that convention.
+  // we need to know that value. However in order to compute the return
+  // address we need the real "raw" sp.

I think this is losing too much information as it no longer describes
the convention. I would suggest:

   // The interpreter and adapters will extend the frame of the caller.
   // Since oopMaps are based on the sp of the caller before extension
   // we need to know that value. However in order to compute the address
-  // of the return address we need the real "raw" sp. Since sparc already
-  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
-  // original sp we use that convention.
+  // of the return address we need the real "raw" sp. By convention we
+  // use sp() to mean "raw" sp and unextended_sp() to mean the caller's
+  // original sp.

---

src/hotspot/cpu/ppc/jniTypes_ppc.hpp

-  // stubGenerator_sparc.cpp) reverse the argument list constructed by
+  // stubGenerator_${CPU}.cpp) reverse the argument list constructed by

Just replace sparc with ppc as done for other platforms.

---

src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp

-  // This greatly simplifies the cases here compared to sparc.
+  // This greatly simplifies the cases here.

Just delete the comment as there is nothing to compare simplicity or
complexity against.

---

src/hotspot/share/c1/c1_LIRGenerator.cpp

-  // In 64bit the type can be long, sparc doesn't have this assert
+  // In 64bit the type can be long
   // assert(offset.type()->tag() == intTag, "invalid type");

Compiler folk should decide what to do here but I think the comment and
commented out assert can just be deleted.

---

src/hotspot/share/c1/c1_Runtime1.cpp

-  case handle_exception_nofpu_id:  // Unused on sparc
+  case handle_exception_nofpu_id:  // unused.

The new comment is incorrect as this case is not unused. I suggest just
deleting the comment.

Thanks,
David
-----

>
>
>   Tests passed tier1-4
>
>
>   Thanks
>
>   Yumin
>

From ioi.lam at oracle.com Tue Sep 1 04:00:13 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 31 Aug 2020 21:00:13 -0700
Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp
In-Reply-To: <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com>
References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com>
 <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com>
Message-ID: <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com>

On 8/31/20 6:17 PM, Ioi Lam wrote:
>
>
> On 8/31/20 4:05 PM, David Holmes wrote:
>> Hi Ioi,
>>
>> I haven't looked at the code changes ...
>> >> On 1/09/2020 4:13 am, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8252526 >>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/ >>> >>> (I marked this RFR as "M" because 63 files have changed, but most of >>> the are >>> just adding a missing #include "prims/jvmtiExport.hpp"). >>> >>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 times >>> (by 971 hotspot .o files). Most of these are unnecessarily included >>> by the >>> following 3 popular header files: >>> >>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be moved >>> ?? ? to javaThreadStatus.hpp. I also converted the enum to an C++ 11 >>> ?? ? enum class for better type safety. (see also JDK-8247938). >>> >>> [2] os.hpp: No need to include jvm.h. Use forward declaration >>> ?? ? "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead. >> >> That does not seem reasonable to me. It is one thing to do a simple >> forward declaration of a class but this is an internal detail of >> JVMTI which os.hpp has no business knowing about. >> > > How about changing jvmti.h from: > > struct _jvmtiTimerInfo; > typedef struct _jvmtiTimerInfo jvmtiTimerInfo; > > to > > struct jvmtiTimerInfo; > typedef struct jvmtiTimerInfo jvmtiTimerInfo; > > Then os.hpp can declare: > > struct jvmtiTimerInfo; > > I wonder why we use the _ prefix. It seems like an anachronism to me. > Do we still support C compilers that cannot handle "typedef struct Foo > Foo;" > In fact I think this typedef is not an implementation detail. It's not intended to hide anything in jvmti.h. I.e., it's not a typedef of an opaque structure. All fields in the structure can be accessed without any access control. It's just an odd style of declaring a structure that tries to be compatible with ancient C compilers, so you can use "jvmtiTimerInfo" instead of "struct jvmtiTimerInfo" everywhere. 
It could have been written like this (from our official spec): https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html typedef struct { ??? jlong max_value; ??? jboolean may_skip_forward; ??? jboolean may_skip_backward; ??? jvmtiTimerKind kind; ??? jlong reserved1; ??? jlong reserved2; } jvmtiTimerInfo; Why don't we use this style in jvmti.h? Well, one difference is that this style cannot be forward declared without revealing the contents of the struct! I would argue that the reason for the current style is for the *very purpose* so that it can be forward declared :-) Thanks - Ioi >>> [3] thread.hpp: No need to include jvmExport.hpp. Use forward >>> declaration >>> ?? ? for JvmtiSampledObjectAllocEventCollector and >>> ?? ? JvmtiVMObjectAllocEventCollector instead. >>> >>> >>> The total number of includes have reduced from 252033 to 250001. >>> Build time of >>> slow-debug libjvm.so is reduced from 2:07 to 2:04 on my machine. >> >> I'm not sure why we really care. The build times are not that >> problematic that it warrants some of these header file hacks IMO. > > My main goal is to speed up incremental builds. Currently we have more > than 200 header files that are included by more than half of the .o > files of hotspot. I would often get massive rebuilds after touching a > header file. Removing the unnecessary dependencies will help developer > productivity. > > Also, currently we do a lot of build validations in tier5 and we had > consistently missed build problems with the minimal build. I suggested > to move those builds to tier1, but that was turned down because the > builds are not fast enough. > >> Are those figures with or without PCH? >> > I tested on Linux without PCH. 
> > Thanks > - Ioi > >> David >> ----- >> >>> >>> Thanks >>> - Ioi >>> > From david.holmes at oracle.com Tue Sep 1 04:38:51 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 1 Sep 2020 14:38:51 +1000 Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp In-Reply-To: <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com> References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com> <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com> <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com> Message-ID: On 1/09/2020 2:00 pm, Ioi Lam wrote: > On 8/31/20 6:17 PM, Ioi Lam wrote: >> On 8/31/20 4:05 PM, David Holmes wrote: >>> Hi Ioi, >>> >>> I haven't looked at the code changes ... >>> >>> On 1/09/2020 4:13 am, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8252526 >>>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/ >>>> >>>> (I marked this RFR as "M" because 63 files have changed, but most of >>>> the are >>>> just adding a missing #include "prims/jvmtiExport.hpp"). >>>> >>>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 times >>>> (by 971 hotspot .o files). Most of these are unnecessarily included >>>> by the >>>> following 3 popular header files: >>>> >>>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be moved >>>> ?? ? to javaThreadStatus.hpp. I also converted the enum to an C++ 11 >>>> ?? ? enum class for better type safety. (see also JDK-8247938). >>>> >>>> [2] os.hpp: No need to include jvm.h. Use forward declaration >>>> ?? ? "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead. >>> >>> That does not seem reasonable to me. It is one thing to do a simple >>> forward declaration of a class but this is an internal detail of >>> JVMTI which os.hpp has no business knowing about. 
>>> >> >> How about changing jvmti.h from: >> >> struct _jvmtiTimerInfo; >> typedef struct _jvmtiTimerInfo jvmtiTimerInfo; >> >> to >> >> struct jvmtiTimerInfo; >> typedef struct jvmtiTimerInfo jvmtiTimerInfo; >> >> Then os.hpp can declare: >> >> struct jvmtiTimerInfo; Does that actually work? If so that's equivalent to "class Foo;" forward declarations and so is acceptable. >> >> I wonder why we use the _ prefix. It seems like an anachronism to me. >> Do we still support C compilers that cannot handle "typedef struct Foo >> Foo;" >> > In fact I think this typedef is not an implementation detail. It's not > intended to hide anything in jvmti.h. I.e., it's not a typedef of an > opaque structure. All fields in the structure can be accessed without > any access control. > > It's just an odd style of declaring a structure that tries to be > compatible with ancient C compilers, so you can use "jvmtiTimerInfo" > instead of "struct jvmtiTimerInfo" everywhere. It could have been > written like this (from our official spec): > > https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html > > typedef struct { > ??? jlong max_value; > ??? jboolean may_skip_forward; > ??? jboolean may_skip_backward; > ??? jvmtiTimerKind kind; > ??? jlong reserved1; > ??? jlong reserved2; > } jvmtiTimerInfo; > > Why don't we use this style in jvmti.h? Well, one difference is that > this style cannot be forward declared without revealing the contents of > the struct! > > I would argue that the reason for the current style is for the *very > purpose* so that it can be forward declared :-) I don't agree with the conclusion. ;-) The intent is that anyone who needs jvmtiTimerInfo #includes jvmti.h - that is the whole point of header files afterall. os.hpp has no business knowing what jvmtiTimerInfo needs to be typedef'd to to make things work. >>>> [3] thread.hpp: No need to include jvmExport.hpp. Use forward >>>> declaration >>>> ?? ? for JvmtiSampledObjectAllocEventCollector and >>>> ?? ? 
JvmtiVMObjectAllocEventCollector instead.
>>>>
>>>>
>>>> The total number of includes have reduced from 252033 to 250001.
>>>> Build time of
>>>> slow-debug libjvm.so is reduced from 2:07 to 2:04 on my machine.
>>>
>>> I'm not sure why we really care. The build times are not that
>>> problematic that it warrants some of these header file hacks IMO.
>>
>> My main goal is to speed up incremental builds. Currently we have more
>> than 200 header files that are included by more than half of the .o
>> files of hotspot. I would often get massive rebuilds after touching a
>> header file. Removing the unnecessary dependencies will help developer
>> productivity.

On some absolute scale of measurement perhaps, but in practice people
tend to handle other tasks while waiting for builds to complete.

>>
>> Also, currently we do a lot of build validations in tier5 and we had
>> consistently missed build problems with the minimal build. I suggested
>> to move those builds to tier1, but that was turned down because the
>> builds are not fast enough.

But they are full builds, not incremental so I don't see how this work
would really help them. Especially as ...

>>
>>> Are those figures with or without PCH?
>>>
>> I tested on Linux without PCH.

... most builds use PCH so we aren't recompiling these header files
unnecessarily.

I'm not objecting to doing the cleanup in this area as long as we
aren't overly contorting things just to avoid an include that logically
is actually needed. The initial jvmtiTimerInfo situation is on the
wrong side of the line IMO - and unfortunately, due to the way the os
class is declared, it isn't one easily fixed by refactoring.
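For readers following the forward-declaration argument, here is a minimal
sketch of the "same tag and typedef name" style discussed above. It is
not the actual jvmti.h or os.hpp code: TimerInfo, fill_timer_info and
timer_max_value are hypothetical stand-ins, and the struct is reduced to
three fields. It only illustrates that when the struct tag and the
typedef name are identical, a client header can get by with a plain
"struct TimerInfo;" declaration.

```cpp
#include <cassert>

// --- What a C-compatible header could contain (hypothetical names) ---
// The tag and the typedef share one name, so C code can write "TimerInfo"
// and C++ code can forward-declare "struct TimerInfo;".
struct TimerInfo;
typedef struct TimerInfo TimerInfo;

// --- What a client header could contain ---
// A plain forward declaration is enough to declare functions that take
// a pointer; the client does not need to know about any other tag name.
struct TimerInfo;
void fill_timer_info(TimerInfo* info);

// --- Full definition, needed only by code that touches the fields ---
struct TimerInfo {
  long long max_value;
  bool may_skip_forward;
  bool may_skip_backward;
};

void fill_timer_info(TimerInfo* info) {
  info->max_value = 100;
  info->may_skip_forward = false;
  info->may_skip_backward = false;
}

// Hypothetical helper so the sketch has an observable result.
long long timer_max_value() {
  TimerInfo info;
  fill_timer_info(&info);
  return info.max_value;
}
```

With the "_"-prefixed tag, a client would instead have to repeat the full
"typedef struct _TimerInfo TimerInfo;" line, and a typedef of an
anonymous struct cannot be forward-declared at all, which is the
trade-off being debated in this thread.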
Cheers, David ----- >> >> Thanks >> - Ioi >> >>> David >>> ----- >>> >>>> >>>> Thanks >>>> - Ioi >>>> >> > From yumin.qi at oracle.com Tue Sep 1 04:56:32 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Mon, 31 Aug 2020 21:56:32 -0700 Subject: 8248337: sparc related code clean up after solaris removal In-Reply-To: References: Message-ID: Hi, David ? Thanks for review. I will wait for compiler folks' comments. Thanks Yumin On 8/31/20 6:18 PM, David Holmes wrote: > Hi Yumin, > > On 1/09/2020 7:32 am, Yumin Qi wrote: >> HI, >> >> ?? Please review for >> >> ?? bug: https://bugs.openjdk.java.net/browse/JDK-8248337 >> >> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/ >> >> >> ?? Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some comments are not correct, and some refer to wrong files. > > Those changes are mostly okay but I have a few minor issues/suggestions below. > >> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch . > > Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and we go through the appropriate deprecate/obsolete/expire process. > >> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch are in comment section. > > It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code. 
> > Specific comments: > > src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp > > -// 64 bits items (sparc abi) even though java would only store > +// 64 bits items even though java would only store > > Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms? > > --- > > src/hotspot/cpu/arm/frame_arm.hpp (and other files) > > ?? // The interpreter and adapters will extend the frame of the caller. > ?? // Since oopMaps are based on the sp of the caller before extension > -? // we need to know that value. However in order to compute the address > -? // of the return address we need the real "raw" sp. Since sparc already > -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's > -? // original sp we use that convention. > +? // we need to know that value. However in order to compute the return > +? // address we need the real "raw" sp. > > I think this is losing too much information as it no longer describes the convention. I would suggest: > > ?? // The interpreter and adapters will extend the frame of the caller. > ?? // Since oopMaps are based on the sp of the caller before extension > ?? // we need to know that value. However in order to compute the address > -? // of the return address we need the real "raw" sp. Since sparc already > -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's > -? // original sp we use that convention. > +? // of the return address we need the real "raw" sp. By convention we > +? // use sp() to mean "raw" sp and unextended_sp() to mean the caller's > +? // original sp. > > --- > > src/hotspot/cpu/ppc/jniTypes_ppc.hpp > > -? // stubGenerator_sparc.cpp) reverse the argument list constructed by > +? // stubGenerator_${CPU}.cpp) reverse the argument list constructed by > > Just replace sparc with ppc as done for other platforms. > > --- > > src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp > > -? // This greatly simplifies the cases here compared to sparc. > +? 
// This greatly simplifies the cases here. > > Just delete the comment as there is nothing to compare simplicity or complexity against. > > --- > > src/hotspot/share/c1/c1_LIRGenerator.cpp > > -? // In 64bit the type can be long, sparc doesn't have this assert > +? // In 64bit the type can be long > ?? // assert(offset.type()->tag() == intTag, "invalid type"); > > compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted. > > --- > > src/hotspot/share/c1/c1_Runtime1.cpp > > -? case handle_exception_nofpu_id:? // Unused on sparc > +? case handle_exception_nofpu_id:? // unused. > > the new comment is incorrect as this case is not unused. I suggest just deleting the comment. > > Thanks, > David > ----- > >> >> >> ?? Tests passed tier1-4 >> >> >> ?? Thanks >> >> ?? Yumin >> From david.holmes at oracle.com Tue Sep 1 05:06:39 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 1 Sep 2020 15:06:39 +1000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> Message-ID: Hi Leonid, On 1/09/2020 10:42 am, leonid.kuskov at oracle.com wrote: > Hi, > > It's a known issue that was reported by Arno Zeller > (arno.zeller at sap.com) in the middle of June. The test > jvmti/GetAllStackTraces/gast001/gast00105/gast00105.html failed with the > same stack trace despite the fix ( JCK-7022500 lprintf in > jvmti/support.c is not MT-Safe) Please file a JCK's issue with details > to reproduce the failure. Interesting. The fix is supposed to make things thread-safe by using a RawMonitor to ensure only one thread can use lprintf at a time. I missed that in my initial analysis. But something is going wrong. Thanks, David > Thanks, > Leonid > > On 8/31/20 3:37 PM, David Holmes wrote: > >> On 1/09/2020 3:00 am, Doerr, Martin wrote: >>> Hi David, >>> >>> thanks for analyzing it. We need to exclude the test for now. 
>> >> Can you file a JCK bug? I can file one on our internal JCK Jira but >> I'm not sure what the right process is in this case. >> >> Thanks, >> David >> >>> Best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Montag, 31. August 2020 04:34 >>>> To: Doerr, Martin ; serviceability- >>>> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug >>>> build >>>> >>>> Hi Martin, >>>> >>>> On 29/08/2020 3:53 am, Doerr, Martin wrote: >>>>> Hi, >>>>> >>>>> we have seen the following fatal error more than 50 times since >>>>> 2020-05-25 in various JCK tests vm/jvmti. >>>>> >>>>> fatal error: String conversion failure: [check] ExitLock destroyed >>>>> >>>>> --> ?? [check] ExitLock exited >>>>> >>>>> (followed by garbage output) >>>>> >>>>> 8166358: Re-enable String verification in >>>>> java_lang_String::create_from_str() >>>>> >>>>> was pushed at that date which introduced the call to fatal. >>>>> >>>>> Stack (example from linuxppc64le, but also observed on x86 and >>>>> aarch64): >>>>> V? [libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, >>>>> Thread*) [clone .part.158]+0x51c >>>>> V? [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char >>>>> const*, Thread*)+0x40 >>>>> V? [libjvm.so+0x1026a30]? jni_NewStringUTF+0x1e0 >>>>> C? [libjckjvmti.so+0x3ce4c]? logWrite+0x5c >>>>> C? [libjckjvmti.so+0x3cd20]? lprintf+0x170 >>>>> C? [libjckjvmti.so+0x485b8]? gast00104_agent_proc+0x254 >>>>> V? [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c >>>>> V? [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c >>>>> V? [libjvm.so+0x19418a0]? Thread::call_run()+0x160 >>>>> V? [libjvm.so+0x15c9d0c]? thread_native_entry(Thread*)+0x18c >>>>> C? [libpthread.so.0+0x9b48]? start_thread+0x108 >>>>> >>>>> (Problem could have been there before but without this fatal message.) 
>>>>> >>>>> The messages are generated by: >>>>> >>>>> tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c >>>>> >>>>> This looks like a race condition. The message changes while the VM >>>>> creates a String object from it. Has anybody seen this before? >>>> >>>> No but ... >>>> >>>>> Is it a test problem? I'm not familiar with the lprintf calls in >>>>> the test. >>>> >>>> ... the lprintf is part of the JCK support library (support.c if you >>>> have access to sources) and it uses a static buffer for the log >>>> messages >>>> and so it not thread-safe. This test creates a thread and both it and >>>> the main thread call lprintf concurrently. >>>> >>>> So this is a JCK test/test-library bug that appears to be exposed by >>>> the >>>> changes made in 8166358. >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>> Best regards, >>>>> >>>>> Martin >>>>> From stefan.karlsson at oracle.com Tue Sep 1 07:10:32 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Sep 2020 09:10:32 +0200 Subject: RFR: 8252589: Code duplication in ParallelSPCleanupTask In-Reply-To: References: Message-ID: <35e41ece-a81b-8032-5b4b-1f38069577a1@oracle.com> Thanks, David. StefanK On 2020-09-01 00:55, David Holmes wrote: > Looks good! Nice simpification. > > Thanks, > David > > On 1/09/2020 2:46 am, Stefan Karlsson wrote: >> Hi all, >> >> Please review this small patch to remove some code duplication in >> ParallelSPCleanupTask. >> >> https://cr.openjdk.java.net/~stefank/8252589/webrev.01 >> https://bugs.openjdk.java.net/browse/JDK-8252589 >> >> I noticed this while reviewing the patches for JEP 376. I think this >> makes it more apparent what the individual sub-tasks are doing. 
>>
>> Thanks,
>> StefanK

From serguei.spitsyn at oracle.com Tue Sep 1 08:39:44 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 1 Sep 2020 01:39:44 -0700
Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath")
In-Reply-To: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com>
References: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com>
Message-ID: <226c299e-4661-1388-5382-daeca3780475@oracle.com>

Hi Igor,

This looks fine to me too.
I also agree with David's suggestions.

Thanks,
Serguei

On 8/30/20 21:53, David Holmes wrote:
> Hi Igor,
>
> On 29/08/2020 1:52 pm, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00
>>> 145 lines changed: 28 ins; 22 del; 95 mod;
>>
>>
>> Hi all,
>>
>> could you please review this trivial clean up which replaces
>> System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where
>> appropriate?
>>
>> while updating these files, I've also cleaned them up a bit, removed
>> unneeded imports, added/removed spaces, etc
>>
>> testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on
>> {linux,windows,macos}-x64
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252532
>> webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00
>
> Generally seems fine (though the fact the patch file contained a
> series of changesets threw me initially!)
>
> test/hotspot/jtreg/runtime/signal/SigTestDriver.java
>
>          // add test specific arguments w/o signame
>          cmd.addAll(Arrays.asList(args)
> -                         .subList(1, args.length));
> +                .subList(1, args.length));
>
> Your changed line doesn't have the right indent. Can this just be put
> on one line anyway:
>
>          // add test specific arguments w/o signame
>          cmd.addAll(Arrays.asList(args).subList(1, args.length));
>
> that seems better to me as the fact there is only one argument seems
> clearer. Though for greater clarity perhaps:
>
>          // add test specific arguments w/o signame
>          var argList = Arrays.asList(args).subList(1, args.length);
>          cmd.addAll(argList);
>
> --
>
> +                Arrays.stream(Utils.JAVA_OPTIONS.split(" ")))
> +                .filter(s -> !s.isEmpty())
> +                .filter(s -> s.startsWith("-X"))
> +                .flatMap(arg -> Stream.of("-vmopt", arg))
> +                .collect(Collectors.toList());
>
> The preferred/common style for chained stream operations is to align
> the dots:
>
>           Arrays.stream(Utils.JAVA_OPTIONS.split(" ")))
>                 .filter(s -> !s.isEmpty())
>                 .filter(s -> s.startsWith("-X"))
>                 .flatMap(arg -> Stream.of("-vmopt", arg))
>                 .collect(Collectors.toList());
>
> ---
>
> test/lib/jdk/test/lib/process/ProcessTools.java
>
> -        System.out.println("\t" +  t +
> -                           " stack: (length = " + stack.length + ")");
> +        System.out.println("\t" + t +
> +                " stack: (length = " + stack.length + ")");
>
> The original code is more stylistically correct - when breaking
> arguments across lines the indent should align with the start of the
> arguments.
>
> Similarly here:
>
> +        return String.format("--- ProcessLog ---%n" +
> +                        "cmd: %s%n" +
> +                        "exitvalue: %s%n" +
> +                        "stderr: %s%n" +
> +                        "stdout: %s%n",
> +                getCommandLine(pb), exitValue, stderr, stdout);
>
> should be:
>
> +        return String.format("--- ProcessLog ---%n" +
> +                             "cmd: %s%n" +
> +                             "exitvalue: %s%n" +
> +                             "stderr: %s%n" +
> +                             "stdout: %s%n",
> +                             getCommandLine(pb), exitValue, stderr, stdout);
>
> and here:
>
> +        String executable = Paths.get(Utils.TEST_NATIVE_PATH, executableName)
> +                .toAbsolutePath()
> +                .toString();
>
> indentation again.
>
> Thanks,
> David
> -----
>
>> Thanks,
>> -- Igor
>>

From albert.m.yang at oracle.com Tue Sep 1 09:22:59 2020
From: albert.m.yang at oracle.com (Albert Yang)
Date: Tue, 1 Sep 2020 11:22:59 +0200
Subject: RFR: 8252093: formula used to calculate decaying variance in numberSeq
In-Reply-To: <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com>
References: <54b1cc85-9f2d-cdf8-c513-1cbee4fa7f3d@oracle.com>
 <3577e2ab-f633-71ca-0b94-c7a39d6ca631@oracle.com>
 <9d8182b5-94f7-d0ef-c3aa-6edb5b11a399@oracle.com>
 <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com>
Message-ID:

No significant change is observed for SPECjbb2015 scores with and
without this patch, using G1, Shenandoah and Z.

GC logs show slightly more GC cycles (before: 357 and after: 389) in
ZGC with this patch, and no significant change for G1 and Shenandoah.

--
/Albert

From stefan.johansson at oracle.com Tue Sep 1 09:26:39 2020
From: stefan.johansson at oracle.com (stefan.johansson at oracle.com)
Date: Tue, 1 Sep 2020 11:26:39 +0200
Subject: RFR: 8252093: formula used to calculate decaying variance in numberSeq
In-Reply-To:
References: <54b1cc85-9f2d-cdf8-c513-1cbee4fa7f3d@oracle.com>
 <3577e2ab-f633-71ca-0b94-c7a39d6ca631@oracle.com>
 <9d8182b5-94f7-d0ef-c3aa-6edb5b11a399@oracle.com>
 <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com>
Message-ID: <6e9f1f5e-6f59-f77c-740e-9f800018f8e2@oracle.com>

On 2020-09-01 11:22, Albert Yang wrote:
> No significant change is observed for SPECjbb2015 scores with and
> without this patch, using G1, Shenandoah and Z.

Sounds good.

>
> GC logs show slightly more GC cycles (before: 357 and after: 389) in ZGC
> with this patch, and no significant change for G1 and Shenandoah.
>

Did you compare the total runtime of the two ZGC runs? The extra cycles
could be caused by warmup taking much longer in the second run.
Stefan From albert.m.yang at oracle.com Tue Sep 1 09:42:27 2020 From: albert.m.yang at oracle.com (Albert Yang) Date: Tue, 1 Sep 2020 11:42:27 +0200 Subject: RFR: 8252093: formula used to calculate decaying variance in numberSeq In-Reply-To: <6e9f1f5e-6f59-f77c-740e-9f800018f8e2@oracle.com> References: <54b1cc85-9f2d-cdf8-c513-1cbee4fa7f3d@oracle.com> <3577e2ab-f633-71ca-0b94-c7a39d6ca631@oracle.com> <9d8182b5-94f7-d0ef-c3aa-6edb5b11a399@oracle.com> <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com> <6e9f1f5e-6f59-f77c-740e-9f800018f8e2@oracle.com> Message-ID: > Did you compare the total runtime of the two ZGC runs, Yes. > the extra cycles could be caused warmup taking much longer in the second run. Seem so. Before the patch, it takes 40 GC cycles to warmup, but after the patch, 78 GC cycles. If we take that into account, the actual GC cycles during the measurement (349-40=309 vs 380-78=302) are very close. ``` # before [2.610s][info][gc] GC(0) Garbage Collection (System.gc()) 2480M(2%)->176M(0%) [86.779s][info][gc] GC(1) Garbage Collection (System.gc()) 4796M(4%)->1464M(1%) [323.875s][info][gc] GC(12) Garbage Collection (System.gc()) 50576M(39%)->1646M(1%) [421.629s][info][gc] GC(17) Garbage Collection (System.gc()) 41726M(32%)->1628M(1%) [608.851s][info][gc] GC(33) Garbage Collection (System.gc()) 22748M(17%)->1826M(1%) [686.220s][info][gc] GC(39) Garbage Collection (System.gc()) 34834M(27%)->1834M(1%) [692.493s][info][gc] GC(40) Garbage Collection (System.gc()) 8180M(6%)->1346M(1%) [7867.090s][info][gc] GC(349) Garbage Collection (System.gc()) 21014M(16%)->424M(0%) [8045.531s][info][gc] GC(356) Garbage Collection (System.gc()) 60270M(46%)->1796M(1%) [8051.900s][info][gc] GC(357) Garbage Collection (System.gc()) 8286M(6%)->1350M(1%) ``` ``` # after [2.632s][info][gc] GC(0) Garbage Collection (System.gc()) 2480M(2%)->176M(0%) [86.816s][info][gc] GC(1) Garbage Collection (System.gc()) 4774M(4%)->1468M(1%) [313.601s][info][gc] GC(12) Garbage Collection (System.gc()) 
55236M(42%)->280M(0%) [408.054s][info][gc] GC(18) Garbage Collection (System.gc()) 65486M(50%)->1784M(1%) [590.894s][info][gc] GC(35) Garbage Collection (System.gc()) 20584M(16%)->422M(0%) [690.631s][info][gc] GC(45) Garbage Collection (System.gc()) 22204M(17%)->1638M(1%) [791.317s][info][gc] GC(57) Garbage Collection (System.gc()) 21268M(16%)->436M(0%) [908.128s][info][gc] GC(68) Garbage Collection (System.gc()) 40480M(31%)->1838M(1%) [989.525s][info][gc] GC(77) Garbage Collection (System.gc()) 6198M(5%)->440M(0%) [995.913s][info][gc] GC(78) Garbage Collection (System.gc()) 7064M(5%)->1362M(1%) [8024.034s][info][gc] GC(380) Garbage Collection (System.gc()) 20714M(16%)->438M(0%) [8196.327s][info][gc] GC(388) Garbage Collection (System.gc()) 12494M(10%)->436M(0%) [8202.587s][info][gc] GC(389) Garbage Collection (System.gc()) 6920M(5%)->1354M(1%) ``` -- /Albert From stefan.johansson at oracle.com Tue Sep 1 09:55:15 2020 From: stefan.johansson at oracle.com (stefan.johansson at oracle.com) Date: Tue, 1 Sep 2020 11:55:15 +0200 Subject: RFR: 8252093: formula used to calculate decaying variance in numberSeq In-Reply-To: References: <54b1cc85-9f2d-cdf8-c513-1cbee4fa7f3d@oracle.com> <3577e2ab-f633-71ca-0b94-c7a39d6ca631@oracle.com> <9d8182b5-94f7-d0ef-c3aa-6edb5b11a399@oracle.com> <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com> <6e9f1f5e-6f59-f77c-740e-9f800018f8e2@oracle.com> Message-ID: Sounds good, I think this is enough testing. Thanks, Stefan On 2020-09-01 11:42, Albert Yang wrote: > > Did you compare the total runtime of the two ZGC runs, > Yes. > > > the extra cycles could be caused warmup taking much longer in the > second run. > Seem so. Before the patch, it takes 40 GC cycles to warmup, but after > the patch, 78 GC cycles. If we take that into account, the actual GC > cycles during the measurement (349-40=309 vs 380-78=302) are very close. 
> > ``` > # before > [2.610s][info][gc] GC(0) Garbage Collection (System.gc()) > 2480M(2%)->176M(0%) > [86.779s][info][gc] GC(1) Garbage Collection (System.gc()) > 4796M(4%)->1464M(1%) > [323.875s][info][gc] GC(12) Garbage Collection (System.gc()) > 50576M(39%)->1646M(1%) > [421.629s][info][gc] GC(17) Garbage Collection (System.gc()) > 41726M(32%)->1628M(1%) > [608.851s][info][gc] GC(33) Garbage Collection (System.gc()) > 22748M(17%)->1826M(1%) > [686.220s][info][gc] GC(39) Garbage Collection (System.gc()) > 34834M(27%)->1834M(1%) > [692.493s][info][gc] GC(40) Garbage Collection (System.gc()) > 8180M(6%)->1346M(1%) > [7867.090s][info][gc] GC(349) Garbage Collection (System.gc()) > 21014M(16%)->424M(0%) > [8045.531s][info][gc] GC(356) Garbage Collection (System.gc()) > 60270M(46%)->1796M(1%) > [8051.900s][info][gc] GC(357) Garbage Collection (System.gc()) > 8286M(6%)->1350M(1%) > ``` > > ``` > # after > [2.632s][info][gc] GC(0) Garbage Collection (System.gc()) > 2480M(2%)->176M(0%) > [86.816s][info][gc] GC(1) Garbage Collection (System.gc()) > 4774M(4%)->1468M(1%) > [313.601s][info][gc] GC(12) Garbage Collection (System.gc()) > 55236M(42%)->280M(0%) > [408.054s][info][gc] GC(18) Garbage Collection (System.gc()) > 65486M(50%)->1784M(1%) > [590.894s][info][gc] GC(35) Garbage Collection (System.gc()) > 20584M(16%)->422M(0%) > [690.631s][info][gc] GC(45) Garbage Collection (System.gc()) > 22204M(17%)->1638M(1%) > [791.317s][info][gc] GC(57) Garbage Collection (System.gc()) > 21268M(16%)->436M(0%) > [908.128s][info][gc] GC(68) Garbage Collection (System.gc()) > 40480M(31%)->1838M(1%) > [989.525s][info][gc] GC(77) Garbage Collection (System.gc()) > 6198M(5%)->440M(0%) > [995.913s][info][gc] GC(78) Garbage Collection (System.gc()) > 7064M(5%)->1362M(1%) > [8024.034s][info][gc] GC(380) Garbage Collection (System.gc()) > 20714M(16%)->438M(0%) > [8196.327s][info][gc] GC(388) Garbage Collection (System.gc()) > 12494M(10%)->436M(0%) > [8202.587s][info][gc] GC(389) Garbage 
Collection (System.gc()) > 6920M(5%)->1354M(1%) > ``` > From albert.m.yang at oracle.com Tue Sep 1 10:22:32 2020 From: albert.m.yang at oracle.com (Albert Yang) Date: Tue, 1 Sep 2020 12:22:32 +0200 Subject: RFR: 8252093: formula used to calculate decaying variance in numberSeq In-Reply-To: References: <54b1cc85-9f2d-cdf8-c513-1cbee4fa7f3d@oracle.com> <3577e2ab-f633-71ca-0b94-c7a39d6ca631@oracle.com> <9d8182b5-94f7-d0ef-c3aa-6edb5b11a399@oracle.com> <89a6cfee-093b-3381-730b-bab94a13a55b@oracle.com> <6e9f1f5e-6f59-f77c-740e-9f800018f8e2@oracle.com> Message-ID: <53d390c6-f7bf-fff9-35c2-9c3c853e2646@oracle.com> Thank you for the review. -- /Albert From jamsheed.c.m at oracle.com Tue Sep 1 12:36:17 2020 From: jamsheed.c.m at oracle.com (Jamsheed C M) Date: Tue, 1 Sep 2020 18:06:17 +0530 Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions In-Reply-To: <03df9364-817d-04d6-6434-80be93a66526@oracle.com> References: <442caa21-ca0a-f6eb-60a5-1e74bf994894@oracle.com> <03df9364-817d-04d6-6434-80be93a66526@oracle.com> Message-ID: Hi David, I reworked the patch, revised webrev here: http://cr.openjdk.java.net/~jcm/8249451/webrev.01/ In addition I moved UnlockFlagSaver fs(this) to more local scope. also removed changes done for JDK-8246727, as it will be separately handled by the bug. Testing: injected and tested async exceptions randomly at compilation request path and deopt path. Best regards, Jamsheed On 24/08/2020 11:06, Jamsheed C M wrote: > Hi David, > > Thank you for the review and feedback. Agree on all of them. I will > rework and get back. > > On 10/08/2020 07:33, David Holmes wrote: >> Hi Jamsheed, >> >> On 6/08/2020 10:07 pm, Jamsheed C M wrote: >>> Hi all, >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249451 >>> >>> webrev: http://cr.openjdk.java.net/~jcm/8249451/webrev.00/ >> >> Thanks for tackling this messy issue. 
Overall I like the use of TRAPS >> to more clearly document which methods can return with an exception >> pending. I think there are some problems with the proposed changes. >> I'll start with those comments and then move on to more general >> comments. >> >> src/hotspot/share/utilities/exceptions.cpp >> src/hotspot/share/utilities/exceptions.hpp >> >> I don't think the changes here are correct or safe in general. >> >> First, adding the new macro and function to only clear non-async >> exceptions is fine itself. But naming wise the fact only non-async >> exceptions are cleared should be evident, and there is no "check" >> involved (in the sense of the existing CHECK_ macros) so I suggest: >> >> s/CHECK_CLEAR_PENDING_EXCEPTION/CLEAR_PENDING_NONASYNC_EXCEPTIONS/ >> s/check_clear_pending_exception/clear_pending_nonasync_exceptions/ >> > Ok >> But changing the existing CHECK_AND_CLEAR macros to now leave async >> exceptions pending seems potentially dangerous as calling code may >> not be prepared for there to now be a pending exception. For example >> the use in thread.cpp: >> >> ?JDK_Version::set_runtime_name(get_java_runtime_name(THREAD)); >> ?JDK_Version::set_runtime_version(get_java_runtime_version(THREAD)); >> >> get_java_runtime_name() is currently guaranteed to clear all >> exceptions, so all the other code is known to be safe to call. But >> that would no longer be true. That said, this is VM initialization >> code and an async exception is impossible at this stage. >> >> I think I would rather see CHECK_AND_CLEAR left as-is, and an actual >> CHECK_AND_CLEAR_NONASYNC introduced for those users of >> CHECK_AND_CLEAR that can encounter async exceptions and which should >> not clear them. >> >> +?? if >> (!_pending_exception->is_a(SystemDictionary::ThreadDeath_klass()) && >> +?????? _pending_exception->klass() != >> SystemDictionary::InternalError_klass()) { >> > Ok >> Flagging all InternalErrors as async exceptions is probably also not >> correct. 
I don't see a good solution to this at the moment. I think >> we would need to introduce a new subclass of InternalError for the >> unsafe access error case**. Now it may be that all the other >> InternalError usages are "impossible" in the context of where the new >> macros are to be used, but that is very difficult to establish or >> assert. >> >> ** Or perhaps we could inject a field that allows the VM to identify >> instances related to unsafe access errors ... Ideally of course these >> unsafe access errors would be distinct from the async exception >> mechanism - something I would still like to pursue. >> > Ok >> --- >> >> General comments ... >> >> There is a general change from "JavaThread* thread" to "Thread* >> THREAD" (or TRAPS) to allow the use of the CHECK macros. This is >> unfortunate because the fact the thread is restricted to being a >> JavaThread is no longer evident in the method signatures. That is a >> flaw with the TRAPS/CHECK mechanism unfortunately :( . But as the >> methods no longer take a JavaThread* arg, they should assert that >> THREAD->is_Java_thread(). I will also look at an RFE to have >> as_JavaThread() to avoid the need for separate assertion checks >> before casting from "Thread*" to "JavaThread*". >> > Ok >> Note there's no need to use CHECK when the enclosing method is going >> to return immediately after the call that contains the CHECK. It just >> adds unnecessary checking of the exception state. The use of TRAPS >> shows that the methods may return with an exception pending. I've >> flagged all such occurrences I spotted below. >> > Ok >> --- >> >> +?? // Only metaspace OOM is expected. no Java code executed. >> >> Nit: s/no/No >> >> >> src/hotspot/share/compiler/compilationPolicy.cpp >> >> >> ?410?????? method_invocation_event(method, CHECK_NULL); >> ?489?????? 
CompileBroker::compile_method(m, InvocationEntryBci, >> comp_level, m, hot_count, CompileTask::Reason_InvocationCount, CHECK); >> >> Nit: there's no need to use CHECK here. >> >> --- >> >> src/hotspot/share/compiler/tieredThresholdPolicy.cpp >> >> ?504???? method_invocation_event(method, inlinee, comp_level, nm, >> CHECK_NULL); >> ?570???????? compile(mh, bci, CompLevel_simple, CHECK); >> ?581???????? compile(mh, bci, CompLevel_simple, CHECK); >> ?595???? CompileBroker::compile_method(mh, bci, level, mh, hot_count, >> CompileTask::Reason_Tiered, CHECK); >> 1062?????? compile(mh, InvocationEntryBci, next_level, CHECK); >> >> Nit: there's no need to use CHECK here. >> >> 814 void TieredThresholdPolicy::create_mdo(const methodHandle& mh, >> Thread* THREAD) { >> >> Thank you for correcting this misuse of the THREAD name on a >> JavaThread* type. >> >> --- >> >> src/hotspot/share/interpreter/linkResolver.cpp >> >> ?128?? CompilationPolicy::compile_if_required(selected_method, CHECK); >> >> Nit: there's no need to use CHECK here. >> >> --- >> >> src/hotspot/share/jvmci/compilerRuntime.cpp >> >> ?260???? CompilationPolicy::policy()->event(emh, mh, >> InvocationEntryBci, InvocationEntryBci, CompLevel_aot, cm, CHECK); >> ?280???? nmethod* osr_nm = CompilationPolicy::policy()->event(emh, >> mh, branch_bci, target_bci, CompLevel_aot, cm, CHECK); >> >> Nit: there's no need to use CHECK here. >> >> --- >> >> src/hotspot/share/jvmci/jvmciRuntime.cpp >> >> ?102???????? // Donot clear probable async exceptions. >> >> typo: s/Donot/Do not/ >> >> --- >> >> src/hotspot/share/runtime/deoptimization.cpp >> >> 1686 void Deoptimization::load_class_by_index(const >> constantPoolHandle& constant_pool, int index) { >> >> This method should be declared with TRAPS now. >> >> 1693???? // Donot clear probable Async Exceptions. 
>>
>> typo: s/Donot/Do not/
>>
>>
> Ok
>>> testing : mach1-5(links in jbs)
>>
>> There is very little existing testing that will actually test the key
>> changes you have made here. You will need to do direct
>> fault-injection testing anywhere you now allow async exceptions to
>> remain, to see if the calling code can tolerate that. It will be
>> difficult to test thoroughly.
>>
> Ok
>> Thanks again for tackling this difficult problem!
>
> Best regards,
>
> Jamsheed
>
>>
>> David
>> -----
>>
>>>
>>> While working on JDK-8246381 it was noticed that compilation request
>>> path clears all exceptions(including async) and doesn't propagate[1].
>>>
>>> Fix: patch restores the propagation behavior for the probable async
>>> exceptions.
>>>
>>> Compilation request path propagate exception as in [2]. MDO and
>>> MethodCounter doesn't expect any exception other than metaspace
>>> OOM(added comments).
>>>
>>> Deoptimization path doesn't clear probable async exceptions and take
>>> unpack_exception path for non uncommontraps.
>>>
>>> Added java_lang_InternalError to well known classes.
>>>
>>> Request for review.
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>> [1] w.r.t changes done for JDK-7131259
>>>
>>> [2]
>>>
>>>      (a)
>>>      -----> c1_Runtime1.cpp/interpreterRuntime.cpp/compilerRuntime.cpp
>>>        |
>>>         ----- compilationPolicy.cpp/tieredThresholdPolicy.cpp
>>>           |
>>>            ------ compileBroker.cpp
>>>
>>>      (b)
>>>      Xcomp versions
>>>      ------> compilationPolicy.cpp
>>>         |
>>>          ------> compileBroker.cpp
>>>
>>>      (c)
>>>
>>>      Direct call to compile_method in compileBroker.cpp
>>>
>>>      JVMCI bootstrap, whitebox, replayCompile.
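The CHECK advice in David's review above can be illustrated with a minimal sketch. Everything in it is a hypothetical, heavily simplified stand-in: the `Thread` struct, the `Pending` enum and the three functions are invented for illustration, and the `CHECK` macro only imitates the textual trick HotSpot uses rather than reproducing its definition. The name `clear_pending_nonasync_exceptions` is borrowed from David's renaming suggestion.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's Thread; models only a pending-exception slot.
struct Thread {
  enum Pending { NONE, SYNC, ASYNC };
  Pending pending = NONE;

  bool has_pending_exception() const { return pending != NONE; }

  // Clears ordinary exceptions but leaves an async one pending for the caller.
  void clear_pending_nonasync_exceptions() {
    if (pending == SYNC) pending = NONE;
  }
};

// TRAPS documents that a function may return with an exception pending.
#define TRAPS Thread* THREAD
// Imitation of the CHECK trick: close the call, test, return early,
// then open a dummy expression that the caller's ");" closes.
#define CHECK THREAD); if (THREAD->has_pending_exception()) return; (void)(0

// Pretend compile request that leaves the given exception state behind.
void compile_method(Thread::Pending outcome, TRAPS) {
  THREAD->pending = outcome;
}

// CHECK is useful here: more work follows the call.
void event(Thread::Pending outcome, int* steps, TRAPS) {
  compile_method(outcome, CHECK);
  ++*steps;  // skipped when an exception is pending
}

// The call is the last statement, so passing THREAD is enough;
// CHECK would only add a test whose result changes nothing.
void compile_if_required(Thread::Pending outcome, TRAPS) {
  compile_method(outcome, THREAD);
}
```

Calling `event(Thread::ASYNC, &steps, &t)` leaves `steps` untouched and the exception pending, and a later `t.clear_pending_nonasync_exceptions()` does not clear it, which is the behaviour the reworked patch aims for on the compilation request path.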
>>> >>> From leonid.kuskov at oracle.com Tue Sep 1 00:42:18 2020 From: leonid.kuskov at oracle.com (leonid.kuskov at oracle.com) Date: Mon, 31 Aug 2020 17:42:18 -0700 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> References: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> Message-ID: Hi, It's a known issue that was reported by Arno Zeller (arno.zeller at sap.com) in the middle of June. The test jvmti/GetAllStackTraces/gast001/gast00105/gast00105.html failed with the same stack trace despite the fix ( JCK-7022500 lprintf in jvmti/support.c is not MT-Safe) Please file a JCK's issue with details to reproduce the failure. Thanks, Leonid On 8/31/20 3:37 PM, David Holmes wrote: > On 1/09/2020 3:00 am, Doerr, Martin wrote: >> Hi David, >> >> thanks for analyzing it. We need to exclude the test for now. > > Can you file a JCK bug? I can file one on our internal JCK Jira but > I'm not sure what the right process is in this case. > > Thanks, > David > >> Best regards, >> Martin >> >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Montag, 31. August 2020 04:34 >>> To: Doerr, Martin ; serviceability- >>> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug >>> build >>> >>> Hi Martin, >>> >>> On 29/08/2020 3:53 am, Doerr, Martin wrote: >>>> Hi, >>>> >>>> we have seen the following fatal error more than 50 times since >>>> 2020-05-25 in various JCK tests vm/jvmti. >>>> >>>> fatal error: String conversion failure: [check] ExitLock destroyed >>>> >>>> --> ?? [check] ExitLock exited >>>> >>>> (followed by garbage output) >>>> >>>> 8166358: Re-enable String verification in >>>> java_lang_String::create_from_str() >>>> >>>> was pushed at that date which introduced the call to fatal. >>>> >>>> Stack (example from linuxppc64le, but also observed on x86 and >>>> aarch64): >>>> V? 
[libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, Thread*) [clone .part.158]+0x51c
>>>> V  [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char const*, Thread*)+0x40
>>>> V  [libjvm.so+0x1026a30]  jni_NewStringUTF+0x1e0
>>>> C  [libjckjvmti.so+0x3ce4c]  logWrite+0x5c
>>>> C  [libjckjvmti.so+0x3cd20]  lprintf+0x170
>>>> C  [libjckjvmti.so+0x485b8]  gast00104_agent_proc+0x254
>>>> V  [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c
>>>> V  [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c
>>>> V  [libjvm.so+0x19418a0]  Thread::call_run()+0x160
>>>> V  [libjvm.so+0x15c9d0c]  thread_native_entry(Thread*)+0x18c
>>>> C  [libpthread.so.0+0x9b48]  start_thread+0x108
>>>>
>>>> (Problem could have been there before but without this fatal message.)
>>>>
>>>> The messages are generated by:
>>>>
>>>> tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c
>>>>
>>>> This looks like a race condition. The message changes while the VM
>>>> creates a String object from it. Has anybody seen this before?
>>>
>>> No but ...
>>>
>>>> Is it a test problem? I'm not familiar with the lprintf calls in
>>>> the test.
>>>
>>> ... the lprintf is part of the JCK support library (support.c if you
>>> have access to sources) and it uses a static buffer for the log messages
>>> and so is not thread-safe. This test creates a thread and both it and
>>> the main thread call lprintf concurrently.
>>>
>>> So this is a JCK test/test-library bug that appears to be exposed by the
>>> changes made in 8166358.
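The hazard David describes can be shown without the JCK sources. The sketch below is an assumption-laden illustration, not the code of support.c: `format_shared` and `format_private` are hypothetical names, and the single-threaded sequence merely makes the overwrite deterministic. With two threads the same overwrite can happen at any moment, including while the VM is still reading the text passed to NewStringUTF.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a logger with one buffer shared by all callers.
const char* format_shared(const char* tag, int value) {
  static char buffer[64];  // same storage for every call and every thread
  std::snprintf(buffer, sizeof(buffer), "[%s] %d", tag, value);
  return buffer;           // every caller receives the same address
}

// One possible fix: each call formats into storage it owns.
std::string format_private(const char* tag, int value) {
  char buffer[64];         // local to this call
  std::snprintf(buffer, sizeof(buffer), "[%s] %d", tag, value);
  return std::string(buffer);
}
```

If `format_shared("main", 1)` is followed by `format_shared("agent", 2)`, both returned pointers are equal and the first message now reads `[agent] 2`. `format_private` keeps the two messages apart; a lock around the shared buffer would be another option, at the cost of serializing the loggers.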
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Best regards,
>>>>
>>>> Martin
>>>>
From bob.vandette at oracle.com Tue Sep 1 12:51:08 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 1 Sep 2020 08:51:08 -0400
Subject: Regression in JDK15/16: CGroup v2 support
In-Reply-To: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com>
References: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com>
Message-ID:

Bruno,

I can take a look. Is there a bugs.openjdk.java.net bug filed yet?

Please send me information on this issue. I can't locate bug 9066610 in the bug database (bugreport.java.com).

Bob.

> On Aug 30, 2020, at 6:59 PM, Bruno Borges wrote:
>
> Hi all,
>
> Just wanted to check if anyone at Oracle had a chance to review bug 9066610 that was submitted last week.
>
> We would like to continue our discussion and a proposed fix for evaluation.
>
> On 2020-08-26, 12:06 AM, "David Holmes" wrote:
>
> Please note that the discuss at openjdk.java.net mailing list is not the
> appropriate place to discuss this kind of issue. hotspot-runtime-dev
> would be the correct place to discuss this.
>
> Thanks,
> David
>
> On 25/08/2020 7:29 pm, Bruno Borges wrote:
>> No worries.
>>
>> I am not aware of any other way for non-OpenJDK Authors to submit a bug to OpenJDK except through bugreport.java.com. If there is, happy to follow that for any future issue.
>>
>> bb.
>>
>> On 2020-08-25, 1:38 AM, "Severin Gehwolf" wrote:
>>
>> Hi Bruno,
>>
>> On Tue, 2020-08-25 at 08:05 +0000, Bruno Borges wrote:
>>> Hi Severin,
>>>
>>> Issue created: 9066610.
>>
>> Thanks. We might need somebody from Oracle to push this one through to
>> https://bugs.openjdk.java.net/.
9XXXX bugs are AFAIK created by the web- >> interface and need active triage to show up as JDK-8XXXX bugs. >> >>> Charlie has a fix to propose. >> >> Great! Looking forward to it. >> >> Thanks, >> Severin >> >> > From david.holmes at oracle.com Tue Sep 1 13:23:48 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 1 Sep 2020 23:23:48 +1000 Subject: Regression in JDK15/16: CGroup v2 support In-Reply-To: References: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com> Message-ID: On 1/09/2020 10:51 pm, Bob Vandette wrote: > Bruno, > > I can take a look. Is there a bugs.openjdk.java.net bug filed yet? > > Please send me information on this issue. I can?t locate bug 9066610 in the bug database (bugreport.java.com). That was the JI number. It is now: https://bugs.openjdk.java.net/browse/JDK-8252359 David > Bob. > > >> On Aug 30, 2020, at 6:59 PM, Bruno Borges wrote: >> >> Hi all, >> >> Just wanted to check if anyone at Oracle had a chance to review bug 9066610 that was submitted last week. >> >> We would like to continue our discussion and a proposed fix for evaluation. >> >> ?On 2020-08-26, 12:06 AM, "David Holmes" wrote: >> >> Please note that the discuss at openjdk.java.net mailing list is not the >> appropriate place to discuss this kind of issue. hotspot-runtime-dev >> would be the correct place to discuss this. >> >> Thanks, >> David >> >> On 25/08/2020 7:29 pm, Bruno Borges wrote: >>> No worries. >>> >>> I am not aware of any other way for non-OpenJDK Authors to submit a bug to OpenJDK except through bugreport.java.com. If there is, happy to follow that for any future issue. >>> >>> bb. >>> >>> On 2020-08-25, 1:38 AM, "Severin Gehwolf" wrote: >>> >>> Hi Bruno, >>> >>> On Tue, 2020-08-25 at 08:05 +0000, Bruno Borges wrote: >>>> Hi Severin, >>>> >>>> Issue created: 9066610. >>> >>> Thanks. 
We might need somebody from Oracle to push this one through to >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.openjdk.java.net%2F&data=02%7C01%7CBruno.Borges%40microsoft.com%7C60dffad37c904298679408d8498e85f3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637340223777936292&sdata=upvvC%2BHc49Il6L%2B6koJYhVj2hwJhPxDBlApIwKH6pN0%3D&reserved=0. 9XXXX bugs are AFAIK created by the web- >>> interface and need active triage to show up as JDK-8XXXX bugs. >>> >>>> Charlie has a fix to propose. >>> >>> Great! Looking forward to it. >>> >>> Thanks, >>> Severin >>> >>> >> > From ioi.lam at oracle.com Tue Sep 1 13:24:20 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 1 Sep 2020 06:24:20 -0700 Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp In-Reply-To: References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com> <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com> <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com> Message-ID: On 8/31/20 9:38 PM, David Holmes wrote: > On 1/09/2020 2:00 pm, Ioi Lam wrote: >> On 8/31/20 6:17 PM, Ioi Lam wrote: >>> On 8/31/20 4:05 PM, David Holmes wrote: >>>> Hi Ioi, >>>> >>>> I haven't looked at the code changes ... >>>> >>>> On 1/09/2020 4:13 am, Ioi Lam wrote: >>>>> https://bugs.openjdk.java.net/browse/JDK-8252526 >>>>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/ >>>>> >>>>> (I marked this RFR as "M" because 63 files have changed, but most >>>>> of the are >>>>> just adding a missing #include "prims/jvmtiExport.hpp"). >>>>> >>>>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 >>>>> times >>>>> (by 971 hotspot .o files). Most of these are unnecessarily >>>>> included by the >>>>> following 3 popular header files: >>>>> >>>>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be moved >>>>> ?? ? to javaThreadStatus.hpp. I also converted the enum to an C++ 11 >>>>> ?? ? enum class for better type safety. (see also JDK-8247938). 
>>>>> >>>>> [2] os.hpp: No need to include jvm.h. Use forward declaration >>>>> ?? ? "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead. >>>> >>>> That does not seem reasonable to me. It is one thing to do a simple >>>> forward declaration of a class but this is an internal detail of >>>> JVMTI which os.hpp has no business knowing about. >>>> >>> >>> How about changing jvmti.h from: >>> >>> struct _jvmtiTimerInfo; >>> typedef struct _jvmtiTimerInfo jvmtiTimerInfo; >>> >>> to >>> >>> struct jvmtiTimerInfo; >>> typedef struct jvmtiTimerInfo jvmtiTimerInfo; >>> >>> Then os.hpp can declare: >>> >>> struct jvmtiTimerInfo; > > Does that actually work? If so that's equivalent to "class Foo;" > forward declarations and so is acceptable. Yes, it works, but I would need to file a CSR because jvmti.h is included in the JDK .... Thanks - Ioi ======= $ hg diff diff -r aaa4245df83a src/hotspot/share/prims/jvmtiH.xsl --- a/src/hotspot/share/prims/jvmtiH.xsl??? Mon Aug 31 11:03:13 2020 -0700 +++ b/src/hotspot/share/prims/jvmtiH.xsl??? Mon Aug 31 23:24:27 2020 -0700 @@ -406,11 +406,11 @@ ? ? -? struct _ +? struct ?? ?? ; ? -? typedef struct _ +? typedef struct ?? ?? ?? @@ -419,7 +419,7 @@ ? ? -? struct _ +? struct ?? ?? { ? diff -r aaa4245df83a src/hotspot/share/runtime/os.hpp --- a/src/hotspot/share/runtime/os.hpp??? Mon Aug 31 11:03:13 2020 -0700 +++ b/src/hotspot/share/runtime/os.hpp??? Mon Aug 31 23:24:27 2020 -0700 @@ -51,7 +51,7 @@ ?class OSThread; ?class Mutex; -typedef struct _jvmtiTimerInfo jvmtiTimerInfo; +struct jvmtiTimerInfo; ?template class GrowableArray; ======= From daniel.daugherty at oracle.com Tue Sep 1 13:34:51 2020 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 1 Sep 2020 09:34:51 -0400 Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp In-Reply-To: References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com> <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com> <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com> Message-ID: <7f275aab-565d-f566-702e-2e6dd6a6d3d7@oracle.com> On 9/1/20 9:24 AM, Ioi Lam wrote: > > > On 8/31/20 9:38 PM, David Holmes wrote: >> On 1/09/2020 2:00 pm, Ioi Lam wrote: >>> On 8/31/20 6:17 PM, Ioi Lam wrote: >>>> On 8/31/20 4:05 PM, David Holmes wrote: >>>>> Hi Ioi, >>>>> >>>>> I haven't looked at the code changes ... >>>>> >>>>> On 1/09/2020 4:13 am, Ioi Lam wrote: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8252526 >>>>>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/ >>>>>> >>>>>> (I marked this RFR as "M" because 63 files have changed, but most >>>>>> of the are >>>>>> just adding a missing #include "prims/jvmtiExport.hpp"). >>>>>> >>>>>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 >>>>>> times >>>>>> (by 971 hotspot .o files). Most of these are unnecessarily >>>>>> included by the >>>>>> following 3 popular header files: >>>>>> >>>>>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be >>>>>> moved >>>>>> ?? ? to javaThreadStatus.hpp. I also converted the enum to an C++ 11 >>>>>> ?? ? enum class for better type safety. (see also JDK-8247938). >>>>>> >>>>>> [2] os.hpp: No need to include jvm.h. Use forward declaration >>>>>> ?? ? "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead. >>>>> >>>>> That does not seem reasonable to me. It is one thing to do a >>>>> simple forward declaration of a class but this is an internal >>>>> detail of JVMTI which os.hpp has no business knowing about. 
>>>>> >>>> >>>> How about changing jvmti.h from: >>>> >>>> struct _jvmtiTimerInfo; >>>> typedef struct _jvmtiTimerInfo jvmtiTimerInfo; >>>> >>>> to >>>> >>>> struct jvmtiTimerInfo; >>>> typedef struct jvmtiTimerInfo jvmtiTimerInfo; >>>> >>>> Then os.hpp can declare: >>>> >>>> struct jvmtiTimerInfo; >> >> Does that actually work? If so that's equivalent to "class Foo;" >> forward declarations and so is acceptable. > Yes, it works, but I would need to file a CSR because jvmti.h is > included in the JDK .... Yes, jvmti.h is included in the JDK, but the name exposed by the spec is 'jvmtiTimerInfo'. The '_jvmtiTimerInfo' name isn't exposed by the spec at all. You could change the name to 'DO_NOT_USE_THIS_NAME_jvmtiTimerInfo' and nothing should break from a JVM/TI spec POV. Of course, if there is code out in the world that depended on that name, then it will break, but I would argue that code is broken. Short version: I think '_jvmtiTimerInfo' is an implementation detail and you don't need a CSR for correctness. You might want a CSR for advice since this is an odd situation, but, strictly from an API POV, I don't think you need one. Dan > > Thanks > - Ioi > > ======= > $ hg diff > diff -r aaa4245df83a src/hotspot/share/prims/jvmtiH.xsl > --- a/src/hotspot/share/prims/jvmtiH.xsl??? Mon Aug 31 11:03:13 2020 > -0700 > +++ b/src/hotspot/share/prims/jvmtiH.xsl??? Mon Aug 31 23:24:27 2020 > -0700 > @@ -406,11 +406,11 @@ > ? > > ? > -? struct _ > +? struct > ?? > ?? ; > ? > -? typedef struct _ > +? typedef struct > ?? > ?? > ?? > @@ -419,7 +419,7 @@ > ? > > ? > -? struct _ > +? struct > ?? > ?? { > ? > diff -r aaa4245df83a src/hotspot/share/runtime/os.hpp > --- a/src/hotspot/share/runtime/os.hpp??? Mon Aug 31 11:03:13 2020 -0700 > +++ b/src/hotspot/share/runtime/os.hpp??? 
Mon Aug 31 23:24:27 2020 -0700 > @@ -51,7 +51,7 @@ > ?class OSThread; > ?class Mutex; > > -typedef struct _jvmtiTimerInfo jvmtiTimerInfo; > +struct jvmtiTimerInfo; > > ?template class GrowableArray; > ======= > > > > > > > From coleen.phillimore at oracle.com Tue Sep 1 13:43:41 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Sep 2020 09:43:41 -0400 Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do() Message-ID: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> Summary: Remove the null check. Tested with jvmti tests and tier1 on all platforms. open webrev at http://cr.openjdk.java.net/~coleenp/2020/8252652.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8252652 Thanks, Coleen From stefan.karlsson at oracle.com Tue Sep 1 13:46:23 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Sep 2020 15:46:23 +0200 Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do() In-Reply-To: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> References: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> Message-ID: <1f7bd40d-ef2b-34ad-61de-7e5a8aeab394@oracle.com> Looks good. Thanks for fixing! StefanK On 2020-09-01 15:43, Coleen Phillimore wrote: > Summary: Remove the null check. > > Tested with jvmti tests and tier1 on all platforms. > > open webrev at http://cr.openjdk.java.net/~coleenp/2020/8252652.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8252652 > > Thanks, > Coleen From thomas.schatzl at oracle.com Tue Sep 1 14:19:11 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 1 Sep 2020 16:19:11 +0200 Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do() In-Reply-To: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> References: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> Message-ID: Hi, On 01.09.20 15:43, Coleen Phillimore wrote: > Summary: Remove the null check. 
> > Tested with jvmti tests and tier1 on all platforms. > > open webrev at http://cr.openjdk.java.net/~coleenp/2020/8252652.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8252652 > > Thanks, > Coleen looks good to me too. Thomas From coleen.phillimore at oracle.com Tue Sep 1 14:19:48 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Sep 2020 10:19:48 -0400 Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do() In-Reply-To: <1f7bd40d-ef2b-34ad-61de-7e5a8aeab394@oracle.com> References: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> <1f7bd40d-ef2b-34ad-61de-7e5a8aeab394@oracle.com> Message-ID: <57a980fe-7517-9585-ce07-b061a34eb146@oracle.com> Thanks Stefan and thanks for finding it. Coleen On 9/1/20 9:46 AM, Stefan Karlsson wrote: > Looks good. Thanks for fixing! > > StefanK > > On 2020-09-01 15:43, Coleen Phillimore wrote: >> Summary: Remove the null check. >> >> Tested with jvmti tests and tier1 on all platforms. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2020/8252652.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8252652 >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Tue Sep 1 14:23:45 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Sep 2020 10:23:45 -0400 Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do() In-Reply-To: References: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com> Message-ID: Thanks Thomas.? I should have waited for your second review to push it. Coleen On 9/1/20 10:19 AM, Thomas Schatzl wrote: > Hi, > > On 01.09.20 15:43, Coleen Phillimore wrote: >> Summary: Remove the null check. >> >> Tested with jvmti tests and tier1 on all platforms. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2020/8252652.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8252652 >> >> Thanks, >> Coleen > > ? looks good to me too. 
>
> Thomas

From thomas.schatzl at oracle.com Tue Sep 1 14:27:42 2020
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 1 Sep 2020 16:27:42 +0200
Subject: RFR (T) 8252652: Buggy looking null check in ServiceThread::oops_do()
In-Reply-To:
References: <4446c3a2-9801-8224-cbc3-5785e958afbc@oracle.com>
Message-ID: <0b194a3b-1805-5eaf-39a4-793533695b85@oracle.com>

Hi,

On 01.09.20 16:23, Coleen Phillimore wrote:
> Thanks Thomas. I should have waited for your second review to push it.
> Coleen

np.

Thomas

From bob.vandette at oracle.com Tue Sep 1 15:08:06 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 1 Sep 2020 11:08:06 -0400
Subject: Regression in JDK15/16: CGroup v2 support
In-Reply-To:
References: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com>
Message-ID: <43BD8476-04B8-4BC3-AA35-55F0ED6BCC0A@oracle.com>

Thanks David.

The problem seems to be caused by improper scanning of /proc/mountinfo.

Failing System

439 434 0:33 /docker/0a67832faae434cc8fa0a942e04a308bb8b120b5099018f1ee09f1d7a52746b7 /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime master:22 - cgroup memory rw,memory

Working System

36 31 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory

When the cgroupv2 support was added we started scanning the second "cgroup" entry in /proc/mountinfo as the file system type. According to the man pages, the first "cgroup" is the file system type. Entry position 10 in [1]mountinfo is "mount source: filesystem-specific information". All systems we've tested on thus far have had two cgroup strings so this regression was not caught.

I'll work on a fix.

[1] https://man7.org/linux/man-pages/man5/proc.5.htm

Bob.

> On Sep 1, 2020, at 9:23 AM, David Holmes wrote:
>
> On 1/09/2020 10:51 pm, Bob Vandette wrote:
>> Bruno,
>> I can take a look. Is there a bugs.openjdk.java.net bug filed yet?
>> Please send me information on this issue. I can't locate bug 9066610 in the bug database (bugreport.java.com).
>
> That was the JI number. It is now:
>
> https://bugs.openjdk.java.net/browse/JDK-8252359
>
> David
>
>> Bob.
>>> On Aug 30, 2020, at 6:59 PM, Bruno Borges wrote:
>>>
>>> Hi all,
>>>
>>> Just wanted to check if anyone at Oracle had a chance to review bug 9066610 that was submitted last week.
>>>
>>> We would like to continue our discussion and a proposed fix for evaluation.
>>>
>>> On 2020-08-26, 12:06 AM, "David Holmes" wrote:
>>>
>>> Please note that the discuss at openjdk.java.net mailing list is not the
>>> appropriate place to discuss this kind of issue. hotspot-runtime-dev
>>> would be the correct place to discuss this.
>>>
>>> Thanks,
>>> David
>>>
>>> On 25/08/2020 7:29 pm, Bruno Borges wrote:
>>>> No worries.
>>>>
>>>> I am not aware of any other way for non-OpenJDK Authors to submit a bug to OpenJDK except through bugreport.java.com. If there is, happy to follow that for any future issue.
>>>>
>>>> bb.
>>>>
>>>> On 2020-08-25, 1:38 AM, "Severin Gehwolf" wrote:
>>>>
>>>> Hi Bruno,
>>>>
>>>> On Tue, 2020-08-25 at 08:05 +0000, Bruno Borges wrote:
>>>>> Hi Severin,
>>>>>
>>>>> Issue created: 9066610.
>>>>
>>>> Thanks. We might need somebody from Oracle to push this one through to
>>>> https://bugs.openjdk.java.net/. 9XXXX bugs are AFAIK created by the
>>>> web-interface and need active triage to show up as JDK-8XXXX bugs.
>>>>
>>>>> Charlie has a fix to propose.
>>>>
>>>> Great! Looking forward to it.
>>>>
>>>> Thanks,
>>>> Severin
>>>>
>>>>
>>>

From harold.seigel at oracle.com Tue Sep 1 15:31:17 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 1 Sep 2020 11:31:17 -0400
Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities
Message-ID:

Hi,

Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250984

The modified tests were run on Linux kernels with and without swap limit capabilities.

Thanks, Harold

From erik.osterlund at oracle.com Tue Sep 1 15:53:18 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 1 Sep 2020 17:53:18 +0200
Subject: RFR: 8252661: Change SafepointMechanism terminology to talk less about "blocking"
Message-ID: <47e8f00d-dc4d-4ebe-922e-f086cf11d323@oracle.com>

Hi,

The SafepointMechanism class has been used to perform safepoint operations, originally. Now we also perform handshake operations, and soon also concurrent stack processing, using the same hooks. Therefore, names such as SafepointMechanism::should_block no longer sound right, when the real question is whether it should process a pending operation (be that a safepoint, handshake or whatever else). Naming is hard, so I don't want this discussion in my concurrent stack processing patch.
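
To make the naming concern concrete, here is a minimal, purely hypothetical sketch; it is not HotSpot code and not what the webrev contains. The names PendingOp, PollState, should_process and process_if_requested are invented for illustration. It only shows why a predicate named after "processing" fits better than one named after "blocking" once several kinds of operations share one poll:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration only: none of these names exist in HotSpot.
// One poll word is shared by several kinds of pending operations.
enum PendingOp : uint32_t {
  kNone      = 0,
  kSafepoint = 1u << 0,
  kHandshake = 1u << 1,
  kStackWork = 1u << 2   // e.g. concurrent stack processing
};

struct PollState {
  std::atomic<uint32_t> pending{kNone};

  // Request that the polling thread handle an operation.
  void arm(PendingOp op) {
    pending.fetch_or(op, std::memory_order_release);
  }

  // Named for what the caller must do (process something), not for one
  // particular reason to stop (blocking at a safepoint).
  bool should_process() const {
    return pending.load(std::memory_order_acquire) != kNone;
  }
};

// Clears and returns the mask of operations that were pending.
// A real VM would block, run a handshake closure, and so on at this point.
inline uint32_t process_if_requested(PollState& state) {
  if (!state.should_process()) {
    return kNone;
  }
  return state.pending.exchange(kNone, std::memory_order_acq_rel);
}
```

In this sketch a handshake request makes should_process() return true even though nothing is going to block, which is the mismatch the old name suggests.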
I have a webrev with proposed naming changes to better reflect how this is used:
http://cr.openjdk.java.net/~eosterlund/8252661/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8252661

Thanks,
/Erik

From bob.vandette at oracle.com Tue Sep 1 16:04:38 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 1 Sep 2020 12:04:38 -0400
Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities
In-Reply-To:
References:
Message-ID: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com>

I really dislike encoding all these strings in our tests that could possibly change.

I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes assuming that this file is not present when swap limiting is disabled. The problem with this approach and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2.

Others are struggling with these types of issues:

https://github.com/containers/podman/issues/6365

The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit could fail. Perhaps the test should be checking for not supported and fix the API implementation to report the correct error (if it doesn't already).

    /**
     * Returns the maximum amount of physical memory and swap space,
     * in bytes, that can be allocated in the Isolation Group.
     *
     * @return The maximum amount of memory in bytes or -1 if
     *         there is no limit set or -2 if this metric is not supported.
     *
     */
    public long getMemoryAndSwapLimit();

My .02$

Bob.

> On Sep 1, 2020, at 11:31 AM, Harold Seigel wrote:
>
> Hi,
>
> Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities.
>
> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250984
>
> The modified tests were run on Linux kernels with and without swap limit capabilities.
>
> Thanks, Harold
>

From gerard.ziemski at oracle.com Tue Sep 1 16:36:19 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Tue, 1 Sep 2020 11:36:19 -0500
Subject: Attention AIX developers - factoring out POSIX signal code (JDK-8252324)
In-Reply-To:
References: <022e6096-a388-dc79-f959-00a2d933ae89@oracle.com> <1d5921d1-1cb4-b286-a95b-7e2d4b60ced0@oracle.com>
Message-ID:

Thank you Martin, Thomas for your feedback.

On 8/31/20 12:20 PM, Doerr, Martin wrote:
>
> However, jdk/jdk doesn't support as400 PASE, so I'd be fine with using
> Gerard's new POSIX code and removing all AIX specific stuff which was
> built to support as400 PASE.
>
> I believe the semaphore stuff works on AIX, so we don't need "#if
> !defined(AIX)", right?
>

Just to make sure I understand it correctly: we don't need AIX specific semaphore code (i.e. local_sem_init(), local_sem_post() and local_sem_wait()), so we can remove it and go back to using runtime Semaphore for BSD/Linux/AIX?

What other PASE specific code is currently in signals_posix.cpp that should be removed? I see some extra signal unblocking code in os::signal() that is for AIX platform (with a comment that it applies to both AIX and PASE). That should stay in though, right?

Is this something you want me to do, or will you clean that up yourself as appropriate?

@Thomas, I assigned https://bugs.openjdk.java.net/browse/JDK-825253 to myself as per your suggestion.

cheers

From vladimir.kozlov at oracle.com Tue Sep 1 17:35:13 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 1 Sep 2020 10:35:13 -0700
Subject: 8248337: sparc related code clean up after solaris removal
In-Reply-To:
References:
Message-ID: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com>

On 8/31/20 6:18 PM, David Holmes wrote:
> Hi Yumin,
>
> On 1/09/2020 7:32 am, Yumin Qi wrote:
>> HI,
>>
>>    Please review for
>>
>>    bug: https://bugs.openjdk.java.net/browse/JDK-8248337
>>
>>    webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/
>>
>>
>>    Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some
>> comments are not correct, and some refer to wrong files.
>
> Those changes are mostly okay but I have a few minor issues/suggestions below.
>
>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch .
>
> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the
> pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC
> removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it
> can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and
> we go through the appropriate deprecate/obsolete/expire process.

The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table.
It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method from Node::pinned() which returns 'false' too.

And I agree with David that it should be done separately because it is product flag.

>
>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that.
>> Most of the patch are in comment section.
>
> It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code.

Please, leave the code as it is. As David said it is about normal spill slots for all platforms.
I am not sure it is SPARC specific currently with all platforms OpenJDK supports.
If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit the assert.

>
> Specific comments:
>
> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
>
> -// 64 bits items (sparc abi) even though java would only store
> +// 64 bits items even though java would only store
>
> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms?
>
> ---
>
> src/hotspot/cpu/arm/frame_arm.hpp (and other files)
>
>    // The interpreter and adapters will extend the frame of the caller.
>    // Since oopMaps are based on the sp of the caller before extension
> -  // we need to know that value. However in order to compute the address
> -  // of the return address we need the real "raw" sp. Since sparc already
> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
> -  // original sp we use that convention.
> +  // we need to know that value. However in order to compute the return
> +  // address we need the real "raw" sp.
>
> I think this is losing too much information as it no longer describes the convention. I would suggest:
>
>    // The interpreter and adapters will extend the frame of the caller.
>    // Since oopMaps are based on the sp of the caller before extension
>    // we need to know that value. However in order to compute the address
> -  // of the return address we need the real "raw" sp. Since sparc already
> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
> -  // original sp we use that convention.
> +  // of the return address we need the real "raw" sp. By convention we
> +  // use sp() to mean "raw" sp and unextended_sp() to mean the caller's
> +  // original sp.
>
> ---
>
> src/hotspot/cpu/ppc/jniTypes_ppc.hpp
>
> -  // stubGenerator_sparc.cpp) reverse the argument list constructed by
> +  // stubGenerator_${CPU}.cpp) reverse the argument list constructed by
>
> Just replace sparc with ppc as done for other platforms.
>
> ---
>
> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
>
> -  // This greatly simplifies the cases here compared to sparc.
> +  // This greatly simplifies the cases here.
>
> Just delete the comment as there is nothing to compare simplicity or complexity against.
>
> ---
>
> src/hotspot/share/c1/c1_LIRGenerator.cpp
>
> -  // In 64bit the type can be long, sparc doesn't have this assert
> +  // In 64bit the type can be long
>    // assert(offset.type()->tag() == intTag, "invalid type");
>
> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted.

Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit.

Thanks,
Vladimir K

>
> ---
>
> src/hotspot/share/c1/c1_Runtime1.cpp
>
> -  case handle_exception_nofpu_id:  // Unused on sparc
> +  case handle_exception_nofpu_id:  // unused.
>
> the new comment is incorrect as this case is not unused. I suggest just deleting the comment.
>
> Thanks,
> David
> -----
>
>>
>>
>>    Tests passed tier1-4
>>
>>
>>    Thanks
>>
>>    Yumin
>>

From yumin.qi at oracle.com Tue Sep 1 17:59:07 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Tue, 1 Sep 2020 10:59:07 -0700
Subject: 8248337: sparc related code clean up after solaris removal
In-Reply-To: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com>
References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com>
Message-ID: <7ee63f44-18e3-9567-a098-857c98638bdc@oracle.com>

HI, Vladimir

  Thanks for review!

On 9/1/20 10:35 AM, Vladimir Kozlov wrote:
> On 8/31/20 6:18 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> On 1/09/2020 7:32 am, Yumin Qi wrote:
>>> HI,
>>>
>>>    Please review for
>>>
>>>    bug: https://bugs.openjdk.java.net/browse/JDK-8248337
>>>
>>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/
>>>
>>>
>>>    Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some comments are not correct, and some refer to wrong files.
>>
>> Those changes are mostly okay but I have a few minor issues/suggestions below.
>>
>>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch .
>>
>> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and we go through the appropriate deprecate/obsolete/expire process.
>
> The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table.
> It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method from Node::pinned() which returns 'false' too.
>
> And I agree with David that it should be done separately because it is product flag.
>

I will file a bug for this to be handled separately.

>>
>>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch are in comment section.
>>
>> It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code.
>
> Please, leave the code as it is.
> As David said it is about normal spill slots for all platforms.
> I am not sure it is SPARC specific currently with all platforms OpenJDK supports.
> If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit the assert.
>

OK, I will keep it as it was. I will file a RFE and assign it to the right group (compiler) for further investigation.

>>
>> Specific comments:
>>
>> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
>>
>> -// 64 bits items (sparc abi) even though java would only store
>> +// 64 bits items even though java would only store
>>
>> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms?
>>
>> ---
>>
>> src/hotspot/cpu/arm/frame_arm.hpp (and other files)
>>
>>     // The interpreter and adapters will extend the frame of the caller.
>>     // Since oopMaps are based on the sp of the caller before extension
>> -  // we need to know that value. However in order to compute the address
>> -  // of the return address we need the real "raw" sp. Since sparc already
>> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
>> -  // original sp we use that convention.
>> +  // we need to know that value. However in order to compute the return
>> +  // address we need the real "raw" sp.
>>
>> I think this is losing too much information as it no longer describes the convention. I would suggest:
>>
>>     // The interpreter and adapters will extend the frame of the caller.
>>     // Since oopMaps are based on the sp of the caller before extension
>>     // we need to know that value. However in order to compute the address
>> -  // of the return address we need the real "raw" sp. Since sparc already
>> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
>> -  // original sp we use that convention.
>> +  // of the return address we need the real "raw" sp. By convention we
>> +  // use sp() to mean "raw" sp and unextended_sp() to mean the caller's
>> +  // original sp.
>>
>> ---
>>
>> src/hotspot/cpu/ppc/jniTypes_ppc.hpp
>>
>> -  // stubGenerator_sparc.cpp) reverse the argument list constructed by
>> +  // stubGenerator_${CPU}.cpp) reverse the argument list constructed by
>>
>> Just replace sparc with ppc as done for other platforms.
>>
>> ---
>>
>> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
>>
>> -  // This greatly simplifies the cases here compared to sparc.
>> +  // This greatly simplifies the cases here.
>>
>> Just delete the comment as there is nothing to compare simplicity or complexity against.
>>
>> ---
>>
>> src/hotspot/share/c1/c1_LIRGenerator.cpp
>>
>> -  // In 64bit the type can be long, sparc doesn't have this assert
>> +  // In 64bit the type can be long
>>     // assert(offset.type()->tag() == intTag, "invalid type");
>>
>> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted.
>
> Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit.
>

OK, will remove them all.

Thanks
Yumin

> Thanks,
> Vladimir K
>
>>
>> ---
>>
>> src/hotspot/share/c1/c1_Runtime1.cpp
>>
>> -  case handle_exception_nofpu_id:  // Unused on sparc
>> +  case handle_exception_nofpu_id:  // unused.
>>
>> the new comment is incorrect as this case is not unused. I suggest just deleting the comment.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>>
>>>    Tests passed tier1-4
>>>
>>>
>>>    Thanks
>>>
>>>    Yumin
>>>

From harold.seigel at oracle.com Tue Sep 1 19:02:42 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 1 Sep 2020 15:02:42 -0400
Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!"
Message-ID:

Hi,

Please review this change to hotspot test vmTestbase/nsk/stress/stack/stack016.java.
The test calls a recursive method and keeps track of the number of repetitions needed to cause an exception. It then runs a bunch of threads that call the recursive method for a multiple of the repetition number, expecting each of them to get a StackOverflowError or OutOfMemoryError exception. Occasionally, the test fails because one of the threads does not throw an exception.

This change tries to fix this in two ways. One, by making sure that the thread used to determine the number of repetitions gets a StackOverflowError or OutOfMemoryError exception, and not some other unexpected exception. The other way is to run the test twice, once with -Xcomp and once with -Xint, to ensure that thread stack consumption doesn't vary because the original thread called an interpreted method and a subsequent thread called a compiled method.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249

The modified test was tested on Mac OS, Linux x64, and Windows.

Thanks, Harold

From harold.seigel at oracle.com Tue Sep 1 19:43:58 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 1 Sep 2020 15:43:58 -0400
Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities
In-Reply-To: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com>
References: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com>
Message-ID: <781327d5-6c8b-d7a9-e011-2c4debafd1b6@oracle.com>

Hi Bob,

Thanks for looking at this!

I'll investigate the problem some more based on your suggestions.

Harold

On 9/1/2020 12:04 PM, Bob Vandette wrote:
> I really dislike encoding all these strings in our tests that could possibly change.
>
> I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes
> assuming that this file is not present when swap limiting is disabled.
> The problem with this approach
> and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2.
>
> Others are struggling with these types of issues:
>
> https://github.com/containers/podman/issues/6365
>
> The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit
> could fail. Perhaps the test should be checking for not supported and fix the API implementation
> to report the correct error (if it doesn't already).
>
>     /**
>      * Returns the maximum amount of physical memory and swap space,
>      * in bytes, that can be allocated in the Isolation Group.
>      *
>      * @return The maximum amount of memory in bytes or -1 if
>      *         there is no limit set or -2 if this metric is not supported.
>      *
>      */
>     public long getMemoryAndSwapLimit();
>
> My .02$
>
> Bob.
>
>> On Sep 1, 2020, at 11:31 AM, Harold Seigel wrote:
>>
>> Hi,
>>
>> Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities.
>>
>> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250984
>>
>> The modified tests were run on Linux kernels with and without swap limit capabilities.
>>
>> Thanks, Harold
>>

From sgehwolf at redhat.com Tue Sep 1 19:55:51 2020
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 01 Sep 2020 21:55:51 +0200
Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities
In-Reply-To:
References:
Message-ID: <8b8a0f3ef269096bb958bf045d7f60fa0de8dd4e.camel@redhat.com>

Hi,

On Tue, 2020-09-01 at 11:31 -0400, Harold Seigel wrote:
> Hi,
>
> Please review this fix to enable docker tests TestMemoryAwareness.java
> and TestDockerMemoryMetrics.java to run on Linux kernels configured
> without swap limit capabilities.
>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html

Can we be sure that the message is the same for docker and podman for example? The message seems docker specific[1]. I'll try to test this on a podman system with swap turned off.

Thanks,
Severin

[1] https://github.com/moby/moby/blob/5fc12449d830ae9005138fb3d3782728fa8d137a/daemon/daemon_unix.go#L368

> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250984
>
> The modified tests were run on Linux kernels with and without swap limit
> capabilities.
>
> Thanks, Harold
>

From martin.doerr at sap.com Tue Sep 1 20:11:51 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 1 Sep 2020 20:11:51 +0000
Subject: Attention AIX developers - factoring out POSIX signal code (JDK-8252324)
In-Reply-To:
References: <022e6096-a388-dc79-f959-00a2d933ae89@oracle.com> <1d5921d1-1cb4-b286-a95b-7e2d4b60ced0@oracle.com>
Message-ID:

Hi Gerard,

it compiles without errors and warnings with the following patch.
I don't think we still need this old stuff, but I hope Thomas can find some time to double-check.
I haven't run it through our testing, yet.

Best regards,
Martin

diff -r ec42084221a6 src/hotspot/os/aix/os_aix.cpp
--- a/src/hotspot/os/aix/os_aix.cpp Tue Sep 01 21:54:57 2020 +0200
+++ b/src/hotspot/os/aix/os_aix.cpp Tue Sep 01 22:05:45 2020 +0200
@@ -1563,61 +1563,16 @@
   os::signal_notify(sig);
 }

-void* os::user_handler() {
-  return CAST_FROM_FN_PTR(void*, UserHandler);
-}
-
 extern "C" {
   typedef void (*sa_handler_t)(int);
   typedef void (*sa_sigaction_t)(int, siginfo_t *, void *);
 }

-void* os::signal(int signal_number, void* handler) {
-  struct sigaction sigAct, oldSigAct;
-
-  sigfillset(&(sigAct.sa_mask));
-
-  // Do not block out synchronous signals in the signal handler.
-  // Blocking synchronous signals only makes sense if you can really
-  // be sure that those signals won't happen during signal handling,
-  // when the blocking applies. Normal signal handlers are lean and
-  // do not cause signals. But our signal handlers tend to be "risky"
-  // - secondary SIGSEGV, SIGILL, SIGBUS' may and do happen.
-  // On AIX, PASE there was a case where a SIGSEGV happened, followed
-  // by a SIGILL, which was blocked due to the signal mask. The process
-  // just hung forever. Better to crash from a secondary signal than to hang.
-  sigdelset(&(sigAct.sa_mask), SIGSEGV);
-  sigdelset(&(sigAct.sa_mask), SIGBUS);
-  sigdelset(&(sigAct.sa_mask), SIGILL);
-  sigdelset(&(sigAct.sa_mask), SIGFPE);
-  sigdelset(&(sigAct.sa_mask), SIGTRAP);
-
-  sigAct.sa_flags = SA_RESTART|SA_SIGINFO;
-
-  sigAct.sa_handler = CAST_TO_FN_PTR(sa_handler_t, handler);
-
-  if (sigaction(signal_number, &sigAct, &oldSigAct)) {
-    // -1 means registration failed
-    return (void *)-1;
-  }
-
-  return CAST_FROM_FN_PTR(void*, oldSigAct.sa_handler);
-}
-
-void os::signal_raise(int signal_number) {
-  ::raise(signal_number);
-}
-
 //
 // The following code is moved from os.cpp for making this
 // code platform specific, which it is by its very nature.
 //

-// Will be modified when max signal is changed to be dynamic
-int os::sigexitnum_pd() {
-  return NSIG;
-}
-
 // a counter for each possible signal value
 static volatile jint pending_signals[NSIG+1] = { 0 };

@@ -1687,11 +1642,6 @@
   local_sem_init();
 }

-void os::signal_notify(int sig) {
-  Atomic::inc(&pending_signals[sig]);
-  local_sem_post();
-}
-
 static int check_pending_signals() {
   for (;;) {
     for (int i = 0; i < NSIG + 1; i++) {
@@ -1728,9 +1678,6 @@
   }
 }

-int os::signal_wait() {
-  return check_pending_signals();
-}

 ////////////////////////////////////////////////////////////////////////////////
 // Virtual Memory
diff -r ec42084221a6 src/hotspot/os/posix/signals_posix.cpp
--- a/src/hotspot/os/posix/signals_posix.cpp Tue Sep 01 21:54:57 2020 +0200
+++ b/src/hotspot/os/posix/signals_posix.cpp Tue Sep 01 22:05:45 2020 +0200
@@ -95,12 +95,7 @@
 #endif

 // sun.misc.Signal support
-#if !defined(AIX)
-  static Semaphore* sig_semaphore = NULL;
-#else
-  static sem_t sig_semaphore;
-  static msemaphore* p_sig_msem = 0;
-#endif
+static Semaphore* sig_semaphore = NULL;

 // a counter for each possible signal value
 static volatile jint pending_signals[NSIG+1] = { 0 };
@@ -272,71 +267,17 @@
 // sun.misc.Signal support

 static void local_sem_init() {
-#if !defined(AIX)
   sig_semaphore = new Semaphore();
-#else
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_init(&sig_semaphore, 0, 0);
-    guarantee(rc != -1, "sem_init failed");
-  } else {
-    // Memory semaphores must live in shared mem.
-    guarantee0(p_sig_msem == NULL);
-    p_sig_msem = (msemaphore*)os::reserve_memory(sizeof(msemaphore), NULL);
-    guarantee(p_sig_msem, "Cannot allocate memory for memory semaphore");
-    guarantee(::msem_init(p_sig_msem, 0) == p_sig_msem, "msem_init failed");
-  }
-#endif
 }

 // Wrapper functions for: sem_init(), sem_post(), sem_wait()
-// On AIX, we use sem_init(), sem_post(), sem_wait()
-// On Pase, we need to use msem_lock() and msem_unlock(), because Posix Semaphores
-// do not seem to work at all on PASE (unimplemented, will cause SIGILL).
-// Note that just using msem_.. APIs for both PASE and AIX is not an option either, as
-// on AIX, msem_..() calls are suspected of causing problems.

 static void local_sem_post() {
-#if !defined(AIX)
   sig_semaphore->signal();
-#else
-  static bool warn_only_once = false;
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_post(&sig_semaphore);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("sem_post failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  } else {
-    guarantee0(p_sig_msem != NULL);
-    int rc = ::msem_unlock(p_sig_msem, 0);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("msem_unlock failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  }
-  #endif
 }

 static void local_sem_wait() {
-#if !defined(AIX)
   sig_semaphore->wait();
-#else
-  static bool warn_only_once = false;
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_wait(&sig_semaphore);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("sem_wait failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  } else {
-    guarantee0(p_sig_msem != NULL); // must init before use
-    int rc = ::msem_lock(p_sig_msem, 0);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("msem_lock failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  }
-  #endif
 }

 void PosixSignals::jdk_misc_signal_init() {
@@ -1573,7 +1514,6 @@
   os::SuspendResume::State current = osthread->sr.state();

   // TODO: reconcile the differences betrween Linux/BSD vs AIX here!
-#if !defined(AIX)
   if (current == os::SuspendResume::SR_SUSPEND_REQUEST) {
     suspend_save_context(osthread, siginfo, context);

@@ -1617,45 +1557,6 @@
   } else {
     // ignore
   }
-#else
-  if (current == os::SuspendResume::SR_SUSPEND_REQUEST) {
-    suspend_save_context(osthread, siginfo, context);
-
-    // attempt to switch the state, we assume we had a SUSPEND_REQUEST
-    os::SuspendResume::State state = osthread->sr.suspended();
-    if (state == os::SuspendResume::SR_SUSPENDED) {
-      sigset_t suspend_set;  // signals for sigsuspend()
-      sigemptyset(&suspend_set);
-
-      // get current set of blocked signals and unblock resume signal
-      pthread_sigmask(SIG_BLOCK, NULL, &suspend_set);
-      sigdelset(&suspend_set, SR_signum);
-
-      // wait here until we are resumed
-      while (1) {
-        sigsuspend(&suspend_set);
-
-        os::SuspendResume::State result = osthread->sr.running();
-        if (result == os::SuspendResume::SR_RUNNING) {
-          break;
-        }
-      }
-
-    } else if (state == os::SuspendResume::SR_RUNNING) {
-      // request was cancelled, continue
-    } else {
-      ShouldNotReachHere();
-    }
-
-    resume_clear_context(osthread);
-  } else if (current == os::SuspendResume::SR_RUNNING) {
-    // request was cancelled, continue
-  } else if (current == os::SuspendResume::SR_WAKEUP_REQUEST) {
-    // ignore
-  } else {
-    ShouldNotReachHere();
-  }
-#endif

   errno = old_errno;
 }
@@ -1721,7 +1622,6 @@
   }

   // TODO: reconcile the differences betrween Linux/BSD vs AIX here!
-#if !defined(AIX) if (sr_notify(osthread) != 0) { ShouldNotReachHere(); } @@ -1745,44 +1645,6 @@ } } } -#else - if (sr_notify(osthread) != 0) { - // try to cancel, switch to running - - os::SuspendResume::State result = osthread->sr.cancel_suspend(); - if (result == os::SuspendResume::SR_RUNNING) { - // cancelled - return false; - } else if (result == os::SuspendResume::SR_SUSPENDED) { - // somehow managed to suspend - return true; - } else { - ShouldNotReachHere(); - return false; - } - } - - // managed to send the signal and switch to SUSPEND_REQUEST, now wait for SUSPENDED - - for (int n = 0; !osthread->sr.is_suspended(); n++) { - for (int i = 0; i < RANDOMLY_LARGE_INTEGER2 && !osthread->sr.is_suspended(); i++) { - os::naked_yield(); - } - - // timeout, try to cancel the request - if (n >= RANDOMLY_LARGE_INTEGER) { - os::SuspendResume::State cancelled = osthread->sr.cancel_suspend(); - if (cancelled == os::SuspendResume::SR_RUNNING) { - return false; - } else if (cancelled == os::SuspendResume::SR_SUSPENDED) { - return true; - } else { - ShouldNotReachHere(); - return false; - } - } - } -#endif guarantee(osthread->sr.is_suspended(), "Must be suspended"); return true; @@ -1799,7 +1661,6 @@ } // TODO: reconcile the differences betrween Linux/BSD vs AIX here! 
-#if !defined(AIX) while (true) { if (sr_notify(osthread) == 0) { if (sr_semaphore.timedwait(2)) { @@ -1811,19 +1672,6 @@ ShouldNotReachHere(); } } -#else - while (!osthread->sr.is_running()) { - if (sr_notify(osthread) == 0) { - for (int n = 0; n < RANDOMLY_LARGE_INTEGER && !osthread->sr.is_running(); n++) { - for (int i = 0; i < 100 && !osthread->sr.is_running(); i++) { - os::naked_yield(); - } - } - } else { - ShouldNotReachHere(); - } - } -#endif - + guarantee(osthread->sr.is_running(), "Must be running!"); } diff -r ec42084221a6 src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp --- a/src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp Tue Sep 01 21:54:57 2020 +0200 +++ b/src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp Tue Sep 01 22:05:45 2020 +0200 @@ -51,6 +51,7 @@ #include "runtime/stubRoutines.hpp" #include "runtime/thread.inline.hpp" #include "runtime/timer.hpp" +#include "signals_posix.hpp" #include "utilities/events.hpp" #include "utilities/vmError.hpp" #ifdef COMPILER1 @@ -225,7 +226,7 @@ JavaThread* thread = NULL; VMThread* vmthread = NULL; - if (os::Aix::signal_handlers_are_installed) { + if (PosixSignals::are_signal_handlers_installed()) { if (t != NULL) { if(t->is_Java_thread()) { thread = (JavaThread*)t; From: gerard ziemski Sent: Tuesday, 1 September 2020 18:36 To: Doerr, Martin ; Thomas Stüfe Cc: hotspot-runtime-dev at openjdk.java.net; Schmidt, Lutz ; Lindenmaier, Goetz ; Stuefe, Thomas Subject: Re: Attention AIX developers - factoring out POSIX signal code (JDK-8252324) Thank you Martin, Thomas for your feedback. On 8/31/20 12:20 PM, Doerr, Martin wrote: However, jdk/jdk doesn't support as400 PASE, so I'd be fine with using Gerard's new POSIX code and removing all AIX specific stuff which was built to support as400 PASE. I believe the semaphore stuff works on AIX, so we don't need "#if !defined(AIX)", right? Just to make sure I understand it correctly: we don't need AIX specific semaphore code (i.e.
local_sem_init(), local_sem_post() and local_sem_wait()), so we can remove it and go back to using runtime Semaphore for BSD/Linux/AIX? What other PASE specific code is currently in signals_posix.cpp that should be removed? I see some extra signal unblocking code in os::signal() that is for the AIX platform (with a comment that it applies to both AIX and PASE). That should stay in though, right? Is this something you want me to do, or will you clean that up yourself as appropriate? @Thomas, I assigned https://bugs.openjdk.java.net/browse/JDK-825253 to myself as per your suggestion. cheers From igor.ignatyev at oracle.com Tue Sep 1 20:21:56 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 1 Sep 2020 13:21:56 -0700 Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath") In-Reply-To: <226c299e-4661-1388-5382-daeca3780475@oracle.com> References: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com> <226c299e-4661-1388-5382-daeca3780475@oracle.com> Message-ID: <964517DC-D9E0-41AC-A838-0B1F89FC4191@oracle.com> Hi Serguei, David, thanks for your reviews, I've updated the patch to address David's comments: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.01/ (whole) http://cr.openjdk.java.net/~iignatyev//8252532/webrev.0-1/ (incremental) Thanks, -- Igor > On Sep 1, 2020, at 1:39 AM, serguei.spitsyn at oracle.com wrote: > > Hi Igor, > > This looks fine to me too. > I also agree with David's suggestions. > > Thanks, > Serguei > > > On 8/30/20 21:53, David Holmes wrote: >> Hi Igor, >> >> On 29/08/2020 1:52 pm, Igor Ignatyev wrote: >>> http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >>>> 145 lines changed: 28 ins; 22 del; 95 mod; >>> >>> >>> Hi all, >>> >>> could you please review this trivial clean up which replaces System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where appropriate?
>>> >>> while updating these files, I've also cleaned them up a bit, removed unneeded imports, added/removed spaces, etc >>> >>> testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on {linux,windows,macos}-x64 >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252532 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >> >> Generally seems fine (though the fact the patch file contained a series of changesets threw me initially!) >> >> test/hotspot/jtreg/runtime/signal/SigTestDriver.java >> >> // add test specific arguments w/o signame >> cmd.addAll(Arrays.asList(args) >> - .subList(1, args.length)); >> + .subList(1, args.length)); >> >> Your changed line doesn't have the right indent. Can this just be put on one line anyway: >> >> // add test specific arguments w/o signame >> cmd.addAll(Arrays.asList(args).subList(1, args.length)); >> >> that seems better to me as the fact there is only one argument seems clearer. Though for greater clarity perhaps: >> >> // add test specific arguments w/o signame >> var argList = Arrays.asList(args).subList(1, args.length); >> cmd.addAll(argList); >> agree >> -- >> >> + Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) >> + .filter(s -> !s.isEmpty()) >> + .filter(s -> s.startsWith("-X")) >> + .flatMap(arg -> Stream.of("-vmopt", arg)) >> + .collect(Collectors.toList()); >> >> The preferred/common style for chained stream operations is to align the dots: >> >> Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) >> .filter(s -> !s.isEmpty()) >> .filter(s -> s.startsWith("-X")) >> .flatMap(arg -> Stream.of("-vmopt", arg)) >> .collect(Collectors.toList()); >> `collect` is actually called on the result of `Stream.concat`, anyhow I've aligned the chained calls by dots. 
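For readers following the two style points above, here is a minimal, self-contained sketch of both idioms: a `Stream.concat` pipeline with the chained calls aligned on the dots, and the `subList(1, args.length)` call that drops the leading signal name. The class name, the option strings and the test arguments are hypothetical and are not taken from SigTestDriver.java; the real test reads its options from `Utils`, which is replaced here by plain parameters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not the actual SigTestDriver code.
public class SigTestArgsSketch {

    // Keeps only the -X options from both (hypothetical) option strings and
    // prefixes each one with "-vmopt". Note that filter/flatMap/collect are
    // applied to the result of Stream.concat, not to the second stream alone.
    static List<String> vmOptions(String firstOpts, String secondOpts) {
        return Stream.concat(Arrays.stream(firstOpts.split(" ")),
                             Arrays.stream(secondOpts.split(" ")))
                     .filter(s -> !s.isEmpty())
                     .filter(s -> s.startsWith("-X"))
                     .flatMap(arg -> Stream.of("-vmopt", arg))
                     .collect(Collectors.toList());
    }

    // Returns the test specific arguments without the leading signal name.
    // Assumes args has at least one element (the signal name).
    static List<String> testArgsWithoutSigname(String[] args) {
        return Arrays.asList(args).subList(1, args.length);
    }

    public static void main(String[] unused) {
        // Made-up inputs for illustration only.
        List<String> cmd = new ArrayList<>(vmOptions("-Xmx64m  -ea", "-Xint"));
        var argList = testArgsWithoutSigname(new String[] {"SIGUSR1", "-sigaction", "-reset"});
        cmd.addAll(argList);
        System.out.println(cmd);
    }
}
```

Naming the intermediate `argList`, as David suggests, makes it obvious that `addAll` receives a single argument.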
>> --- >> >> test/lib/jdk/test/lib/process/ProcessTools.java >> >> - System.out.println("\t" + t + >> - " stack: (length = " + stack.length + ")"); >> + System.out.println("\t" + t + >> + " stack: (length = " + stack.length + ")"); I've decided to put this on one line. >> >> The original code is more stylistically correct - when breaking arguments across lines the indent should align with the start of the arguments. >> >> Similarly here: >> >> + return String.format("--- ProcessLog ---%n" + >> + "cmd: %s%n" + >> + "exitvalue: %s%n" + >> + "stderr: %s%n" + >> + "stdout: %s%n", >> + getCommandLine(pb), exitValue, stderr, stdout); >> >> should be: >> >> + return String.format("--- ProcessLog ---%n" + >> + "cmd: %s%n" + >> + "exitvalue: %s%n" + >> + "stderr: %s%n" + >> + "stdout: %s%n", >> + getCommandLine(pb), exitValue, stderr, stdout); fixed >> >> and here: >> >> + String executable = Paths.get(Utils.TEST_NATIVE_PATH, executableName) >> + .toAbsolutePath() >> + .toString(); fixed >> >> indentation again. >> >> Thanks, >> David >> ----- >> >>> Thanks, >>> -- Igor >>> > From yumin.qi at oracle.com Tue Sep 1 22:40:13 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Tue, 1 Sep 2020 15:40:13 -0700 Subject: 8248337: sparc related code clean up after solaris removal In-Reply-To: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> Message-ID: <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> Hi Vladimir and David, I have uploaded an updated webrev at: http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-02/ I filed two issues to address your concerns separately: 1) 8252681: Retire flag UseRDPCForConstantTableBase after solaris removal https://bugs.openjdk.java.net/browse/JDK-8252681 2) 8252682: investigate PhaseChaitin::post_allocate_copy_removal after solaris removal https://bugs.openjdk.java.net/browse/JDK-8252682
So the following three files are left unchanged: share/opto/c2_globals.hpp share/opto/machnode.hpp share/opto/postaloc.cpp I also updated the copyright year in several files. Thanks Yumin On 9/1/20 10:35 AM, Vladimir Kozlov wrote: > On 8/31/20 6:18 PM, David Holmes wrote: >> Hi Yumin, >> >> On 1/09/2020 7:32 am, Yumin Qi wrote: >>> HI, >>> >>> Please review for >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8248337 >>> >>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/ >>> >>> >>> Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some comments are not correct, and some refer to wrong files. >> >> Those changes are mostly okay but I have a few minor issues/suggestions below. >> >>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch . >> >> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and we go through the appropriate deprecate/obsolete/expire process. > > The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table. > It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method from Node::pinned() which returns 'false' too. > > And I agree with David that it should be done separately because it is product flag. > >> >>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch are in comment section. >> >> It refers to spill slot not delay slot.
I don't see anything obviously sparc specific about that block of code. > > Please, leave the code as it is. As David said it is about normal spill slots for all platforms. > I am not sure it is SPARC specific currently with all platforms OpenJDK supports. > If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit the assert. > >> >> Specific comments: >> >> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp >> >> -// 64 bits items (sparc abi) even though java would only store >> +// 64 bits items even though java would only store >> >> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms? >> >> --- >> >> src/hotspot/cpu/arm/frame_arm.hpp (and other files) >> >> ??? // The interpreter and adapters will extend the frame of the caller. >> ??? // Since oopMaps are based on the sp of the caller before extension >> -? // we need to know that value. However in order to compute the address >> -? // of the return address we need the real "raw" sp. Since sparc already >> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >> -? // original sp we use that convention. >> +? // we need to know that value. However in order to compute the return >> +? // address we need the real "raw" sp. >> >> I think this is losing too much information as it no longer describes the convention. I would suggest: >> >> ??? // The interpreter and adapters will extend the frame of the caller. >> ??? // Since oopMaps are based on the sp of the caller before extension >> ??? // we need to know that value. However in order to compute the address >> -? // of the return address we need the real "raw" sp. Since sparc already >> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >> -? // original sp we use that convention. >> +? // of the return address we need the real "raw" sp. By convention we >> +? 
// use sp() to mean "raw" sp and unextended_sp() to mean the caller's >> +? // original sp. >> >> --- >> >> src/hotspot/cpu/ppc/jniTypes_ppc.hpp >> >> -? // stubGenerator_sparc.cpp) reverse the argument list constructed by >> +? // stubGenerator_${CPU}.cpp) reverse the argument list constructed by >> >> Just replace sparc with ppc as done for other platforms. >> >> --- >> >> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp >> >> -? // This greatly simplifies the cases here compared to sparc. >> +? // This greatly simplifies the cases here. >> >> Just delete the comment as there is nothing to compare simplicity or complexity against. >> >> --- >> >> src/hotspot/share/c1/c1_LIRGenerator.cpp >> >> -? // In 64bit the type can be long, sparc doesn't have this assert >> +? // In 64bit the type can be long >> ??? // assert(offset.type()->tag() == intTag, "invalid type"); >> >> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted. > > Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit. > > Thanks, > Vladimir K > >> >> --- >> >> src/hotspot/share/c1/c1_Runtime1.cpp >> >> -? case handle_exception_nofpu_id:? // Unused on sparc >> +? case handle_exception_nofpu_id:? // unused. >> >> the new comment is incorrect as this case is not unused. I suggest just deleting the comment. >> >> Thanks, >> David >> ----- >> >>> >>> >>> ?? Tests passed tier1-4 >>> >>> >>> ?? Thanks >>> >>> ?? Yumin >>> From thomas.stuefe at gmail.com Tue Sep 1 22:45:56 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 2 Sep 2020 00:45:56 +0200 Subject: Attention AIX developers - factoring out POSIX signal code (JDK-8252324) In-Reply-To: References: <022e6096-a388-dc79-f959-00a2d933ae89@oracle.com> <1d5921d1-1cb4-b286-a95b-7e2d4b60ced0@oracle.com> Message-ID: Hi Martin, off hand looks good. Thanks for taking care of this. 
Cheers, Thomas On Tue, Sep 1, 2020 at 10:11 PM Doerr, Martin wrote: > Hi Gerard, > > it compiles without errors and warnings with the following patch. I don't think we still need this old stuff, but I hope Thomas can find some time to double-check. > > I haven't run it through our testing, yet. > > Best regards, > > Martin > > [...]
From david.holmes at oracle.com Tue Sep 1 23:41:38 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Sep 2020 09:41:38 +1000 Subject: RFR(M) 8252526 Remove excessive inclusion of jvmti.h and jvmtiExport.hpp In-Reply-To: <7f275aab-565d-f566-702e-2e6dd6a6d3d7@oracle.com> References: <610aa37d-3f5e-cae7-197a-b3294b217aac@oracle.com> <198a1835-a17b-a83f-145b-e49f5ed37e10@oracle.com> <687bda99-e689-7c53-ee29-f624029f1d1b@oracle.com> <7f275aab-565d-f566-702e-2e6dd6a6d3d7@oracle.com> Message-ID: On 1/09/2020 11:34 pm, Daniel D. Daugherty wrote: > On 9/1/20 9:24 AM, Ioi Lam wrote: >> >> >> On 8/31/20 9:38 PM, David Holmes wrote: >>> On 1/09/2020 2:00 pm, Ioi Lam wrote: >>>> On 8/31/20 6:17 PM, Ioi Lam wrote: >>>>> On 8/31/20 4:05 PM, David Holmes wrote: >>>>>> Hi Ioi, >>>>>> >>>>>> I haven't looked at the code changes ...
>>>>>> >>>>>> On 1/09/2020 4:13 am, Ioi Lam wrote: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8252526 >>>>>>> http://cr.openjdk.java.net/~iklam/jdk16/8252526-fix-jvmti-hpp.v01/ >>>>>>> >>>>>>> (I marked this RFR as "M" because 63 files have changed, but most >>>>>>> of the are >>>>>>> just adding a missing #include "prims/jvmtiExport.hpp"). >>>>>>> >>>>>>> jvmti.h is included 905 times and jvmtiExport.hpp is included 776 >>>>>>> times >>>>>>> (by 971 hotspot .o files). Most of these are unnecessarily >>>>>>> included by the >>>>>>> following 3 popular header files: >>>>>>> >>>>>>> [1] javaClasses.hpp: ThreadStatus is rarely used, and should be >>>>>>> moved >>>>>>> ?? ? to javaThreadStatus.hpp. I also converted the enum to an C++ 11 >>>>>>> ?? ? enum class for better type safety. (see also JDK-8247938). >>>>>>> >>>>>>> [2] os.hpp: No need to include jvm.h. Use forward declaration >>>>>>> ?? ? "typedef struct _jvmtiTimerInfo jvmtiTimerInfo;" instead. >>>>>> >>>>>> That does not seem reasonable to me. It is one thing to do a >>>>>> simple forward declaration of a class but this is an internal >>>>>> detail of JVMTI which os.hpp has no business knowing about. >>>>>> >>>>> >>>>> How about changing jvmti.h from: >>>>> >>>>> struct _jvmtiTimerInfo; >>>>> typedef struct _jvmtiTimerInfo jvmtiTimerInfo; >>>>> >>>>> to >>>>> >>>>> struct jvmtiTimerInfo; >>>>> typedef struct jvmtiTimerInfo jvmtiTimerInfo; >>>>> >>>>> Then os.hpp can declare: >>>>> >>>>> struct jvmtiTimerInfo; >>> >>> Does that actually work? If so that's equivalent to "class Foo;" >>> forward declarations and so is acceptable. >> Yes, it works, but I would need to file a CSR because jvmti.h is >> included in the JDK .... > > Yes, jvmti.h is included in the JDK, but the name exposed by the > spec is 'jvmtiTimerInfo'. The '_jvmtiTimerInfo' name isn't exposed by > the spec at all. > > You could change the name to 'DO_NOT_USE_THIS_NAME_jvmtiTimerInfo' and > nothing should break from a JVM/TI spec POV. 
Of course, if there is code > out in the world that depended on that name, then it will break, but I > would argue that code is broken. > > Short version: I think '_jvmtiTimerInfo' is an implementation detail > and you don't need a CSR for correctness. You might want a CSR for > advice since this is an odd situation, but, strictly from an API POV, > I don't think you need one. I agree that no CSR is needed: this does not affect any exported API, nor is there a behaviour change via an exported API. David ----- > > Dan > > >> >> Thanks >> - Ioi >> >> ======= >> $ hg diff >> diff -r aaa4245df83a src/hotspot/share/prims/jvmtiH.xsl >> --- a/src/hotspot/share/prims/jvmtiH.xsl Mon Aug 31 11:03:13 2020 >> -0700 >> +++ b/src/hotspot/share/prims/jvmtiH.xsl Mon Aug 31 23:24:27 2020 >> -0700 >> @@ -406,11 +406,11 @@ >> >> >> >> - struct _ >> + struct >> >> ; >> >> - typedef struct _ >> + typedef struct >> >> >> >> @@ -419,7 +419,7 @@ >> >> >> >> - struct _ >> + struct >> >> { >> >> diff -r aaa4245df83a src/hotspot/share/runtime/os.hpp >> --- a/src/hotspot/share/runtime/os.hpp Mon Aug 31 11:03:13 2020 -0700 >> +++ b/src/hotspot/share/runtime/os.hpp Mon Aug 31 23:24:27 2020 -0700 >> @@ -51,7 +51,7 @@ >> class OSThread; >> class Mutex; >> >> -typedef struct _jvmtiTimerInfo jvmtiTimerInfo; >> +struct jvmtiTimerInfo; >> >> template class GrowableArray; >> ======= >> >> >> >> >> >> >> > From vladimir.kozlov at oracle.com Tue Sep 1 23:48:58 2020 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Sep 2020 16:48:58 -0700 Subject: 8248337: sparc related code clean up after solaris removal In-Reply-To: <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> Message-ID: Looks good. Thanks, Vladimir K On 9/1/20 3:40 PM, Yumin Qi wrote: > HI, Vladimir and David > >
I have updated new webrev at: http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-02/ > > ?? Filed two issues to address your concern separately: > > ?? 1) 8252681: Retire flag UseRDPCForConstantTableBase after solaris removal > > https://bugs.openjdk.java.net/browse/JDK-8252681 > > ? 2) 8252682: investigate PhaseChaitin::post_allocate_copy_removal after solaris removal > > https://bugs.openjdk.java.net/browse/JDK-8252682 > > > ? So following three files leave no change: > > share/opto/c2_globals.hpp > > share/opto/machnode.hpp > > share/opto/postaloc.cpp > > ? Also update some copyright year for several files. > > > Thanks > > Yumin > > > On 9/1/20 10:35 AM, Vladimir Kozlov wrote: >> On 8/31/20 6:18 PM, David Holmes wrote: >>> Hi Yumin, >>> >>> On 1/09/2020 7:32 am, Yumin Qi wrote: >>>> HI, >>>> >>>> ?? Please review for >>>> >>>> ?? bug: https://bugs.openjdk.java.net/browse/JDK-8248337 >>>> >>>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/ >>>> >>>> >>>> ?? Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some >>>> comments are not correct, and some refer to wrong files. >>> >>> Those changes are mostly okay but I have a few minor issues/suggestions below. >>> >>>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch . >>> >>> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the >>> pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the >>> SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether >>> it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out >>> and we go through the appropriate deprecate/obsolete/expire process. 
>> >> The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table. >> It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method >> from Node::pinned() which returns 'false' too. >> >> And I agree with David that it should be done separately because it is product flag. >> >>> >>>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch >>>> are in comment section. >>> >>> It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code. >> >> Please, leave the code as it is. As David said it is about normal spill slots for all platforms. >> I am not sure it is SPARC specific currently with all platforms OpenJDK supports. >> If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit >> the assert. >> >>> >>> Specific comments: >>> >>> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp >>> >>> -// 64 bits items (sparc abi) even though java would only store >>> +// 64 bits items even though java would only store >>> >>> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms? >>> >>> --- >>> >>> src/hotspot/cpu/arm/frame_arm.hpp (and other files) >>> >>> ??? // The interpreter and adapters will extend the frame of the caller. >>> ??? // Since oopMaps are based on the sp of the caller before extension >>> -? // we need to know that value. However in order to compute the address >>> -? // of the return address we need the real "raw" sp. Since sparc already >>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >>> -? // original sp we use that convention. >>> +? // we need to know that value. However in order to compute the return >>> +? // address we need the real "raw" sp. 
>>> >>> I think this is losing too much information as it no longer describes the convention. I would suggest: >>> >>> ??? // The interpreter and adapters will extend the frame of the caller. >>> ??? // Since oopMaps are based on the sp of the caller before extension >>> ??? // we need to know that value. However in order to compute the address >>> -? // of the return address we need the real "raw" sp. Since sparc already >>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >>> -? // original sp we use that convention. >>> +? // of the return address we need the real "raw" sp. By convention we >>> +? // use sp() to mean "raw" sp and unextended_sp() to mean the caller's >>> +? // original sp. >>> >>> --- >>> >>> src/hotspot/cpu/ppc/jniTypes_ppc.hpp >>> >>> -? // stubGenerator_sparc.cpp) reverse the argument list constructed by >>> +? // stubGenerator_${CPU}.cpp) reverse the argument list constructed by >>> >>> Just replace sparc with ppc as done for other platforms. >>> >>> --- >>> >>> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp >>> >>> -? // This greatly simplifies the cases here compared to sparc. >>> +? // This greatly simplifies the cases here. >>> >>> Just delete the comment as there is nothing to compare simplicity or complexity against. >>> >>> --- >>> >>> src/hotspot/share/c1/c1_LIRGenerator.cpp >>> >>> -? // In 64bit the type can be long, sparc doesn't have this assert >>> +? // In 64bit the type can be long >>> ??? // assert(offset.type()->tag() == intTag, "invalid type"); >>> >>> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted. >> >> Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit. >> >> Thanks, >> Vladimir K >> >>> >>> --- >>> >>> src/hotspot/share/c1/c1_Runtime1.cpp >>> >>> -? case handle_exception_nofpu_id:? // Unused on sparc >>> +? case handle_exception_nofpu_id:? // unused. 
>>> >>> the new comment is incorrect as this case is not unused. I suggest just deleting the comment. >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> >>>> ?? Tests passed tier1-4 >>>> >>>> >>>> ?? Thanks >>>> >>>> ?? Yumin >>>> From david.holmes at oracle.com Tue Sep 1 23:58:46 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Sep 2020 09:58:46 +1000 Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath") In-Reply-To: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com> References: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com> Message-ID: <9e962e6c-79c7-35c2-621d-e55a9173ece9@oracle.com> Hi Igor, Changes seem fine (incremental webrev didn't show them all though). There are other pre-existing indentation issues in those files but this is fine. Thanks, David On 31/08/2020 2:53 pm, David Holmes wrote: > Hi Igor, > > On 29/08/2020 1:52 pm, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >>> 145 lines changed: 28 ins; 22 del; 95 mod; >> >> >> Hi all, >> >> could you please review this trivial clean up which replaces >> System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where >> appropriate? >> >> while updating these files, I've also cleaned them up a bit, removed >> unneeded imports, added/removed spaces, etc >> >> testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on >> {linux,windows,macos}-x64 >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252532 >> webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 > > Generally seems fine (though the fact the patch file contained a series > of changesets threw me initially!) > > test/hotspot/jtreg/runtime/signal/SigTestDriver.java > > ???????? // add test specific arguments w/o signame > ???????? cmd.addAll(Arrays.asList(args) > -???????????????????????? .subList(1, args.length)); > +??????????????? .subList(1, args.length)); > > Your changed line doesn't have the right indent. 
Can this just be put on > one line anyway: > > ???????? // add test specific arguments w/o signame > ???????? cmd.addAll(Arrays.asList(args).subList(1, args.length)); > > that seems better to me as the fact there is only one argument seems > clearer. Though for greater clarity perhaps: > > ???????? // add test specific arguments w/o signame > ???????? var argList = Arrays.asList(args).subList(1, args.length); > ???????? cmd.addAll(argList); > > -- > > +??????????????? Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) > +??????????????? .filter(s -> !s.isEmpty()) > +??????????????? .filter(s -> s.startsWith("-X")) > +??????????????? .flatMap(arg -> Stream.of("-vmopt", arg)) > +??????????????? .collect(Collectors.toList()); > > The preferred/common style for chained stream operations is to align the > dots: > > ????????? Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) > ??????????????? .filter(s -> !s.isEmpty()) > ??????????????? .filter(s -> s.startsWith("-X")) > ??????????????? .flatMap(arg -> Stream.of("-vmopt", arg)) > ??????????????? .collect(Collectors.toList()); > > --- > > test/lib/jdk/test/lib/process/ProcessTools.java > > -??????? System.out.println("\t" +? t + > -?????????????????????????? " stack: (length = " + stack.length + ")"); > +??????? System.out.println("\t" + t + > +??????????????? " stack: (length = " + stack.length + ")"); > > The original code is more stylistically correct - when breaking > arguments across lines the indent should align with the start of the > arguments. > > Similarly here: > > +??????? return String.format("--- ProcessLog ---%n" + > +??????????????????????? "cmd: %s%n" + > +??????????????????????? "exitvalue: %s%n" + > +??????????????????????? "stderr: %s%n" + > +??????????????????????? "stdout: %s%n", > +??????????????? getCommandLine(pb), exitValue, stderr, stdout); > > should be: > > +??????? return String.format("--- ProcessLog ---%n" + > +???????????????????????????? "cmd: %s%n" + > +???????????????????????????? 
"exitvalue: %s%n" + > +???????????????????????????? "stderr: %s%n" + > +???????????????????????????? "stdout: %s%n", > +???????????????????????????? getCommandLine(pb), exitValue, stderr, > stdout); > > and here: > > +??????? String executable = Paths.get(Utils.TEST_NATIVE_PATH, > executableName) > +??????????????? .toAbsolutePath() > +??????????????? .toString(); > > indentation again. > > Thanks, > David > ----- > >> Thanks, >> -- Igor >> From igor.ignatyev at oracle.com Wed Sep 2 00:34:53 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 1 Sep 2020 17:34:53 -0700 Subject: RFR(T) : 8252532 : use Utils.TEST_NATIVE_PATH instead of System.getProperty("test.nativepath") In-Reply-To: <9e962e6c-79c7-35c2-621d-e55a9173ece9@oracle.com> References: <93076ed6-f112-dd27-15d3-13f67cdf5de0@oracle.com> <9e962e6c-79c7-35c2-621d-e55a9173ece9@oracle.com> Message-ID: <6305BACE-B690-46D8-B932-4AB420CEDCCA@oracle.com> thanks Serguei, David, pushed. -- Igor > Hi Igor, > > The update looks good. > > Thanks, > Serguei > On Sep 1, 2020, at 4:58 PM, David Holmes wrote: > > Hi Igor, > > Changes seem fine (incremental webrev didn't show them all though). > > There are other pre-existing indentation issues in those files but this is fine. > > Thanks, > David > > On 31/08/2020 2:53 pm, David Holmes wrote: >> Hi Igor, >> On 29/08/2020 1:52 pm, Igor Ignatyev wrote: >>> http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >>>> 145 lines changed: 28 ins; 22 del; 95 mod; >>> >>> >>> Hi all, >>> >>> could you please review this trivial clean up which replaces System.getProperty("test.nativepath") w/ Utils.TEST_NATIVE_PATH where appropriate? 
>>> >>> while updating these files, I've also cleaned them up a bit, removed unneeded imports, added/removed spaces, etc >>> >>> testing: runtime, serviceability and vmTestbase/nsk/jvmti/ tests on {linux,windows,macos}-x64 >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252532 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8252532/webrev.00 >> Generally seems fine (though the fact the patch file contained a series of changesets threw me initially!) >> test/hotspot/jtreg/runtime/signal/SigTestDriver.java >> // add test specific arguments w/o signame >> cmd.addAll(Arrays.asList(args) >> - .subList(1, args.length)); >> + .subList(1, args.length)); >> Your changed line doesn't have the right indent. Can this just be put on one line anyway: >> // add test specific arguments w/o signame >> cmd.addAll(Arrays.asList(args).subList(1, args.length)); >> that seems better to me as the fact there is only one argument seems clearer. Though for greater clarity perhaps: >> // add test specific arguments w/o signame >> var argList = Arrays.asList(args).subList(1, args.length); >> cmd.addAll(argList); >> -- >> + Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) >> + .filter(s -> !s.isEmpty()) >> + .filter(s -> s.startsWith("-X")) >> + .flatMap(arg -> Stream.of("-vmopt", arg)) >> + .collect(Collectors.toList()); >> The preferred/common style for chained stream operations is to align the dots: >> Arrays.stream(Utils.JAVA_OPTIONS.split(" "))) >> .filter(s -> !s.isEmpty()) >> .filter(s -> s.startsWith("-X")) >> .flatMap(arg -> Stream.of("-vmopt", arg)) >> .collect(Collectors.toList()); >> --- >> test/lib/jdk/test/lib/process/ProcessTools.java >> - System.out.println("\t" + t + >> - " stack: (length = " + stack.length + ")"); >> + System.out.println("\t" + t + >> + " stack: (length = " + stack.length + ")"); >> The original code is more stylistically correct - when breaking arguments across lines the indent should align with the start of the arguments. 
>> Similarly here: >> + return String.format("--- ProcessLog ---%n" + >> + "cmd: %s%n" + >> + "exitvalue: %s%n" + >> + "stderr: %s%n" + >> + "stdout: %s%n", >> + getCommandLine(pb), exitValue, stderr, stdout); >> should be: >> + return String.format("--- ProcessLog ---%n" + >> + "cmd: %s%n" + >> + "exitvalue: %s%n" + >> + "stderr: %s%n" + >> + "stdout: %s%n", >> + getCommandLine(pb), exitValue, stderr, stdout); >> and here: >> + String executable = Paths.get(Utils.TEST_NATIVE_PATH, executableName) >> + .toAbsolutePath() >> + .toString(); >> indentation again. >> Thanks, >> David >> ----- >>> Thanks, >>> -- Igor >>> From yumin.qi at oracle.com Wed Sep 2 03:12:45 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Tue, 1 Sep 2020 20:12:45 -0700 Subject: 8248337: sparc related code clean up after solaris removal In-Reply-To: References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> Message-ID: Hi, Vladimir ? Thanks for re-review! Yumin On 9/1/20 4:48 PM, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir K > > On 9/1/20 3:40 PM, Yumin Qi wrote: >> HI, Vladimir and David >> >> ??? I have updated new webrev at: http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-02/ >> >> ??? Filed two issues to address your concern separately: >> >> ??? 1) 8252681: Retire flag UseRDPCForConstantTableBase after solaris removal >> >> https://bugs.openjdk.java.net/browse/JDK-8252681 >> >> ?? 2) 8252682: investigate PhaseChaitin::post_allocate_copy_removal after solaris removal >> >> https://bugs.openjdk.java.net/browse/JDK-8252682 >> >> >> ?? So following three files leave no change: >> >> share/opto/c2_globals.hpp >> >> share/opto/machnode.hpp >> >> share/opto/postaloc.cpp >> >> ?? Also update some copyright year for several files. 
>> >> >> Thanks >> >> Yumin >> >> >> On 9/1/20 10:35 AM, Vladimir Kozlov wrote: >>> On 8/31/20 6:18 PM, David Holmes wrote: >>>> Hi Yumin, >>>> >>>> On 1/09/2020 7:32 am, Yumin Qi wrote: >>>>> HI, >>>>> >>>>> ?? Please review for >>>>> >>>>> ?? bug: https://bugs.openjdk.java.net/browse/JDK-8248337 >>>>> >>>>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/ >>>>> >>>>> >>>>> ?? Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some comments are not correct, and some refer to wrong files. >>>> >>>> Those changes are mostly okay but I have a few minor issues/suggestions below. >>>> >>>>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch . >>>> >>>> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and we go through the appropriate deprecate/obsolete/expire process. >>> >>> The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table. >>> It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method from Node::pinned() which returns 'false' too. >>> >>> And I agree with David that it should be done separately because it is product flag. >>> >>>> >>>>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch are in comment section. >>>> >>>> It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code. 
>>> >>> Please, leave the code as it is. As David said it is about normal spill slots for all platforms. >>> I am not sure it is SPARC specific currently with all platforms OpenJDK supports. >>> If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit the assert. >>> >>>> >>>> Specific comments: >>>> >>>> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp >>>> >>>> -// 64 bits items (sparc abi) even though java would only store >>>> +// 64 bits items even though java would only store >>>> >>>> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms? >>>> >>>> --- >>>> >>>> src/hotspot/cpu/arm/frame_arm.hpp (and other files) >>>> >>>> ??? // The interpreter and adapters will extend the frame of the caller. >>>> ??? // Since oopMaps are based on the sp of the caller before extension >>>> -? // we need to know that value. However in order to compute the address >>>> -? // of the return address we need the real "raw" sp. Since sparc already >>>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >>>> -? // original sp we use that convention. >>>> +? // we need to know that value. However in order to compute the return >>>> +? // address we need the real "raw" sp. >>>> >>>> I think this is losing too much information as it no longer describes the convention. I would suggest: >>>> >>>> ??? // The interpreter and adapters will extend the frame of the caller. >>>> ??? // Since oopMaps are based on the sp of the caller before extension >>>> ??? // we need to know that value. However in order to compute the address >>>> -? // of the return address we need the real "raw" sp. Since sparc already >>>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's >>>> -? // original sp we use that convention. >>>> +? // of the return address we need the real "raw" sp. By convention we >>>> +? 
// use sp() to mean "raw" sp and unextended_sp() to mean the caller's >>>> +? // original sp. >>>> >>>> --- >>>> >>>> src/hotspot/cpu/ppc/jniTypes_ppc.hpp >>>> >>>> -? // stubGenerator_sparc.cpp) reverse the argument list constructed by >>>> +? // stubGenerator_${CPU}.cpp) reverse the argument list constructed by >>>> >>>> Just replace sparc with ppc as done for other platforms. >>>> >>>> --- >>>> >>>> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp >>>> >>>> -? // This greatly simplifies the cases here compared to sparc. >>>> +? // This greatly simplifies the cases here. >>>> >>>> Just delete the comment as there is nothing to compare simplicity or complexity against. >>>> >>>> --- >>>> >>>> src/hotspot/share/c1/c1_LIRGenerator.cpp >>>> >>>> -? // In 64bit the type can be long, sparc doesn't have this assert >>>> +? // In 64bit the type can be long >>>> ??? // assert(offset.type()->tag() == intTag, "invalid type"); >>>> >>>> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted. >>> >>> Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit. >>> >>> Thanks, >>> Vladimir K >>> >>>> >>>> --- >>>> >>>> src/hotspot/share/c1/c1_Runtime1.cpp >>>> >>>> -? case handle_exception_nofpu_id:? // Unused on sparc >>>> +? case handle_exception_nofpu_id:? // unused. >>>> >>>> the new comment is incorrect as this case is not unused. I suggest just deleting the comment. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> >>>>> >>>>> ?? Tests passed tier1-4 >>>>> >>>>> >>>>> ?? Thanks >>>>> >>>>> ?? 
Yumin >>>>> From david.holmes at oracle.com Wed Sep 2 04:07:17 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Sep 2020 14:07:17 +1000 Subject: 8248337: sparc related code clean up after solaris removal In-Reply-To: <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> Message-ID: <7858cdf4-25cd-4e14-ec60-b3c256f59c9d@oracle.com> Hi Yumin, Update looks good. Thanks for filing the other RFEs. David On 2/09/2020 8:40 am, Yumin Qi wrote: > HI, Vladimir and David > > ?? I have updated new webrev at: > http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-02/ > > ?? Filed two issues to address your concern separately: > > ?? 1) 8252681: Retire flag UseRDPCForConstantTableBase after solaris > removal > > https://bugs.openjdk.java.net/browse/JDK-8252681 > > ? 2) 8252682: investigate PhaseChaitin::post_allocate_copy_removal > after solaris removal > > https://bugs.openjdk.java.net/browse/JDK-8252682 > > > ? So following three files leave no change: > > share/opto/c2_globals.hpp > > share/opto/machnode.hpp > > share/opto/postaloc.cpp > > ? Also update some copyright year for several files. > > > Thanks > > Yumin > > > On 9/1/20 10:35 AM, Vladimir Kozlov wrote: >> On 8/31/20 6:18 PM, David Holmes wrote: >>> Hi Yumin, >>> >>> On 1/09/2020 7:32 am, Yumin Qi wrote: >>>> HI, >>>> >>>> ?? Please review for >>>> >>>> ?? bug: https://bugs.openjdk.java.net/browse/JDK-8248337 >>>> >>>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/ >>>> >>>> >>>> ?? Summary: After Solaris supported files removed from repo, there >>>> are some remnants which needs cleaning up. Some comments are not >>>> correct, and some refer to wrong files. >>> >>> Those changes are mostly okay but I have a few minor >>> issues/suggestions below. >>> >>>> There is a flag seems only useful for Sparc: >>>> UseRDPCForConstantTableBase, which got removed in this patch . 
>>> >>> Despite the description of the flag it is far from clear that the use >>> of the flag affects sparc only. It affects the pinned() function so >>> seems somewhat platform agnostic in that sense - which is why this >>> was not dealt with in the SPARC removal process. I think this needs >>> closer examination by the compiler folk, with a recommendation on >>> whether it can/should be changed or not. Regardless as this is a >>> product flag then I think this change should be factored out and we >>> go through the appropriate deprecate/obsolete/expire process. >> >> The flag was used to use special SPARC instruction for CPUs supporting >> it to load base of Constant table. >> It is useless for other platforms. MachConstantBaseNode::pinned() >> method can be removed because it inherits the method from >> Node::pinned() which returns 'false' too. >> >> And I agree with David that it should be done separately because it is >> product flag. >> >>> >>>> Also in postaloc.cpp, the delay slot seems is only for sparc too, >>>> but I am not sure about that. Most of the patch are in comment section. >>> >>> It refers to spill slot not delay slot. I don't see anything >>> obviously sparc specific about that block of code. >> >> Please, leave the code as it is. As David said it is about normal >> spill slots for all platforms. >> I am not sure it is SPARC specific currently with all platforms >> OpenJDK supports. >> If you want you can file RFE to replace code with assert and ask >> community to run a lot of testing to see if we hit the assert. >> >>> >>> Specific comments: >>> >>> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp >>> >>> -// 64 bits items (sparc abi) even though java would only store >>> +// 64 bits items even though java would only store >>> >>> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for >>> other platforms? >>> >>> --- >>> >>> src/hotspot/cpu/arm/frame_arm.hpp (and other files) >>> >>> ??? 
// The interpreter and adapters will extend the frame of the caller. >>> ??? // Since oopMaps are based on the sp of the caller before extension >>> -? // we need to know that value. However in order to compute the >>> address >>> -? // of the return address we need the real "raw" sp. Since sparc >>> already >>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the >>> caller's >>> -? // original sp we use that convention. >>> +? // we need to know that value. However in order to compute the return >>> +? // address we need the real "raw" sp. >>> >>> I think this is losing too much information as it no longer describes >>> the convention. I would suggest: >>> >>> ??? // The interpreter and adapters will extend the frame of the caller. >>> ??? // Since oopMaps are based on the sp of the caller before extension >>> ??? // we need to know that value. However in order to compute the >>> address >>> -? // of the return address we need the real "raw" sp. Since sparc >>> already >>> -? // uses sp() to mean "raw" sp and unextended_sp() to mean the >>> caller's >>> -? // original sp we use that convention. >>> +? // of the return address we need the real "raw" sp. By convention we >>> +? // use sp() to mean "raw" sp and unextended_sp() to mean the caller's >>> +? // original sp. >>> >>> --- >>> >>> src/hotspot/cpu/ppc/jniTypes_ppc.hpp >>> >>> -? // stubGenerator_sparc.cpp) reverse the argument list constructed by >>> +? // stubGenerator_${CPU}.cpp) reverse the argument list constructed by >>> >>> Just replace sparc with ppc as done for other platforms. >>> >>> --- >>> >>> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp >>> >>> -? // This greatly simplifies the cases here compared to sparc. >>> +? // This greatly simplifies the cases here. >>> >>> Just delete the comment as there is nothing to compare simplicity or >>> complexity against. >>> >>> --- >>> >>> src/hotspot/share/c1/c1_LIRGenerator.cpp >>> >>> -? 
// In 64bit the type can be long, sparc doesn't have this assert
>>> +  // In 64bit the type can be long
>>>     // assert(offset.type()->tag() == intTag, "invalid type");
>>>
>>> compiler folk should decide what to do here but I think the comment
>>> and commented out assert can just be deleted.
>>
>> Yes, remove commented assert too. Originally it was platform specific
>> code - the assert was there for 32-bit.
>>
>> Thanks,
>> Vladimir K
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/c1/c1_Runtime1.cpp
>>>
>>> -  case handle_exception_nofpu_id:  // Unused on sparc
>>> +  case handle_exception_nofpu_id:  // unused.
>>>
>>> the new comment is incorrect as this case is not unused. I suggest
>>> just deleting the comment.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>
>>>>
>>>>    Tests passed tier1-4
>>>>
>>>>
>>>>    Thanks
>>>>
>>>>    Yumin
>>>>

From david.holmes at oracle.com Wed Sep 2 05:54:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Sep 2020 15:54:48 +1000
Subject: RFR: 8252661: Change SafepointMechanism terminology to talk less about "blocking"
In-Reply-To: <47e8f00d-dc4d-4ebe-922e-f086cf11d323@oracle.com>
References: <47e8f00d-dc4d-4ebe-922e-f086cf11d323@oracle.com>
Message-ID: <96ca0637-90c6-8eee-f850-63550898482f@oracle.com>

Hi Erik,

Okay let's play ... :)

On 2/09/2020 1:53 am, Erik Österlund wrote:
> Hi,
>
> The SafepointMechanism class has been used to perform safepoint
> operations, originally. Now we also perform handshake operations, and
> soon also concurrent stack processing, using the same hooks.
> Therefore, names such as SafepointMechanism::should_block no longer
> sound right, when the real question is whether it should process a
> pending operation (be that a safepoint, handshake or whatever else).

Here's the proposed set of name changes:

block_or_handshake -> process_operation
block_if_requested -> process_operation_if_requested
block_if_requested_slow -> process_operation_if_requested_slow
should_block -> should_process_operation

So "block" is wrong when you want to process a handshake or concurrent
stack scanning operation.

block_or_handshake is currently accurate, whereas the other block*
methods ignore the "or handshake" part. But names that enumerate choices
don't scale well and become unwieldy even with two choices.

So we need a verb that captures the need to do something. "process" is
not terrible if you consider blocking to be a form of processing, but I
personally don't like it in that context in combination with "operation".

The multiplicity of what this now does reminds me of
has_special_runtime_exit_condition/handle_special_runtime_exit_condition.

I don't like the existing "should" naming as the subject is wrong. This
reads well to me:

if (thread->should_block()) ...

whereas this is somewhat jarring:

if (SafepointMechanism::should_block(thread)) ...

and would be better phrased as:

if (SafepointMechanism::needs_to_block(thread)) ...

Generalising that:

if (SafepointMechanism::is_active_for(thread)) ...

which then leads me (back) to:

SafepointMechanism::process(thread);
SafepointMechanism::process_if_requested(thread);
SafepointMechanism::process_if_requested_slow(thread);

Next player please ... :)

Cheers,
David
-----

> Naming is hard, so I don't want this discussion in my concurrent stack
> processing patch.
> I have a webrev with proposed naming changes to better reflect how this
> is used:
> http://cr.openjdk.java.net/~eosterlund/8252661/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8252661
>
> Thanks,
> /Erik

From david.holmes at oracle.com Wed Sep 2 06:26:50 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Sep 2020 16:26:50 +1000
Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!"
In-Reply-To:
References:
Message-ID: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com>

Hi Harold,

On 2/09/2020 5:02 am, Harold Seigel wrote:
> Hi,
>
> Please review this change to hotspot test
> vmTestbase/nsk/stress/stack/stack016.java.  The test calls a recursive
> method and keeps track of the number of repetitions needed to cause an
> exception.  It then runs a bunch of threads that call the recursive
> method for a multiple of the repetition number, expecting each of them
> to get a StackOverflowError or OutOfMemoryError exception. Occasionally,
> the test fails because one of the threads does not throw an exception.
>
> This change tries to fix this in two ways.  One, by making sure that the
> thread used to determine the number of repetitions gets a
> StackOverflowError or OutOfMemoryError exception, and not some other
> unexpected exception.

That's a good improvement, though other exceptions seem unlikely - have
you actually observed any unexpected exceptions?

> The other way is to run the test twice, once with
> -Xcomp and once with -Xint, to ensure that thread stack consumption
> doesn't vary because the original thread called an interpreted method
> and a subsequent thread called a compiled method.

Right - running interpreted would give a different maximum recursion
depth from running under the JIT, and when we bumped up the stack depth
we would have run far more iterations and so JIT'd more code - thus
breaking the test.
So forcing fully interpreted or fully compiled seems a good way to
stabilise things. But I would suggest that you keep the required
condition vm.compMode != "Xcomp" to minimise the runs of this test. When
executed as part of an Xcomp run both @run's will behave exactly the same
way (Xcomp) - and that is the same as the second @run in a non-Xcomp run.
So no point running the Xcomp version three times.

Thanks,
David
-----

>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249
>
> The modified test was tested on Mac OS, Linux x64, and Windows.
>
> Thanks, Harold
>

From erik.osterlund at oracle.com Wed Sep 2 06:37:37 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 2 Sep 2020 08:37:37 +0200
Subject: RFR: 8252661: Change SafepointMechanism terminology to talk less about "blocking"
In-Reply-To: <96ca0637-90c6-8eee-f850-63550898482f@oracle.com>
References: <96ca0637-90c6-8eee-f850-63550898482f@oracle.com>
Message-ID: <4882643F-E6B4-476D-9EB8-EDE67729030B@oracle.com>

Hi David,

Not bad! I tend to agree, and like it.

How do we feel about "yield" instead of process though? We yield the
normal execution to do... something. Like this:

SafepointMechanism::yield(thread);
SafepointMechanism::yield_if_requested(thread);
SafepointMechanism::yield_if_requested_slow(thread);

It is yet a bit more abstract I think. Oh and a few characters shorter.

What do you think? Yield vs process?

Next player!

Thanks,
/Erik

> On 2 Sep 2020, at 07:54, David Holmes wrote:
>
> Hi Erik,
>
> Okay let's play ... :)
>
>> On 2/09/2020 1:53 am, Erik Österlund wrote:
>> Hi,
>> The SafepointMechanism class has been used to perform safepoint operations, originally. Now we also perform handshake operations, and
>> soon also concurrent stack processing, using the same hooks.
>> Therefore, names such as SafepointMechanism::should_block no longer
>> sound right, when the real question is whether it should process a
>> pending operation (be that a safepoint, handshake or whatever else).
>
> Here's the proposed set of name changes:
>
> block_or_handshake -> process_operation
> block_if_requested -> process_operation_if_requested
> block_if_requested_slow -> process_operation_if_requested_slow
> should_block -> should_process_operation
>
> So "block" is wrong when you want to process a handshake or concurrent stack scanning operation.
>
> block_or_handshake is currently accurate, whereas the other block* methods ignore the "or handshake" part. But names that enumerate choices don't scale well and become unwieldy even with two choices.
>
> So we need a verb that captures the need to do something. "process" is not terrible if you consider blocking to be a form of processing, but I personally don't like it in that context in combination with "operation".
>
> The multiplicity of what this now does reminds me of has_special_runtime_exit_condition/handle_special_runtime_exit_condition.
>
> I don't like the existing "should" naming as the subject is wrong. This reads well to me:
>
> if (thread->should_block()) ...
>
> whereas this is somewhat jarring:
>
> if (SafepointMechanism::should_block(thread)) ...
>
> and would be better phrased as:
>
> if (SafepointMechanism::needs_to_block(thread)) ...
>
> Generalising that:
>
> if (SafepointMechanism::is_active_for(thread)) ...
>
> which then leads me (back) to:
>
> SafepointMechanism::process(thread);
> SafepointMechanism::process_if_requested(thread);
> SafepointMechanism::process_if_requested_slow(thread);
>
> Next player please ... :)
>
> Cheers,
> David
> -----
>
>> Naming is hard, so I don't want this discussion in my concurrent stack processing patch.
>> I have a webrev with proposed naming changes to better reflect how this is used: >> http://cr.openjdk.java.net/~eosterlund/8252661/webrev.00/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8252661 >> Thanks, >> /Erik From david.holmes at oracle.com Wed Sep 2 06:56:39 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Sep 2020 16:56:39 +1000 Subject: RFR: 8252661: Change SafepointMechanism terminology to talk less about "blocking" In-Reply-To: <4882643F-E6B4-476D-9EB8-EDE67729030B@oracle.com> References: <96ca0637-90c6-8eee-f850-63550898482f@oracle.com> <4882643F-E6B4-476D-9EB8-EDE67729030B@oracle.com> Message-ID: On 2/09/2020 4:37 pm, Erik ?sterlund wrote: > Hi David, > > Not bad! I tend to agree, and like it. > > How do we feel about ?yield? instead of process though? We yield the normal execution to do... something. Like this: > > SafepointMechanism::yield(thread); > SafepointMechanism::yield_if_requested(thread); > SafepointMechanism::yield_if_requested_slow(thread); > > It is yet a bit more abstract I think. Oh and a few characters shorter. > > What do you think? Yield vs process? Sorry yield => Thread.yield => sched_yield - scheduling! yield sounds very much like block as well. So I'd vote for process over yield. Cheers, David ----- > Next player! > > Thanks, > /Erik > >> On 2 Sep 2020, at 07:54, David Holmes wrote: >> >> ?Hi Erik, >> >> Okay let's play ... :) >> >>> On 2/09/2020 1:53 am, Erik ?sterlund wrote: >>> Hi, >>> The SafepointMechanism class has been used to perform safepoint operations, originally. Now we also perform handshake operations, and >>> soon also concurrent stack processing, using the same hooks. >>> Therefore, names such as SafepointMechanism::should_block no longer >>> sound right, when the real question is whether it should process a >>> pending operation (be that a safepoint, handshake or whatever else). 
>> >> Here's the proposed set of name changes: >> >> block_or_handshake -> process_operation >> block_if_requested -> process_operation_if_requested >> block_if_requested_slow -> process_operation_if_requested_slow >> should_block -> should_process_operation >> >> So "block" is wrong when you want to process a handshake or concurrent stack scanning operation. >> >> block_or_handshake is currently accurate, whereas the other block* methods ignore the "or handshake" part. But names that enumerate choices don't scale well and become unwieldy even with two choices. >> >> So we need a verb that captures the need to do something. "process" is not terrible if you consider blocking to be a form of processing, but I personally don't like it in that context in combination with "operation". >> >> The multiplicity of what this now does reminds me of has_special_runtime_exit_condition/handle_special_runtime_exit_condition. >> >> I don't like the existing "should" naming as the subject is wrong. This reads well to me: >> >> if (thread->should_block()) ... >> >> whereas this is somewhat jarring: >> >> if (SafepointMechanism::should_block(thread)) ... >> >> and would be better phrased as: >> >> if (SafepointMechanism::needs_to_block(thread)) ... >> >> Generalising that: >> >> if (SafepointMechanism::is_active_for(thread)) ... >> >> when then leads me (back) to: >> >> SafepointMechanism::process(thread); >> SafepointMechanism::process_if_requested(thread); >> SafepointMechanism::process_if_requested_slow(thread); >> >> Next player please ... :) >> >> Cheers, >> David >> ----- >> >>> Naming is hard, so I don't want this discussion in my concurrent stack processing patch. 
>>> I have a webrev with proposed naming changes to better reflect how this is used: >>> http://cr.openjdk.java.net/~eosterlund/8252661/webrev.00/ >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8252661 >>> Thanks, >>> /Erik > From shade at redhat.com Wed Sep 2 07:00:46 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 2 Sep 2020 09:00:46 +0200 Subject: RFR (XS/T) 8252691: Build failure after JDK-8252481 Message-ID: <2d4d779d-8c86-f665-3c30-150733d82759@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8252691 Fix: diff -r 33aa4ce7622f src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp --- a/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp Wed Sep 02 11:47:59 2020 +0530 +++ b/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp Wed Sep 02 08:59:13 2020 +0200 @@ -82,4 +82,5 @@ #include "runtime/vmThread.hpp" #include "services/mallocTracker.hpp" +#include "services/memTracker.hpp" #include "utilities/powerOfTwo.hpp" Testing: local builds -- Thanks, -Aleksey From goetz.lindenmaier at sap.com Wed Sep 2 07:10:50 2020 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 2 Sep 2020 07:10:50 +0000 Subject: RFR (XS/T) 8252691: Build failure after JDK-8252481 In-Reply-To: <2d4d779d-8c86-f665-3c30-150733d82759@redhat.com> References: <2d4d779d-8c86-f665-3c30-150733d82759@redhat.com> Message-ID: Hi Aleksey, The change looks good. Best regards, Goetz. 
> -----Original Message----- > From: hotspot-runtime-dev > On Behalf Of Aleksey Shipilev > Sent: Wednesday, September 2, 2020 9:01 AM > To: hotspot-runtime-dev at openjdk.java.net > Subject: RFR (XS/T) 8252691: Build failure after JDK-8252481 > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8252691 > > Fix: > > diff -r 33aa4ce7622f > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp > --- a/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp Wed Sep > 02 11:47:59 2020 +0530 > +++ b/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp Wed Sep > 02 08:59:13 2020 +0200 > @@ -82,4 +82,5 @@ > #include "runtime/vmThread.hpp" > #include "services/mallocTracker.hpp" > +#include "services/memTracker.hpp" > #include "utilities/powerOfTwo.hpp" > > Testing: local builds > > -- > Thanks, > -Aleksey From shade at redhat.com Wed Sep 2 07:14:59 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 2 Sep 2020 09:14:59 +0200 Subject: RFR (XS/T) 8252691: Build failure after JDK-8252481 In-Reply-To: References: <2d4d779d-8c86-f665-3c30-150733d82759@redhat.com> Message-ID: <4364156f-89b1-b6d9-bd4a-53c72e439c94@redhat.com> On 9/2/20 9:10 AM, Lindenmaier, Goetz wrote: > The change looks good. Thanks, pushed. -- -Aleksey From erik.osterlund at oracle.com Wed Sep 2 08:32:17 2020 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 2 Sep 2020 10:32:17 +0200 Subject: RFR: 8252661: Change SafepointMechanism terminology to talk less about "blocking" In-Reply-To: References: <96ca0637-90c6-8eee-f850-63550898482f@oracle.com> <4882643F-E6B4-476D-9EB8-EDE67729030B@oracle.com> Message-ID: Hi David, Sounds like we have a winner (process) unless anyone else has other suggestions? Thanks, /Erik On 2020-09-02 08:56, David Holmes wrote: > On 2/09/2020 4:37 pm, Erik ?sterlund wrote: >> Hi David, >> >> Not bad! I tend to agree, and like it. >> >> How do we feel about ?yield? instead of process though? We yield the >> normal execution to do... something. 
Like this: >> >> SafepointMechanism::yield(thread); >> SafepointMechanism::yield_if_requested(thread); >> SafepointMechanism::yield_if_requested_slow(thread); >> >> It is yet a bit more abstract I think. Oh and a few characters shorter. >> >> What do you think? Yield vs process? > > Sorry yield => Thread.yield => sched_yield? - scheduling! > > yield sounds very much like block as well. > > So I'd vote for process over yield. > > Cheers, > David > ----- > >> Next player! >> >> Thanks, >> /Erik >> >>> On 2 Sep 2020, at 07:54, David Holmes wrote: >>> >>> ?Hi Erik, >>> >>> Okay let's play ... :) >>> >>>> On 2/09/2020 1:53 am, Erik ?sterlund wrote: >>>> Hi, >>>> The SafepointMechanism class has been used to perform safepoint >>>> operations, originally. Now we also perform handshake operations, and >>>> soon also concurrent stack processing, using the same hooks. >>>> Therefore, names such as SafepointMechanism::should_block no longer >>>> sound right, when the real question is whether it should process a >>>> pending operation (be that a safepoint, handshake or whatever else). >>> >>> Here's the proposed set of name changes: >>> >>> block_or_handshake -> process_operation >>> block_if_requested -> process_operation_if_requested >>> block_if_requested_slow -> process_operation_if_requested_slow >>> should_block -> should_process_operation >>> >>> So "block" is wrong when you want to process a handshake or >>> concurrent stack scanning operation. >>> >>> block_or_handshake is currently accurate, whereas the other block* >>> methods ignore the "or handshake" part. But names that enumerate >>> choices don't scale well and become unwieldy even with two choices. >>> >>> So we need a verb that captures the need to do something. "process" >>> is not terrible if you consider blocking to be a form of processing, >>> but I personally don't like it in that context in combination with >>> "operation". 
>>>
>>> The multiplicity of what this now does reminds me of
>>> has_special_runtime_exit_condition/handle_special_runtime_exit_condition.
>>>
>>>
>>> I don't like the existing "should" naming as the subject is wrong.
>>> This reads well to me:
>>>
>>> if (thread->should_block()) ...
>>>
>>> whereas this is somewhat jarring:
>>>
>>> if (SafepointMechanism::should_block(thread)) ...
>>>
>>> and would be better phrased as:
>>>
>>> if (SafepointMechanism::needs_to_block(thread)) ...
>>>
>>> Generalising that:
>>>
>>> if (SafepointMechanism::is_active_for(thread)) ...
>>>
>>> which then leads me (back) to:
>>>
>>> SafepointMechanism::process(thread);
>>> SafepointMechanism::process_if_requested(thread);
>>> SafepointMechanism::process_if_requested_slow(thread);
>>>
>>> Next player please ... :)
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Naming is hard, so I don't want this discussion in my concurrent
>>>> stack processing patch.
>>>> I have a webrev with proposed naming changes to better reflect how
>>>> this is used:
>>>> http://cr.openjdk.java.net/~eosterlund/8252661/webrev.00/
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8252661
>>>> Thanks,
>>>> /Erik
>>

From sgehwolf at redhat.com Wed Sep 2 09:44:31 2020
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 02 Sep 2020 11:44:31 +0200
Subject: Regression in JDK15/16: CGroup v2 support
In-Reply-To:
References: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com>
Message-ID:

Hi Bruno,

On Tue, 2020-09-01 at 23:23 +1000, David Holmes wrote:
> That was the JI number. It is now:
>
> https://bugs.openjdk.java.net/browse/JDK-8252359

I've done some investigation on this bug with Bob.

Could you please check whether the Java API via Metrics.java is
affected? My tests seem to indicate not, but it would be good if you
could confirm. Easiest way to verify is by running:

$ java -XshowSettings:system -version

in a container with memory/cpu limits on your setup.
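
For a cross-check from inside the JVM, here is a minimal sketch that is
not part of the original mail. The class name ContainerView is made up
for illustration, and it only uses the standard Runtime API. The
assumption is that these values follow HotSpot's own container detection
rather than Metrics.java, so comparing them with the
-XshowSettings:system output may help tell the two code paths apart:

```java
// Hypothetical helper, not from the thread: prints the resource view
// that the running JVM exposes through the standard Runtime API.
class ContainerView {

    // Builds a one-line summary of the processor count and the maximum heap size.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "processors=" + rt.availableProcessors()
                + ", maxHeapBytes=" + rt.maxMemory();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the same container limits as the command above, for example
with "java ContainerView.java". If the processor count or heap size
ignores the limits while -XshowSettings:system reports them correctly,
that would point at the HotSpot side rather than the Java API.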
Thanks, Severin From harold.seigel at oracle.com Wed Sep 2 12:20:38 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Wed, 2 Sep 2020 08:20:38 -0400 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> Message-ID: <0c25c3e9-a6ca-30b4-5f9e-b4905a659f03@oracle.com> Hi David, Thanks for reviewing this.? Please see comments inline. Let me know if you need to see a new webrev. Thanks, Harold On 9/2/2020 2:26 AM, David Holmes wrote: > Hi Harold, > > On 2/09/2020 5:02 am, Harold Seigel wrote: >> Hi, >> >> Please review this change to hotspot test >> vmTestbase/nsk/stress/stack/stack016.java.? The test calls a >> recursive method and keeps track of the number of repetitions needed >> to cause an exception.? It then runs a bunch of threads that call the >> recursive method for a multiple of the repetition number, expecting >> each of them to get a StackOverflowError or OutOfMemoryError >> exception. Occasionally, the test fails because one of the threads >> does not throw an exception. >> >> This change tries to fix this in two ways.? One, by making sure that >> the thread used to determine the number of repetitions gets a >> StackOverflowError or OutOfMemoryError exception, and not some other >> unexpected exception. > > That's a good improvement, though other exceptions seem unlikely - > have you actually observed any unexpected exceptions? No, but they would be hard to see since the test was eating them. > >> The other way is to run the test twice, once with -Xcomp and once >> with -Xint, to ensure that thread stack consumption doesn't vary >> because the original thread called an interpreted method and a >> subsequent thread called a compiled method. 
> > Right - running interpreted would give a different maximum recursion > depth from running under the JIT, and when we bumped up the stack > depth we would have run far more iterations and so JIT'd more code - > thus breaking the test. So forcing fully interpreted or fully compiled > seems a good way to stabilise things. > > But I would suggest that you keep the required condition > > vm.compMode != "Xcomp" > > to minimise the runs of this test. When executed as part of an Xcomp > run both @run's will behave exactly the same way (Xcomp) - and that is > the same as the second @run in a non-Xcomp run. So no point running > the Xcomp version three times. Thanks for pointing this out.? I'll restore 'vm-compMode != "Xcomp"' before pushing the change. > > Thanks, > David > ----- > >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249 >> >> The modified test was tested on Mac OS, Linux x64, and Windows. >> >> Thanks, Harold >> From patric.hedlin at oracle.com Wed Sep 2 12:22:34 2020 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Wed, 2 Sep 2020 14:22:34 +0200 Subject: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code In-Reply-To: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> Message-ID: <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> Hi Zhuoren, I don't actually know what behaviour to expect from the Unsafe atomics in this test-case but perhaps you could re-cap the original problem (addressed in JDK-8246051) since it seems to raise some questions. Did you have an example (real code) where this behaviour is essential? Best regards, Patric Hedlin (Including hotspot-runtime-dev at openjdk.java.net) On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: > Hi, this is a fix for a test case. 
> In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will fail > because the catch misses the error due to async exception. > This patch uses a loop to make sure the error can be caught. Also > -Xcomp is added in test. > BUG: https://bugs.openjdk.java.net/browse/JDK-8249092 > Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/ > > > Regards, > Zhuoren > From coleen.phillimore at oracle.com Wed Sep 2 12:52:56 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 2 Sep 2020 08:52:56 -0400 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: References: Message-ID: <0378269f-f079-d4fa-07d3-beb51910a4b6@oracle.com> Yes, this looks good!? Thanks for adding the {}s. Coleen On 9/1/20 3:02 PM, Harold Seigel wrote: > Hi, > > Please review this change to hotspot test > vmTestbase/nsk/stress/stack/stack016.java.? The test calls a recursive > method and keeps track of the number of repetitions needed to cause an > exception.? It then runs a bunch of threads that call the recursive > method for a multiple of the repetition number, expecting each of them > to get a StackOverflowError or OutOfMemoryError exception.? > Occasionally, the test fails because one of the threads does not throw > an exception. > > This change tries to fix this in two ways.? One, by making sure that > the thread used to determine the number of repetitions gets a > StackOverflowError or OutOfMemoryError exception, and not some other > unexpected exception.? The other way is to run the test twice, once > with -Xcomp and once with -Xint, to ensure that thread stack > consumption doesn't vary because the original thread called an > interpreted method and a subsequent thread called a compiled method. 
> > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249 > > The modified test was tested on Mac OS, Linux x64, and Windows. > > Thanks, Harold > From harold.seigel at oracle.com Wed Sep 2 12:58:18 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Wed, 2 Sep 2020 12:58:18 +0000 (UTC) Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: <0378269f-f079-d4fa-07d3-beb51910a4b6@oracle.com> References: <0378269f-f079-d4fa-07d3-beb51910a4b6@oracle.com> Message-ID: Thanks Coleen! Harold On 9/2/2020 8:52 AM, Coleen Phillimore wrote: > > Yes, this looks good!? Thanks for adding the {}s. > Coleen > > On 9/1/20 3:02 PM, Harold Seigel wrote: >> Hi, >> >> Please review this change to hotspot test >> vmTestbase/nsk/stress/stack/stack016.java.? The test calls a >> recursive method and keeps track of the number of repetitions needed >> to cause an exception.? It then runs a bunch of threads that call the >> recursive method for a multiple of the repetition number, expecting >> each of them to get a StackOverflowError or OutOfMemoryError >> exception.? Occasionally, the test fails because one of the threads >> does not throw an exception. >> >> This change tries to fix this in two ways.? One, by making sure that >> the thread used to determine the number of repetitions gets a >> StackOverflowError or OutOfMemoryError exception, and not some other >> unexpected exception.? The other way is to run the test twice, once >> with -Xcomp and once with -Xint, to ensure that thread stack >> consumption doesn't vary because the original thread called an >> interpreted method and a subsequent thread called a compiled method. 
>> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249 >> >> The modified test was tested on Mac OS, Linux x64, and Windows. >> >> Thanks, Harold >> > From david.holmes at oracle.com Wed Sep 2 13:02:16 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Sep 2020 23:02:16 +1000 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: <0c25c3e9-a6ca-30b4-5f9e-b4905a659f03@oracle.com> References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <0c25c3e9-a6ca-30b4-5f9e-b4905a659f03@oracle.com> Message-ID: <2d0cf63d-fba2-7524-1915-04da06a7e29a@oracle.com> Hi Harold, No need for updated webrev. David On 2/09/2020 10:20 pm, Harold Seigel wrote: > Hi David, > > Thanks for reviewing this.? Please see comments inline. > > Let me know if you need to see a new webrev. > > Thanks, Harold > > On 9/2/2020 2:26 AM, David Holmes wrote: >> Hi Harold, >> >> On 2/09/2020 5:02 am, Harold Seigel wrote: >>> Hi, >>> >>> Please review this change to hotspot test >>> vmTestbase/nsk/stress/stack/stack016.java.? The test calls a >>> recursive method and keeps track of the number of repetitions needed >>> to cause an exception.? It then runs a bunch of threads that call the >>> recursive method for a multiple of the repetition number, expecting >>> each of them to get a StackOverflowError or OutOfMemoryError >>> exception. Occasionally, the test fails because one of the threads >>> does not throw an exception. >>> >>> This change tries to fix this in two ways.? One, by making sure that >>> the thread used to determine the number of repetitions gets a >>> StackOverflowError or OutOfMemoryError exception, and not some other >>> unexpected exception. >> >> That's a good improvement, though other exceptions seem unlikely - >> have you actually observed any unexpected exceptions? 
> No, but they would be hard to see since the test was eating them. >> >>> The other way is to run the test twice, once with -Xcomp and once >>> with -Xint, to ensure that thread stack consumption doesn't vary >>> because the original thread called an interpreted method and a >>> subsequent thread called a compiled method. >> >> Right - running interpreted would give a different maximum recursion >> depth from running under the JIT, and when we bumped up the stack >> depth we would have run far more iterations and so JIT'd more code - >> thus breaking the test. So forcing fully interpreted or fully compiled >> seems a good way to stabilise things. >> >> But I would suggest that you keep the required condition >> >> vm.compMode != "Xcomp" >> >> to minimise the runs of this test. When executed as part of an Xcomp >> run both @run's will behave exactly the same way (Xcomp) - and that is >> the same as the second @run in a non-Xcomp run. So no point running >> the Xcomp version three times. > Thanks for pointing this out.? I'll restore 'vm-compMode != "Xcomp"' > before pushing the change. >> >> Thanks, >> David >> ----- >> >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249 >>> >>> The modified test was tested on Mac OS, Linux x64, and Windows. >>> >>> Thanks, Harold >>> From harold.seigel at oracle.com Wed Sep 2 13:04:27 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Wed, 2 Sep 2020 09:04:27 -0400 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: <2d0cf63d-fba2-7524-1915-04da06a7e29a@oracle.com> References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <0c25c3e9-a6ca-30b4-5f9e-b4905a659f03@oracle.com> <2d0cf63d-fba2-7524-1915-04da06a7e29a@oracle.com> Message-ID: Thanks! 
Harold On 9/2/2020 9:02 AM, David Holmes wrote: > Hi Harold, > > No need for updated webrev. > > David > > On 2/09/2020 10:20 pm, Harold Seigel wrote: >> Hi David, >> >> Thanks for reviewing this.? Please see comments inline. >> >> Let me know if you need to see a new webrev. >> >> Thanks, Harold >> >> On 9/2/2020 2:26 AM, David Holmes wrote: >>> Hi Harold, >>> >>> On 2/09/2020 5:02 am, Harold Seigel wrote: >>>> Hi, >>>> >>>> Please review this change to hotspot test >>>> vmTestbase/nsk/stress/stack/stack016.java.? The test calls a >>>> recursive method and keeps track of the number of repetitions >>>> needed to cause an exception.? It then runs a bunch of threads that >>>> call the recursive method for a multiple of the repetition number, >>>> expecting each of them to get a StackOverflowError or >>>> OutOfMemoryError exception. Occasionally, the test fails because >>>> one of the threads does not throw an exception. >>>> >>>> This change tries to fix this in two ways.? One, by making sure >>>> that the thread used to determine the number of repetitions gets a >>>> StackOverflowError or OutOfMemoryError exception, and not some >>>> other unexpected exception. >>> >>> That's a good improvement, though other exceptions seem unlikely - >>> have you actually observed any unexpected exceptions? >> No, but they would be hard to see since the test was eating them. >>> >>>> The other way is to run the test twice, once with -Xcomp and once >>>> with -Xint, to ensure that thread stack consumption doesn't vary >>>> because the original thread called an interpreted method and a >>>> subsequent thread called a compiled method. >>> >>> Right - running interpreted would give a different maximum recursion >>> depth from running under the JIT, and when we bumped up the stack >>> depth we would have run far more iterations and so JIT'd more code - >>> thus breaking the test. So forcing fully interpreted or fully >>> compiled seems a good way to stabilise things. 
>>> >>> But I would suggest that you keep the required condition >>> >>> vm.compMode != "Xcomp" >>> >>> to minimise the runs of this test. When executed as part of an Xcomp >>> run both @run's will behave exactly the same way (Xcomp) - and that >>> is the same as the second @run in a non-Xcomp run. So no point >>> running the Xcomp version three times. >> Thanks for pointing this out.? I'll restore 'vm-compMode != "Xcomp"' >> before pushing the change. >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8252249.stack/webrev/index.html >>>> >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8252249 >>>> >>>> The modified test was tested on Mac OS, Linux x64, and Windows. >>>> >>>> Thanks, Harold >>>> From richard.reingruber at sap.com Wed Sep 2 13:48:12 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 2 Sep 2020 13:48:12 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> Message-ID: Hi Robbin, // taking the discussion back to the mailing lists > I still don't understand why you don't deoptimize the objects inside the > handshake/safepoint instead? This is unfortunately not possible. Deoptimizing objects includes reallocating scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This cannot be done at a safepoint or handshake. 1. The vm thread is not allowed to allocate on the java heap See for instance assertions in ParallelScavengeHeap::mem_allocate() https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258 This is not easy to change, I suppose, because it will be difficult to gc if necessary. 2. Using a direct handshake would not work either. The problem there is again gc. 
Let J be the JavaThread that is executing the direct handshake. The vm
would deadlock if the vm thread waits for J to execute the closure of a
handshake-all and J waits for the vm thread to execute a gc vm operation.
Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594

Cheers, Richard.

-----Original Message-----
From: Robbin Ehn
Sent: Wednesday, 2 September 2020 13:56
To: Reingruber, Richard
Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi,

I still don't understand why you don't deoptimize the objects inside the
handshake/safepoint instead?

E.g.

JvmtiEnv::GetOwnedMonitorInfo you should only need to execute the code
from:
eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the
stack, so:

void
GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
   assert(target->is_Java_thread(), "just checking");
   JavaThread *jt = (JavaThread *)target;

   if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
+    if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) {
       _result =
         ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt,
                                                    _owned_monitors_list);
     } else {
       _result = JVMTI_ERROR_OUT_OF_MEMORY;
     }
   }
}

Why try 'suspend' the thread first?


When we de-optimize all threads why not just in the following safepoint?
E.g.
VM_HeapWalkOperation::doit() {
+  EscapeBarrier::deoptimize_objects_all_threads();
   ...
}

Thanks, Robbin

From igor.ignatyev at oracle.com Wed Sep 2 14:45:25 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 2 Sep 2020 07:45:25 -0700
Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!"
In-Reply-To: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com>
References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com>
Message-ID: <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com>

Hi Harold,

vm.compMode != "Xcomp" will exclude all combinations which have Xcomp, e.g. '-Xcomp -XX:-TieredCompilation', so it would be a waste iff all other flags from Xcomp configurations are also used w/o Xcomp, which I don't think is generally true. given vm flags from jtreg actions are appended to external flags, running this test w/ '-Xcomp -XX:-TieredCompilation' (assuming there is no @requires) will actually result in two different (and one might argue expected) runs:

-Xcomp -XX:-TieredCompilation -Xint -Xss448K nsk.stress.stack.stack016 -eager
-Xcomp -XX:-TieredCompilation -Xcomp -Xss448K nsk.stress.stack.stack016 -eager

thus I don't think '@requires vm.compMode' is needed here. if you think that the '-Xint' run should be excluded when one specifies -Xcomp externally, I'd suggest splitting the test into two

/*
 * @test id=Xint
 * @requires vm.compMode == null | vm.compMode == "Xint"
 * @requires (vm.opt.DeoptimizeALot != true)
 * @library /vmTestbase
 * @build nsk.share.Terminator
 * @run main/othervm/timeout=900 -Xint -Xss448K nsk.stress.stack.stack016 -eager
 */

/*
 * @test id=Xcomp
 * @requires vm.compMode == null | vm.compMode == "Xcomp"
 * @requires (vm.opt.DeoptimizeALot != true)
 * @library /vmTestbase
 * @build nsk.share.Terminator
 * @run main/othervm/timeout=900 -Xcomp -Xss448K nsk.stress.stack.stack016 -eager
 */

in this case the Xint run will be run only if there is no Xint/Xmixed/Xbatch/Xcomp specified externally or Xint is specified, and similarly for Xcomp.

Thanks,
-- Igor

> On Sep 1, 2020, at 11:26 PM, David Holmes wrote:
>
> I would suggest that you keep the required condition
>
> vm.compMode != "Xcomp"
>
> to minimise the runs of this test.
> When executed as part of an Xcomp run both @run's will behave exactly
> the same way (Xcomp) - and that is the same as the second @run in a
> non-Xcomp run. So no point running the Xcomp version three times.

From bob.vandette at oracle.com Wed Sep 2 14:47:05 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 2 Sep 2020 10:47:05 -0400
Subject: Regression in JDK15/16: CGroup v2 support
In-Reply-To:
References: <260D8760-CB66-43FE-8870-1AD7A2A6336E@microsoft.com>
Message-ID: <0FE6C692-292C-4852-B956-3395325E9F39@oracle.com>

I wouldn't expect the Metrics APIs to have a problem. It looks for the file system type in the correct position.

CgroupV1Subsystem.java

try (Stream<String> lines =
         CgroupUtil.readFilePrivileged(Paths.get("/proc/self/mountinfo"))) {

    lines.filter(line -> line.contains(" - cgroup "))

CgroupV2Subsystem.java

try (Stream<String> lines =
         CgroupUtil.readFilePrivileged(Paths.get("/proc/self/mountinfo"))) {

    String l = lines.filter(line -> line.contains(" - cgroup2 "))

Bob.

> On Sep 2, 2020, at 5:44 AM, Severin Gehwolf wrote:
>
> Hi Bruno,
>
> On Tue, 2020-09-01 at 23:23 +1000, David Holmes wrote:
>> That was the JI number. It is now:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8252359
>
> I've done some investigation on this bug with Bob.
>
> Could you please check whether the Java API via Metrics.java is
> affected? My tests seem to indicate not, but it would be good if you
> could confirm. Easiest way to verify is by running:
>
> $ java -XshowSettings:system -version
>
> in a container with memory/cpu limits on your setup.
>
> Thanks,
> Severin
>

From robbin.ehn at oracle.com Wed Sep 2 14:54:13 2020
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 2 Sep 2020 16:54:13 +0200
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com>
Message-ID: <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com>

Hi Richard,

On 2020-09-02 15:48, Reingruber, Richard wrote:
> Hi Robbin,
>
> // taking the discussion back to the mailing lists
>
>  > I still don't understand why you don't deoptimize the objects inside the
>  > handshake/safepoint instead?

So for handshakes, using an async handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are supposed to be
safely suspended until the destructor of EB, no?

So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having this special safepointing mechanism.

Since you are missing the above functionality I see why you went this way.
If you need to push it, it's fine by me.

Thanks for explaining once again :)

/Robbin

>
> This is unfortunately not possible. Deoptimizing objects includes reallocating
> scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
> cannot be done at a safepoint or handshake.
>
> 1.
The vm thread is not allowed to allocate on the java heap
>    See for instance assertions in ParallelScavengeHeap::mem_allocate()
>    https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258
>
>    This is not easy to change, I suppose, because it will be difficult to gc if
>    necessary.
>
> 2. Using a direct handshake would not work either. The problem there is again
>    gc. Let J be the JavaThread that is executing the direct handshake. The vm
>    would deadlock if the vm thread waits for J to execute the closure of a
>    handshake-all and J waits for the vm thread to execute a gc vm operation.
>    Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594
>
> Cheers, Richard.
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Wednesday, 2 September 2020 13:56
> To: Reingruber, Richard
> Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi,
>
> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?
>
> E.g.
>
> JvmtiEnv::GetOwnedMonitorInfo you should only need to execute the code
> from:
> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the
> stack, so:
>
> void
> GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
>   assert(target->is_Java_thread(), "just checking");
>   JavaThread *jt = (JavaThread *)target;
>
>   if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
> +   if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) {
>       _result =
>         ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt,
>                                                   _owned_monitors_list);
>     } else {
>       _result = JVMTI_ERROR_OUT_OF_MEMORY;
>     }
>   }
> }
>
> Why try 'suspend' the thread first?
>
>
> When we de-optimize all threads why not just in the following safepoint?
> E.g.
> VM_HeapWalkOperation::doit() {
> +  EscapeBarrier::deoptimize_objects_all_threads();
>    ...
> }
>
> Thanks, Robbin
>
>

From richard.reingruber at sap.com Wed Sep 2 15:15:13 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 2 Sep 2020 15:15:13 +0000
Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check
Message-ID:

Hi,

please help review this fix for a race condition in JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended thread to continue executing java for an arbitrarily long time (see repro test attached to bug report).

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/
Bug: https://bugs.openjdk.java.net/browse/JDK-8252521

The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check() that checks if the current thread was suspended again after returning from java_suspend_self() and before restoring the original thread state. The check is performed after restoring the original state because then we are guaranteed to see the suspend request issued before the requester observed that target to be _thread_blocked and executed VM_ThreadSuspend.

Thanks, Richard.

From yumin.qi at oracle.com Wed Sep 2 16:17:43 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 2 Sep 2020 09:17:43 -0700
Subject: 8248337: sparc related code clean up after solaris removal
In-Reply-To: <7858cdf4-25cd-4e14-ec60-b3c256f59c9d@oracle.com>
References: <8476b337-f034-19ce-e502-313da7d048c7@oracle.com> <0e96ff53-2853-960f-3140-87d774345c05@oracle.com> <7858cdf4-25cd-4e14-ec60-b3c256f59c9d@oracle.com>
Message-ID: <9a5dc120-b7cc-6ce9-3346-77403e5e534d@oracle.com>

Hi David,

  Thanks for re-review!

Yumin

On 9/1/20 9:07 PM, David Holmes wrote:
> Hi Yumin,
>
> Update looks good. Thanks for filing the other RFEs.
>
> David
>
> On 2/09/2020 8:40 am, Yumin Qi wrote:
>> Hi Vladimir and David,
>>
>>     I have posted an updated webrev at: http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-02/
>>
>>     Filed two issues to address your concerns separately:
>>
>>     1) 8252681: Retire flag UseRDPCForConstantTableBase after solaris removal
>>
>> https://bugs.openjdk.java.net/browse/JDK-8252681
>>
>>     2) 8252682: investigate PhaseChaitin::post_allocate_copy_removal after solaris removal
>>
>> https://bugs.openjdk.java.net/browse/JDK-8252682
>>
>>
>>     So the following three files are left unchanged:
>>
>> share/opto/c2_globals.hpp
>>
>> share/opto/machnode.hpp
>>
>> share/opto/postaloc.cpp
>>
>>     Also updated the copyright year in several files.
>>
>>
>> Thanks
>>
>> Yumin
>>
>>
>> On 9/1/20 10:35 AM, Vladimir Kozlov wrote:
>>> On 8/31/20 6:18 PM, David Holmes wrote:
>>>> Hi Yumin,
>>>>
>>>> On 1/09/2020 7:32 am, Yumin Qi wrote:
>>>>> HI,
>>>>>
>>>>>    Please review for
>>>>>
>>>>>    bug: https://bugs.openjdk.java.net/browse/JDK-8248337
>>>>>
>>>>> webrev:http://cr.openjdk.java.net/~minqi/2020/8248337/webrev-01/
>>>>>
>>>>>
>>>>>    Summary: After Solaris supported files removed from repo, there are some remnants which needs cleaning up. Some comments are not correct, and some refer to wrong files.
>>>>
>>>> Those changes are mostly okay but I have a few minor issues/suggestions below.
>>>>
>>>>> There is a flag seems only useful for Sparc: UseRDPCForConstantTableBase, which got removed in this patch .
>>>>
>>>> Despite the description of the flag it is far from clear that the use of the flag affects sparc only. It affects the pinned() function so seems somewhat platform agnostic in that sense - which is why this was not dealt with in the SPARC removal process. I think this needs closer examination by the compiler folk, with a recommendation on whether it can/should be changed or not. Regardless as this is a product flag then I think this change should be factored out and we go through the appropriate deprecate/obsolete/expire process.
>>>
>>> The flag was used to use special SPARC instruction for CPUs supporting it to load base of Constant table.
>>> It is useless for other platforms. MachConstantBaseNode::pinned() method can be removed because it inherits the method from Node::pinned() which returns 'false' too.
>>>
>>> And I agree with David that it should be done separately because it is product flag.
>>>
>>>>
>>>>> Also in postaloc.cpp, the delay slot seems is only for sparc too, but I am not sure about that. Most of the patch are in comment section.
>>>>
>>>> It refers to spill slot not delay slot. I don't see anything obviously sparc specific about that block of code.
>>>
>>> Please, leave the code as it is. As David said it is about normal spill slots for all platforms.
>>> I am not sure it is SPARC specific currently with all platforms OpenJDK supports.
>>> If you want you can file RFE to replace code with assert and ask community to run a lot of testing to see if we hit the assert.
>>>
>>>>
>>>> Specific comments:
>>>>
>>>> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
>>>>
>>>> -// 64 bits items (sparc abi) even though java would only store
>>>> +// 64 bits items even though java would only store
>>>>
>>>> Should "(sparc abi)" be replaced with "(Aarch64 abi)" as you did for other platforms?
>>>>
>>>> ---
>>>>
>>>> src/hotspot/cpu/arm/frame_arm.hpp (and other files)
>>>>
>>>>    // The interpreter and adapters will extend the frame of the caller.
>>>>    // Since oopMaps are based on the sp of the caller before extension
>>>> -  // we need to know that value. However in order to compute the address
>>>> -  // of the return address we need the real "raw" sp. Since sparc already
>>>> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
>>>> -  // original sp we use that convention.
>>>> +  // we need to know that value. However in order to compute the return
>>>> +  // address we need the real "raw" sp.
>>>>
>>>> I think this is losing too much information as it no longer describes the convention. I would suggest:
>>>>
>>>>    // The interpreter and adapters will extend the frame of the caller.
>>>>    // Since oopMaps are based on the sp of the caller before extension
>>>>    // we need to know that value. However in order to compute the address
>>>> -  // of the return address we need the real "raw" sp. Since sparc already
>>>> -  // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's
>>>> -  // original sp we use that convention.
>>>> +  // of the return address we need the real "raw" sp. By convention we
>>>> +  // use sp() to mean "raw" sp and unextended_sp() to mean the caller's
>>>> +  // original sp.
>>>>
>>>> ---
>>>>
>>>> src/hotspot/cpu/ppc/jniTypes_ppc.hpp
>>>>
>>>> -  // stubGenerator_sparc.cpp) reverse the argument list constructed by
>>>> +  // stubGenerator_${CPU}.cpp) reverse the argument list constructed by
>>>>
>>>> Just replace sparc with ppc as done for other platforms.
>>>>
>>>> ---
>>>>
>>>> src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
>>>>
>>>> -  // This greatly simplifies the cases here compared to sparc.
>>>> +  // This greatly simplifies the cases here.
>>>>
>>>> Just delete the comment as there is nothing to compare simplicity or complexity against.
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/c1/c1_LIRGenerator.cpp
>>>>
>>>> -  // In 64bit the type can be long, sparc doesn't have this assert
>>>> +  // In 64bit the type can be long
>>>>    // assert(offset.type()->tag() == intTag, "invalid type");
>>>>
>>>> compiler folk should decide what to do here but I think the comment and commented out assert can just be deleted.
>>>
>>> Yes, remove commented assert too. Originally it was platform specific code - the assert was there for 32-bit.
>>>
>>> Thanks,
>>> Vladimir K
>>>
>>>>
>>>> ---
>>>>
>>>> src/hotspot/share/c1/c1_Runtime1.cpp
>>>>
>>>> -  case handle_exception_nofpu_id:  // Unused on sparc
>>>> +  case handle_exception_nofpu_id:  // unused.
>>>>
>>>> the new comment is incorrect as this case is not unused. I suggest just deleting the comment.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>
>>>>>
>>>>>    Tests passed tier1-4
>>>>>
>>>>>
>>>>>    Thanks
>>>>>
>>>>>    Yumin
>>>>>

From bob.vandette at oracle.com Wed Sep 2 17:54:09 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 2 Sep 2020 13:54:09 -0400
Subject: RFR: 8252359 - HotSpot Not Identifying it is Running in a Container
Message-ID: <114257D0-5401-4E8C-BD09-87F109D74C5D@oracle.com>

Problem:

Hotspot does not properly detect that it's running in a container on Mac and Windows docker desktop based containers.

BUG:

https://bugs.openjdk.java.net/browse/JDK-8252359

WEBREV:

http://cr.openjdk.java.net/~bobv/8252359/webrev.01

CAUSE:

The problem is caused by improper scanning of /proc/mountinfo.

Failing System
439 434 0:33 /docker/0a67832faae434cc8fa0a942e04a308bb8b120b5099018f1ee09f1d7a52746b7 /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime master:22 - cgroup memory rw,memory

Working System
36 31 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory

When the cgroupv2 support was added [2] we started scanning the second "cgroup" entry in /proc/mountinfo as the file system type. According to the man pages, the first "cgroup" is the file system type. Entry position 10 in mountinfo [1] is "mount source: filesystem-specific information". All systems we've tested on thus far have had two cgroup strings so this problem was not caught.

The fix is to scan the proper file system type entry and ignore the second entry. I also updated the container test to validate that we do indeed look to match the correct token in the mountinfo string.

TESTING:

Ran the container tests on cgroupv1 and cgroupv2 systems along with the original failure configuration (Docker on Mac).

Bob.
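
[Editorial note] The scanning rule described above can be illustrated with a small Java sketch. This is not the HotSpot code from the webrev (which is C++); the class name MountInfoFsType and the helpers fsType and isCgroupV1Mount are hypothetical names made up for this illustration. The only assumption taken from the message is that the file system type is the first token after the " - " separator and that the following token (the mount source) should be ignored.

```java
import java.util.Optional;

public class MountInfoFsType {

    /**
     * Returns the file system type of one mountinfo line, i.e. the first
     * token after the " - " separator, or empty if the line has none.
     * (Hypothetical helper, for illustration only.)
     */
    static Optional<String> fsType(String mountInfoLine) {
        int sep = mountInfoLine.indexOf(" - ");
        if (sep < 0) {
            return Optional.empty();
        }
        String[] tail = mountInfoLine.substring(sep + 3).split(" ");
        if (tail.length == 0 || tail[0].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(tail[0]);
    }

    /** True only when the file system type itself is "cgroup"; the mount source is ignored. */
    static boolean isCgroupV1Mount(String mountInfoLine) {
        return fsType(mountInfoLine).map("cgroup"::equals).orElse(false);
    }

    public static void main(String[] args) {
        // The two sample lines quoted in the message above.
        String failing = "439 434 0:33 /docker/0a67832faae434cc8fa0a942e04a308bb8b120b5099018f1ee09f1d7a52746b7 "
                + "/sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime master:22 - cgroup memory rw,memory";
        String working = "36 31 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime "
                + "shared:15 - cgroup cgroup rw,memory";

        System.out.println(fsType(failing).orElse("<none>") + " " + isCgroupV1Mount(failing));
        System.out.println(fsType(working).orElse("<none>") + " " + isCgroupV1Mount(working));
    }
}
```

With this rule both sample lines are classified the same way, because the differing mount source ("memory" vs "cgroup") is never consulted.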
[1] https://man7.org/linux/man-pages/man5/proc.5.htm
[2] https://bugs.openjdk.java.net/browse/JDK-8230305

From harold.seigel at oracle.com Wed Sep 2 20:21:44 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Wed, 2 Sep 2020 16:21:44 -0400
Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!"
In-Reply-To: <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com>
References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com>
Message-ID:

Hi Igor,

I think that the test needs to only run twice, once with -Xcomp and once with -Xint. It does not need to run with multiple -Xcomp configurations. It needs only to run with consistent stack sizes, so that stack consumption doesn't vary because some threads run an interpreted method and others run a compiled method.

My understanding is that vm.compMode != "Xcomp" will cause the test to be run without -Xcomp. The test will then run using its internal -Xint flag and then its -Xcomp flags. Which I think is what we want.

Thanks, Harold

On 9/2/2020 10:45 AM, Igor Ignatyev wrote:
> Hi Harold,
>
> vm.compMode != "Xcomp" will exclude all combinations which has Xcomp,
> e.g. '-Xcomp -XX:-TieredCompilation', so it would be waste iff all
> other flags from Xcomp configurations are also used w/o Xcomp, which I
> don't think is generally true.
>
> given vm flags from jtreg actions are appended to external flags,
> running this test w/ '-Xcomp -XX:-TieredCompilation' (assuming there
> is no @requires) will actually result in two different (and one might
> argue expected) runs:
>  -Xcomp -XX:-TieredCompilation -Xint -Xss448K nsk.stress.stack.stack016 -eager
>  -Xcomp -XX:-TieredCompilation -Xcomp -Xss448K nsk.stress.stack.stack016 -eager
>
> thus I don't think '@requires vm.compMode' is needed here.
>
> if you think that '-Xint' run should be excluded when one specifies
> -Xcomp externally, I'd suggest splitting the test into two
>
> /*
>  * @test id=Xint
>  * @requires vm.compMode == null | vm.compMode == "Xint"
>  * @requires (vm.opt.DeoptimizeALot != true)
>  * @library /vmTestbase
>  * @build nsk.share.Terminator
>  * @run main/othervm/timeout=900 -Xint -Xss448K nsk.stress.stack.stack016 -eager
>  */
>
> /*
>  * @test id=Xcomp
>  * @requires vm.compMode == null | vm.compMode == "Xcomp"
>  * @requires (vm.opt.DeoptimizeALot != true)
>  * @library /vmTestbase
>  * @build nsk.share.Terminator
>  * @run main/othervm/timeout=900 -Xcomp -Xss448K nsk.stress.stack.stack016 -eager
>  */
>
> in this case Xint run will be run only if there is no
> Xint/Xmixed/Xbatch/Xcomp specified externally or Xint is specified,
> and similarly for Xcomp.
>
> Thanks,
> -- Igor
>
>
>
>> On Sep 1, 2020, at 11:26 PM, David Holmes wrote:
>>
>> I would suggest that you keep the required condition
>>
>> vm.compMode != "Xcomp"
>>
>> to minimise the runs of this test. When executed as part of an Xcomp
>> run both @run's will behave exactly the same way (Xcomp) - and that
>> is the same as the second @run in a non-Xcomp run. So no point
>> running the Xcomp version three times.
>

From igor.ignatyev at oracle.com Wed Sep 2 20:33:12 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 2 Sep 2020 13:33:12 -0700
Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!"
In-Reply-To: References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com> Message-ID: <491BDA6D-46A1-4859-9465-38171C083B7F@oracle.com> Hi Harold, currently, the configurations which a test is run defined by test execution configuration (which is tiers/tasks-definitions in Oracle's case, and I'd guess others their own configuration matrixes), not a test (tests are, however, responsible for filtering out incompatible/meaningless configurations). this particular test is included in runs w/ tens different flag combinations, w/ around a half of them having -Xcomp; and `vm.compMode != "Xcomp"` effectively removes the test from all these runs. Thanks, -- Igor > On Sep 2, 2020, at 1:21 PM, Harold Seigel wrote: > > Hi Igor, > > I think that the test needs to only run twice, once with -Xcomp and once with -Xint. It does not need to run with multiple -Xcomp configurations. It needs only to run with consistent stack sizes, so that stack consumption doesn't vary because some threads run an interpreted method and others run a compiled method. > > My understanding is that "vm.compMode != "Xcomp" will cause the test to be run without -Xcomp. The test will then run using its internal -Xint flag and then its -Xcomp flags. Which I think is what we want. > > Thanks, Harold > > On 9/2/2020 10:45 AM, Igor Ignatyev wrote: >> Hi Harold, >> >> vm.compMode != "Xcomp" will exclude all combinations which has Xcomp, e.g. '-Xcomp -XX:-TieredCompilation', so it would be waste iff all other flags from Xcomp configurations are also used w/o Xcomp, which I don't think is generally true. 
>> >> given vm flags from jtreg actions are appended to external flags, running this test w/ '-Xcomp -XX:-TieredCompilation' (assuming there is no @requires) will actually result in two different (and one might argue expected) runs: >> -Xcomp -XX:-TieredCompilation -Xint -Xss448K nsk.stress.stack.stack016 -eager >> -Xcomp -XX:-TieredCompilation -Xcomp -Xss448K nsk.stress.stack.stack016 -eager >> >> thus I don't think '@requires vm.compMode' is needed here. >> >> if you think that '-Xint' run should be excluded when one specifies -Xcomp externally, I'd suggest splitting the test into two >> >> /* >> * @test id=Xint >> * @requires vm.compMode == null | vm.compMode == "Xint" >> * @requires (vm.opt.DeoptimizeALot != true) >> * @library /vmTestbase >> * @build nsk.share.Terminator >> * @run main/othervm/timeout=900 -Xint -Xss448K nsk.stress.stack.stack016 -eager >> */ >> >> /* >> * @test id=Xcomp >> * @requires vm.compMode == null | vm.compMode == "Xcomp" >> * @requires (vm.opt.DeoptimizeALot != true) >> * @library /vmTestbase >> * @build nsk.share.Terminator >> * @run main/othervm/timeout=900 -Xcomp -Xss448K nsk.stress.stack.stack016 -eager >> */ >> >> in this case Xint run will be run only if there is no Xint/Xmixed/Xbatch/Xcomp specified externally or Xint is specified, and similarly for Xcomp. >> >> Thanks, >> -- Igor >> >> >> >>> On Sep 1, 2020, at 11:26 PM, David Holmes > wrote: >>> >>> I would suggest that you keep the required condition >>> >>> vm.compMode != "Xcomp" >>> >>> to minimise the runs of this test. When executed as part of an Xcomp run both @run's will behave exactly the same way (Xcomp) - and that is the same as the second @run in a non-Xcomp run. So no point running the Xcomp version three times. 
>> From richard.reingruber at sap.com Wed Sep 2 21:26:52 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 2 Sep 2020 21:26:52 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> Message-ID: Hi Robin, > On 2020-09-02 15:48, Reingruber, Richard wrote: > > Hi Robbin, > > > > // taking the discussion back to the mailing lists > > > > > I still don't understand why you don't deoptimize the objects inside the > > > handshake/safepoint instead? > So for handshakes using asynch handshake and allowing blocking inside > would fix that. (future fix, I'm working on that now) Just to make it clear: I'm not fond of the extra suspension mechanism currently used for JDK-8227745 either. I want to get rid of it and I will work on it. Asynch handshakes (JDK-8238761) could be a replacement for it. At least I think they can be used to suspend the target thread. > For safepoint, since we have suspended all threads, ~'safepointed them' > with a JavaThread, you _could_ just execute the action directly (e.g. > skipping VM_HeapWalkOperation safepoint) since they are suppose to be > safely suspended until the destructor of EB, no? Yes, this should be possible. This would be an advanced change though. I would like EscapeBarriers to be a no-op and fall back to current implementation, if C2-EscapeAnalysis/Graal are disabled. > So I suggest future work to instead just execute the safepoint with the > requesting JT instead of having a this special safepoiting mechanism. > Since you are missing above functionality I see why you went this way. > If you need to push it, it's fine by me. We will work on further improvements. Top of the list would be eliminating the extra suspend mechanism. 
The implementation has matured for more than 12 months now [1]. It's been tested extensively at SAP over that time and passed also extended testing at Oracle kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated extensive feedback from a number of OpenJDK Reviewers (including you, thanks!). Based on that I reckon we're good to push the change as enhancement (JDK-8227745) and bug fix (JDK-8233915). > Thanks for explaining once again :) Pleasure :) Thanks, Richard. [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html -----Original Message----- From: Robbin Ehn Sent: Mittwoch, 2. September 2020 16:54 To: Reingruber, Richard ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, On 2020-09-02 15:48, Reingruber, Richard wrote: > Hi Robbin, > > // taking the discussion back to the mailing lists > > > I still don't understand why you don't deoptimize the objects inside the > > handshake/safepoint instead? So for handshakes using asynch handshake and allowing blocking inside would fix that. (future fix, I'm working on that now) For safepoint, since we have suspended all threads, ~'safepointed them' with a JavaThread, you _could_ just execute the action directly (e.g. skipping VM_HeapWalkOperation safepoint) since they are suppose to be safely suspended until the destructor of EB, no? So I suggest future work to instead just execute the safepoint with the requesting JT instead of having a this special safepoiting mechanism. Since you are missing above functionality I see why you went this way. If you need to push it, it's fine by me. Thanks for explaining once again :) /Robbin > > This is unfortunately not possible. Deoptimizing objects includes reallocating > scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). 
This
> cannot be done at a safepoint or handshake.
>
> 1. The vm thread is not allowed to allocate on the java heap
>    See for instance assertions in ParallelScavengeHeap::mem_allocate()
>    https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258
>
>    This is not easy to change, I suppose, because it will be difficult to gc if
>    necessary.
>
> 2. Using a direct handshake would not work either. The problem there is again
>    gc. Let J be the JavaThread that is executing the direct handshake. The vm
>    would deadlock if the vm thread waits for J to execute the closure of a
>    handshake-all and J waits for the vm thread to execute a gc vm operation.
>    Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594
>
> Cheers, Richard.
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Wednesday, 2 September 2020 13:56
> To: Reingruber, Richard
> Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi,
>
> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?
>
> E.g.
> > JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code > from: > eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the > stack, so: > > void > GetOwnedMonitorInfoClosure::do_thread(Thread *target) { > assert(target->is_Java_thread(), "just checking"); > JavaThread *jt = (JavaThread *)target; > > if (!jt->is_exiting() && (jt->threadObj() != NULL)) { > + if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) { > _result = > ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, > _owned_monitors_list); > } else { > _result = JVMTI_ERROR_OUT_OF_MEMORY; > } > } > } > > Why try 'suspend' the thread first? > > > When we de-optimize all threads why not just in the following safepoint? > E.g. > VM_HeapWalkOperation::doit() { > + EscapeBarrier::deoptimize_objects_all_threads(); > ... > } > > Thanks, Robbin > > From david.holmes at oracle.com Wed Sep 2 21:58:22 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Sep 2020 07:58:22 +1000 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" In-Reply-To: <491BDA6D-46A1-4859-9465-38171C083B7F@oracle.com> References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com> <491BDA6D-46A1-4859-9465-38171C083B7F@oracle.com> Message-ID: <4512886a-0246-94c3-06d8-c15d7e93a921@oracle.com> Hi Igor, On 3/09/2020 6:33 am, Igor Ignatyev wrote: > Hi Harold, > > currently, the configurations which a test is run defined by test > execution configuration (which is tiers/tasks-definitions in Oracle's > case, and I'd guess others their own configuration matrixes), not a test > (tests are, however, responsible for filtering out > incompatible/meaningless configurations). 
this particular test is
> included in runs w/ tens different flag combinations, w/ around a half
> of them having -Xcomp; and `vm.compMode != "Xcomp"` effectively removes
> the test from all these runs.

Note that the test already doesn't run in those cases as it already has the `vm.compMode != "Xcomp"`.

But I agree that your suggestion would expose the test to more Xcomp testing configurations, which is potentially a good thing. The only concern is that we don't yet know how the test will behave in those different Xcomp configurations.

David

> Thanks,
> -- Igor
>
>> On Sep 2, 2020, at 1:21 PM, Harold Seigel wrote:
>>
>> Hi Igor,
>>
>> I think that the test needs to only run twice, once with -Xcomp and
>> once with -Xint. It does not need to run with multiple -Xcomp
>> configurations. It needs only to run with consistent stack sizes, so
>> that stack consumption doesn't vary because some threads run an
>> interpreted method and others run a compiled method.
>>
>> My understanding is that vm.compMode != "Xcomp" will cause the test
>> to be run without -Xcomp. The test will then run using its internal
>> -Xint flag and then its -Xcomp flags. Which I think is what we want.
>>
>> Thanks, Harold
>>
>> On 9/2/2020 10:45 AM, Igor Ignatyev wrote:
>>> Hi Harold,
>>>
>>> vm.compMode != "Xcomp" will exclude all combinations which has Xcomp,
>>> e.g. '-Xcomp -XX:-TieredCompilation', so it would be waste iff all
>>> other flags from Xcomp configurations are also used w/o Xcomp, which
>>> I don't think is generally true.
>>>
>>> given vm flags from jtreg actions are appended to external flags,
>>> running this test w/ '-Xcomp -XX:-TieredCompilation' (assuming there
>>> is no @requires) will actually result in two different (and one might
>>> argue expected) runs:
>>>  -Xcomp -XX:-TieredCompilation -Xint -Xss448K nsk.stress.stack.stack016 -eager
>>>  -Xcomp -XX:-TieredCompilation -Xcomp -Xss448K nsk.stress.stack.stack016 -eager
>>>
>>> thus I don't think '@requires vm.compMode' is needed here.
>>>
>>> if you think that '-Xint' run should be excluded when one specifies
>>> -Xcomp externally, I'd suggest splitting the test into two
>>>
>>> /*
>>>  * @test id=Xint
>>>  * @requires vm.compMode == null | vm.compMode == "Xint"
>>>  * @requires (vm.opt.DeoptimizeALot != true)
>>>  * @library /vmTestbase
>>>  * @build nsk.share.Terminator
>>>  * @run main/othervm/timeout=900 -Xint -Xss448K nsk.stress.stack.stack016 -eager
>>>  */
>>>
>>> /*
>>>  * @test id=Xcomp
>>>  * @requires vm.compMode == null | vm.compMode == "Xcomp"
>>>  * @requires (vm.opt.DeoptimizeALot != true)
>>>  * @library /vmTestbase
>>>  * @build nsk.share.Terminator
>>>  * @run main/othervm/timeout=900 -Xcomp -Xss448K nsk.stress.stack.stack016 -eager
>>>  */
>>>
>>> in this case Xint run will be run only if there is no
>>> Xint/Xmixed/Xbatch/Xcomp specified externally or Xint is specified,
>>> and similarly for Xcomp.
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>
>>>
>>>> On Sep 1, 2020, at 11:26 PM, David Holmes wrote:
>>>>
>>>> I would suggest that you keep the required condition
>>>>
>>>> vm.compMode != "Xcomp"
>>>>
>>>> to minimise the runs of this test. When executed as part of an Xcomp
>>>> run both @run's will behave exactly the same way (Xcomp) - and that
>>>> is the same as the second @run in a non-Xcomp run. So no point
>>>> running the Xcomp version three times.
>>> > From david.holmes at oracle.com Wed Sep 2 22:14:22 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Sep 2020 08:14:22 +1000 Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check In-Reply-To: References: Message-ID: <8b5a6c50-e860-d138-8db4-ae378239bdb5@oracle.com> Hi Richard, This fix looks good to me. Thanks, David On 3/09/2020 1:15 am, Reingruber, Richard wrote: > Hi, > > please help review this fix for a race condition in > JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended > thread to continue executing java for an arbitrary long time (see repro test > attached to bug report). > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8252521 > > The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check() > that checks if the current thread was suspended again after returning from > java_suspend_self() and before restoring the original thread state. The check is > performed after restoring the original state because then we are guaranteed to > see the suspend request issued before the requester observed that target to be > _thread_blocked and executed VM_ThreadSuspend. > > Thanks, Richard. > From igor.ignatyev at oracle.com Wed Sep 2 22:16:20 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 2 Sep 2020 15:16:20 -0700 Subject: RFR 8252249: nsk/stress/stack/stack016.java fails with "Error: TEST_BUG: trickyRecursion() must throw an error anyway!" 
In-Reply-To: <4512886a-0246-94c3-06d8-c15d7e93a921@oracle.com>
References: <461ea084-e654-a49b-e6db-a2abad121572@oracle.com> <1EA617E1-97C7-4548-9B6B-FB0612407066@oracle.com> <491BDA6D-46A1-4859-9465-38171C083B7F@oracle.com> <4512886a-0246-94c3-06d8-c15d7e93a921@oracle.com>
Message-ID: <9DDB742F-8CCD-431B-8283-A67BED99B552@oracle.com>

Hi David,

> On Sep 2, 2020, at 2:58 PM, David Holmes wrote:
>
> Hi Igor,
>
> On 3/09/2020 6:33 am, Igor Ignatyev wrote:
>> Hi Harold,
>> currently, the configurations which a test is run defined by test execution configuration (which is tiers/tasks-definitions in Oracle's case, and I'd guess others their own configuration matrixes), not a test (tests are, however, responsible for filtering out incompatible/meaningless configurations). this particular test is included in runs w/ tens different flag combinations, w/ around a half of them having -Xcomp; and `vm.compMode != "Xcomp"` effectively removes the test from all these runs.
>
> Note that the test already doesn't run in those cases as it already has the `vm.compMode != "Xcomp"`.

right, from that perspective the patch has actually increased the number of Xcomp configurations.

> But I agree that your suggestion would expose the test to more Xcomp testing configurations, which is potentially a good thing. The only concern is that we don't yet know how the test will behave in those different Xcomp configurations.

that's a valid concern, yet it's a concern w/ any change, and there are two ways to go about it:
 - run the test once in each (Xcomp) configuration
 - run the test N times in each (Xcomp) configuration (w/ N being big enough to provide a statistically significant amount of data to show stability of the test)

in both cases, if the results are good, you push, observe the results from regular testing and react accordingly; if the results aren't so good, you fix and reiterate.
from my experience, running the test a huge number of times in each "regularly" used configuration is a waste of machine time, as you will never get 100% certainty that the test is stable, and while you are testing, the product will get changed, so all the data you collected is, strictly speaking, irrelevant. anyhow, I see that 8252249 has already been integrated, so I've filed JDK-8252723 to make this test executed in Xcomp configs, and we can discuss the best approach to test the test there. Thanks, -- Igor > > David > >> Thanks, >> -- Igor >>> On Sep 2, 2020, at 1:21 PM, Harold Seigel wrote: >>> >>> Hi Igor, >>> >>> I think that the test needs to only run twice, once with -Xcomp and once with -Xint. It does not need to run with multiple -Xcomp configurations. It needs only to run with consistent stack sizes, so that stack consumption doesn't vary because some threads run an interpreted method and others run a compiled method. >>> >>> My understanding is that `vm.compMode != "Xcomp"` will cause the test to be run without -Xcomp. The test will then run using its internal -Xint flag and then its -Xcomp flags. Which I think is what we want. >>> >>> Thanks, Harold >>> >>> On 9/2/2020 10:45 AM, Igor Ignatyev wrote: >>>> Hi Harold, >>>> >>>> vm.compMode != "Xcomp" will exclude all combinations which have Xcomp, e.g. '-Xcomp -XX:-TieredCompilation', so it would be a waste iff all other flags from Xcomp configurations are also used w/o Xcomp, which I don't think is generally true. >>>> >>>> given vm flags from jtreg actions are appended to external flags, running this test w/ '-Xcomp -XX:-TieredCompilation' (assuming there is no @requires) will actually result in two different (and one might argue expected) runs: >>>> -Xcomp -XX:-TieredCompilation -Xint -Xss448K nsk.stress.stack.stack016 -eager >>>> -Xcomp -XX:-TieredCompilation -Xcomp -Xss448K nsk.stress.stack.stack016 -eager >>>> >>>> thus I don't think '@requires vm.compMode' is needed here.
>>>> >>>> if you think that '-Xint' run should be excluded when one specifies -Xcomp externally, I'd suggest splitting the test into two >>>> >>>> /* >>>> * @test id=Xint >>>> * @requires vm.compMode == null | vm.compMode == "Xint" >>>> * @requires (vm.opt.DeoptimizeALot != true) >>>> * @library /vmTestbase >>>> * @build nsk.share.Terminator >>>> * @run main/othervm/timeout=900 -Xint -Xss448K nsk.stress.stack.stack016 -eager >>>> */ >>>> >>>> /* >>>> * @test id=Xcomp >>>> * @requires vm.compMode == null | vm.compMode == "Xcomp" >>>> * @requires (vm.opt.DeoptimizeALot != true) >>>> * @library /vmTestbase >>>> * @build nsk.share.Terminator >>>> * @run main/othervm/timeout=900 -Xcomp -Xss448K nsk.stress.stack.stack016 -eager >>>> */ >>>> >>>> in this case Xint run will be run only if there is no Xint/Xmixed/Xbatch/Xcomp specified externally or Xint is specified, and similarly for Xcomp. >>>> >>>> Thanks, >>>> -- Igor >>>> >>>> >>>> >>>>> On Sep 1, 2020, at 11:26 PM, David Holmes > wrote: >>>>> >>>>> I would suggest that you keep the required condition >>>>> >>>>> vm.compMode != "Xcomp" >>>>> >>>>> to minimise the runs of this test. When executed as part of an Xcomp run both @run's will behave exactly the same way (Xcomp) - and that is the same as the second @run in a non-Xcomp run. So no point running the Xcomp version three times. >>>> From david.holmes at oracle.com Thu Sep 3 00:20:19 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Sep 2020 10:20:19 +1000 Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions In-Reply-To: References: <442caa21-ca0a-f6eb-60a5-1e74bf994894@oracle.com> <03df9364-817d-04d6-6434-80be93a66526@oracle.com> Message-ID: <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> Hi Jamsheed, On 1/09/2020 10:36 pm, Jamsheed C M wrote: > Hi David, > > I reworked the patch, revised webrev here: > http://cr.openjdk.java.net/~jcm/8249451/webrev.01/ Thanks. 
The new macros and injected field for InternalError look good. A couple of minor comments below but overall this looks good to me. > In addition I moved UnlockFlagSaver fs(this) to more local scope. > > also removed changes done for JDK-8246727, as it will be separately > handled by the bug. > > Testing: injected and tested async exceptions randomly at compilation > request path and deopt path. I noticed in deoptimization.cpp that here: 1965 load_class_by_index(constants, unloaded_class_index, THREAD); we can now return with a pending async exception and it is unclear whether the code following this will be able to handle that, or indeed whether the caller will be able to handle it. Did you specifically test this site? --- src/hotspot/share/jvmci/jvmciRuntime.cpp The comment at: 80 // 1. The pending exception is cleared should be updated now that asyncs are not cleared. --- src/hotspot/share/compiler/tieredThresholdPolicy.* The changes from JavaThread* to Thread* look unnecessary for 90% of the cases, but the overall change seems to be dictated by the few methods that do use CHECK*. :( No point agonising over this now as I'm trying to deal with this general problem as a separate RFE - JDK-8252685. Thanks, David ----- > Best regards, > > Jamsheed > > On 24/08/2020 11:06, Jamsheed C M wrote: >> Hi David, >> >> Thank you for the review and feedback. Agree on all of them. I will >> rework and get back. >> >> On 10/08/2020 07:33, David Holmes wrote: >>> Hi Jamsheed, >>> >>> On 6/08/2020 10:07 pm, Jamsheed C M wrote: >>>> Hi all, >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249451 >>>> >>>> webrev: http://cr.openjdk.java.net/~jcm/8249451/webrev.00/ >>> >>> Thanks for tackling this messy issue. Overall I like the use of TRAPS >>> to more clearly document which methods can return with an exception >>> pending. I think there are some problems with the proposed changes. >>> I'll start with those comments and then move on to more general >>> comments. 
>>> >>> src/hotspot/share/utilities/exceptions.cpp >>> src/hotspot/share/utilities/exceptions.hpp >>> >>> I don't think the changes here are correct or safe in general. >>> >>> First, adding the new macro and function to only clear non-async >>> exceptions is fine itself. But naming wise the fact only non-async >>> exceptions are cleared should be evident, and there is no "check" >>> involved (in the sense of the existing CHECK_ macros) so I suggest: >>> >>> s/CHECK_CLEAR_PENDING_EXCEPTION/CLEAR_PENDING_NONASYNC_EXCEPTIONS/ >>> s/check_clear_pending_exception/clear_pending_nonasync_exceptions/ >>> >> Ok >>> But changing the existing CHECK_AND_CLEAR macros to now leave async >>> exceptions pending seems potentially dangerous as calling code may >>> not be prepared for there to now be a pending exception. For example >>> the use in thread.cpp: >>> >>> JDK_Version::set_runtime_name(get_java_runtime_name(THREAD)); >>> JDK_Version::set_runtime_version(get_java_runtime_version(THREAD)); >>> >>> get_java_runtime_name() is currently guaranteed to clear all >>> exceptions, so all the other code is known to be safe to call. But >>> that would no longer be true. That said, this is VM initialization >>> code and an async exception is impossible at this stage. >>> >>> I think I would rather see CHECK_AND_CLEAR left as-is, and an actual >>> CHECK_AND_CLEAR_NONASYNC introduced for those users of >>> CHECK_AND_CLEAR that can encounter async exceptions and which should >>> not clear them. >>> >>> +   if >>> (!_pending_exception->is_a(SystemDictionary::ThreadDeath_klass()) && >>> +       _pending_exception->klass() != >>> SystemDictionary::InternalError_klass()) { >>> >> Ok >>> Flagging all InternalErrors as async exceptions is probably also not >>> correct. I don't see a good solution to this at the moment. I think >>> we would need to introduce a new subclass of InternalError for the >>> unsafe access error case**.
Now it may be that all the other >>> InternalError usages are "impossible" in the context of where the new >>> macros are to be used, but that is very difficult to establish or >>> assert. >>> >>> ** Or perhaps we could inject a field that allows the VM to identify >>> instances related to unsafe access errors ... Ideally of course these >>> unsafe access errors would be distinct from the async exception >>> mechanism - something I would still like to pursue. >>> >> Ok >>> --- >>> >>> General comments ... >>> >>> There is a general change from "JavaThread* thread" to "Thread* >>> THREAD" (or TRAPS) to allow the use of the CHECK macros. This is >>> unfortunate because the fact the thread is restricted to being a >>> JavaThread is no longer evident in the method signatures. That is a >>> flaw with the TRAPS/CHECK mechanism unfortunately :( . But as the >>> methods no longer take a JavaThread* arg, they should assert that >>> THREAD->is_Java_thread(). I will also look at an RFE to have >>> as_JavaThread() to avoid the need for separate assertion checks >>> before casting from "Thread*" to "JavaThread*". >>> >> Ok >>> Note there's no need to use CHECK when the enclosing method is going >>> to return immediately after the call that contains the CHECK. It just >>> adds unnecessary checking of the exception state. The use of TRAPS >>> shows that the methods may return with an exception pending. I've >>> flagged all such occurrences I spotted below. >>> >> Ok >>> --- >>> >>> +   // Only metaspace OOM is expected. no Java code executed. >>> >>> Nit: s/no/No >>> >>> >>> src/hotspot/share/compiler/compilationPolicy.cpp >>> >>> >>> 410 method_invocation_event(method, CHECK_NULL); >>> 489 CompileBroker::compile_method(m, InvocationEntryBci, >>> comp_level, m, hot_count, CompileTask::Reason_InvocationCount, CHECK); >>> >>> Nit: there's no need to use CHECK here. >>> >>> --- >>> >>> src/hotspot/share/compiler/tieredThresholdPolicy.cpp >>> >>> 504
method_invocation_event(method, inlinee, comp_level, nm, >>> CHECK_NULL); >>> 570 compile(mh, bci, CompLevel_simple, CHECK); >>> 581 compile(mh, bci, CompLevel_simple, CHECK); >>> 595 CompileBroker::compile_method(mh, bci, level, mh, hot_count, >>> CompileTask::Reason_Tiered, CHECK); >>> 1062 compile(mh, InvocationEntryBci, next_level, CHECK); >>> >>> Nit: there's no need to use CHECK here. >>> >>> 814 void TieredThresholdPolicy::create_mdo(const methodHandle& mh, >>> Thread* THREAD) { >>> >>> Thank you for correcting this misuse of the THREAD name on a >>> JavaThread* type. >>> >>> --- >>> >>> src/hotspot/share/interpreter/linkResolver.cpp >>> >>> 128 CompilationPolicy::compile_if_required(selected_method, CHECK); >>> >>> Nit: there's no need to use CHECK here. >>> >>> --- >>> >>> src/hotspot/share/jvmci/compilerRuntime.cpp >>> >>> 260 CompilationPolicy::policy()->event(emh, mh, >>> InvocationEntryBci, InvocationEntryBci, CompLevel_aot, cm, CHECK); >>> 280 nmethod* osr_nm = CompilationPolicy::policy()->event(emh, >>> mh, branch_bci, target_bci, CompLevel_aot, cm, CHECK); >>> >>> Nit: there's no need to use CHECK here. >>> >>> --- >>> >>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>> >>> 102 // Donot clear probable async exceptions. >>> >>> typo: s/Donot/Do not/ >>> >>> --- >>> >>> src/hotspot/share/runtime/deoptimization.cpp >>> >>> 1686 void Deoptimization::load_class_by_index(const >>> constantPoolHandle& constant_pool, int index) { >>> >>> This method should be declared with TRAPS now. >>> >>> 1693 // Donot clear probable Async Exceptions. >>> >>> typo: s/Donot/Do not/ >>> >>> >> Ok >>>> testing : mach1-5(links in jbs) >>> >>> There is very little existing testing that will actually test the key >>> changes you have made here. You will need to do direct >>> fault-injection testing anywhere you now allow async exceptions to >>> remain, to see if the calling code can tolerate that.
It will be >>> difficult to test thoroughly. >>> >> Ok >>> Thanks again for tackling this difficult problem! >> >> Best regards, >> >> Jamsheed >> >>> >>> David >>> ----- >>> >>>> >>>> While working on JDK-8246381 it was noticed that compilation request >>>> path clears all exceptions(including async) and doesn't propagate[1]. >>>> >>>> Fix: patch restores the propagation behavior for the probable async >>>> exceptions. >>>> >>>> Compilation request path propagate exception as in [2]. MDO and >>>> MethodCounter doesn't expect any exception other than metaspace >>>> OOM(added comments). >>>> >>>> Deoptimization path doesn't clear probable async exceptions and take >>>> unpack_exception path for non uncommontraps. >>>> >>>> Added java_lang_InternalError to well known classes. >>>> >>>> Request for review. >>>> >>>> Best Regards, >>>> >>>> Jamsheed >>>> >>>> [1] w.r.t changes done for JDK-7131259 >>>> >>>> [2] >>>> >>>>      (a) >>>>      -----> c1_Runtime1.cpp/interpreterRuntime.cpp/compilerRuntime.cpp >>>>        | >>>>         ----- compilationPolicy.cpp/tieredThresholdPolicy.cpp >>>>           | >>>>            ------ compileBroker.cpp >>>> >>>>      (b) >>>>      Xcomp versions >>>>      ------> compilationPolicy.cpp >>>>         | >>>>          ------> compileBroker.cpp >>>> >>>>      (c) >>>> >>>>      Direct call to compile_method in compileBroker.cpp >>>> >>>>      JVMCI bootstrap, whitebox, replayCompile.
>>>> >>>> From david.holmes at oracle.com Thu Sep 3 02:57:50 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Sep 2020 12:57:50 +1000 Subject: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code In-Reply-To: <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> Message-ID: Hi, On 3/09/2020 12:07 pm, Wang Zhuo(Zhuoren) wrote: > Hi Patric, > The original problem(https://bugs.openjdk.java.net/browse/JDK-8246051) > is architecture specific. When running TestUnsafeUnalignedSwap.java on > aarch64 platforms, JVM crashes without the fix because aarch64 does not > support unaligned compare_and_swap. On X86 platforms the crash cannot be > reproduced because X86 supports unaligned compare_and_swap. Patric is asking about the original situation where an unaligned CAS was performed, which motivated the change made by JDK-8246051. The GuardUnsafeAccess mechanism (or more specifically the signal handling tricks underneath it) was not intended for general use, but was specifically created to deal with the case of page faults related to mapped ByteBuffers where we wanted application code using JDK exported APIs to get an InternalError when they did the wrong thing with their mapped files, instead of crashing. No part of the JDK should be calling Unsafe.compareAndSwap* with unaligned data - if it does that is a bug. If third-party code is using Unsafe directly then that is their problem and we do not try to make things easier for them. The use of GuardUnsafeAccess with the CAS primitives on some platforms can result in an infinite loop as the mechanism cannot be applied to arbitrary code sequences. Somewhat ironically Aarch64 is one platform that can suffer from this.
Cheers, David ----- > > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:Patric Hedlin > Sent At:2020 Sep. 2 (Wed.) 20:22 > To:Sandler ; aarch64-port-dev > ; hotspot-runtime-dev > > Cc:david.holmes ; rahul.v.raghavan > > Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > recent unsafe memory access operation in compiled Java code > > Hi Zhuoren, > > I don't actually know what behaviour to expect from the Unsafe > atomics in this test-case but perhaps you could re-cap the original > problem (addressed in JDK-8246051) since it seems to raise some > questions. Did you have an example (real code) where this behaviour > is essential? > > Best regards, > Patric Hedlin > > (Including hotspot-runtime-dev at openjdk.java.net) > > > On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: > Hi, this is a fix for a test case. > In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will > fail because the catch misses the error due to async exception. > This patch uses a loop to make sure the error can be caught. Also > -Xcomp is added in test. > BUG: https://bugs.openjdk.java.net/browse/JDK-8249092 > Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/ > > > Regards, > Zhuoren > > From david.holmes at oracle.com Thu Sep 3 04:11:46 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Sep 2020 14:11:46 +1000 Subject: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code In-Reply-To: <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> Message-ID: On 3/09/2020 1:49 pm, Wang Zhuo(Zhuoren) wrote: > David, Thank you very much for the explanation. 
> The original crash indeed happened in a third party code using Unsafe to > handle unaligned data. > So you mean that it is the third party code's bug, we should not fix it > in JVM, right? Right. Unsafe is not a supported API and is by definition unsafe. If you use it and it crashes then you need to change your code. Cheers, David ----- > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:David Holmes > Sent At:2020 Sep. 3 (Thu.) 10:58 > To:Sandler ; Patric Hedlin > ; aarch64-port-dev > ; hotspot-runtime-dev > > Cc:rahul.v.raghavan > Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > recent unsafe memory access operation in compiled Java code > > Hi, > > On 3/09/2020 12:07 pm, Wang Zhuo(Zhuoren) wrote: > > Hi Patric, > > The original problem(https://bugs.openjdk.java.net/browse/JDK-8246051) > > is architecture specific. When running TestUnsafeUnalignedSwap.java on > > aarch64 platforms, JVM crashes without the fix because aarch64 does not > > support unaligned compare_and_swap. On X86 platforms the crash cannot be > > reproduced because X86 support unaligned compare_and_swap. > > Patric is asking about the original situation where an unaligned CAS was > performed, which motivated the change made by JDK-8246051. > > The GuardUnsafeAccess mechanism (or more specifically the signal > handling tricks underneath it) was not intended for general use, but was > specifically created to deal with the case of page faults related to > mapped ByteBuffers where we wanted application code using JDK exported > APIs to get an InternalError when they did the wrong thing with their > mapped files, instead of crashing. No part of the JDK should be calling > Unsafe.compareAndSwap* with unaligned data - if it does that is a bug. > If third-party code is using Unsafe directly then that is their problem > and we do not try to make things easier for them.
> > The use of GuardUnsafeAccess with the CAS primitives on some platforms > can result in an infinite loop as the mechanism cannot be applied to > arbitrary code sequences. Somewhat ironically Aarch64 is one platform > that can suffer from this. > > Cheers, > David > ----- > > > > > Regards, > > Zhuoren > > > >      ------------------------------------------------------------------ > >      From:Patric Hedlin > >      Sent At:2020 Sep. 2 (Wed.) 20:22 > >      To:Sandler ; aarch64-port-dev > >      ; hotspot-runtime-dev > > > >      Cc:david.holmes ; rahul.v.raghavan > > > >      Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > >      recent unsafe memory access operation in compiled Java code > > > >      Hi Zhuoren, > > > >      I don't actually know what behaviour to expect from the Unsafe > >      atomics in this test-case but perhaps you could re-cap the original > >      problem (addressed in JDK-8246051) since it seems to raise some > >      questions. Did you have an example (real code) where this behaviour > >      is essential? > > > >      Best regards, > >      Patric Hedlin > > > >      (Including hotspot-runtime-dev at openjdk.java.net) > > > > > >      On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: > >      Hi, this is a fix for a test case. > >      In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will > >      fail because the catch misses the error due to async exception. > >      This patch uses a loop to make sure the error can be caught. Also > >      -Xcomp is added in test.
> >      BUG: https://bugs.openjdk.java.net/browse/JDK-8249092 > >      Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/ > > > > > >      Regards, > >      Zhuoren > > > > > From richard.reingruber at sap.com Thu Sep 3 08:40:12 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 3 Sep 2020 08:40:12 +0000 Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check In-Reply-To: <8b5a6c50-e860-d138-8db4-ae378239bdb5@oracle.com> References: <8b5a6c50-e860-d138-8db4-ae378239bdb5@oracle.com> Message-ID: > This fix looks good to me. Thank you, David. The fix passed nightly regression testing @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms. Thanks, Richard. -----Original Message----- From: David Holmes Sent: Thursday, 3 September 2020 00:14 To: Reingruber, Richard ; Hotspot dev runtime Subject: Re: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check Hi Richard, This fix looks good to me. Thanks, David On 3/09/2020 1:15 am, Reingruber, Richard wrote: > Hi, > > please help review this fix for a race condition in > JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended > thread to continue executing java for an arbitrarily long time (see repro test > attached to bug report). > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8252521 > > The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check() > that checks if the current thread was suspended again after returning from > java_suspend_self() and before restoring the original thread state. The check is > performed after restoring the original state because then we are guaranteed to > see the suspend request issued before the requester observed that target to be > _thread_blocked and executed VM_ThreadSuspend. > > Thanks, Richard.
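The do-while re-check described in the 8252521 review request can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the HotSpot change: ToyThread, ToyState, suspend_self(), suspend_self_with_recheck(), the while_suspended hook and run_resuspend_scenario() are hypothetical names invented for illustration, the model is single-threaded, and the safepoint check is left out. It shows only the ordering the message describes: restore the original state first, then test whether a new suspend request arrived in the meantime.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the two thread states that matter here.
enum class ToyState { in_java, blocked };

// Toy model of a thread that can be asked to suspend itself.
struct ToyThread {
    std::atomic<bool> suspend_requested{false};
    ToyState state{ToyState::in_java};
    int times_suspended{0};
    // Simulates what a requester does while this thread is self-suspended,
    // e.g. resume it and possibly request suspension again.
    std::function<void(ToyThread&)> while_suspended;

    // Stand-in for "block until resumed"; here it just runs the hook.
    void suspend_self() {
        ++times_suspended;
        if (while_suspended) while_suspended(*this);
    }

    // Restore the saved state first, then re-check the request flag and
    // suspend again if a new request was issued while we looked blocked.
    void suspend_self_with_recheck() {
        const ToyState saved = state;
        do {
            state = ToyState::blocked;
            suspend_self();
            state = saved;
        } while (suspend_requested.load());
    }
};

// Scenario: the requester resumes the thread and immediately suspends it
// again during the first self-suspension, then only resumes it the second
// time. Returns how often the thread had to suspend itself.
inline int run_resuspend_scenario() {
    ToyThread t;
    t.suspend_requested = true;
    int calls = 0;
    t.while_suspended = [&calls](ToyThread& self) {
        self.suspend_requested = false;                   // resume
        if (calls++ == 0) self.suspend_requested = true;  // suspend again once
    };
    t.suspend_self_with_recheck();
    return t.times_suspended;
}
```

In this model a version without the loop would return to running code with suspend_requested still set, which corresponds to the race the review request describes.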
> From aph at redhat.com Thu Sep 3 08:41:32 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 3 Sep 2020 09:41:32 +0100 Subject: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code In-Reply-To: References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> Message-ID: <5845a59f-3790-906a-0ea4-fc0e862eb103@redhat.com> On 03/09/2020 05:11, David Holmes wrote: > On 3/09/2020 1:49 pm, Wang Zhuo(Zhuoren) wrote: >> David, Thank you very much for the explanation. >> The original crash indeed happened in a third party code using Unsafe to >> handle unaligned data. >> So you mean that it is the third party code's bug, we should not fix it >> in JVM, right? > > Right. Unsafe is not a supported API and is by definition unsafe. If you > use it and it crashes then you need to change your code. I agree. In hindsight, I should probably not have approved 8246051. I'm happy that it should be backed out. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From jamsheed.c.m at oracle.com Thu Sep 3 08:43:09 2020 From: jamsheed.c.m at oracle.com (Jamsheed C M) Date: Thu, 3 Sep 2020 14:13:09 +0530 Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions In-Reply-To: <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> References: <442caa21-ca0a-f6eb-60a5-1e74bf994894@oracle.com> <03df9364-817d-04d6-6434-80be93a66526@oracle.com> <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> Message-ID: Hi David, Thank you for the review and feedback.
Revised webrev here: http://cr.openjdk.java.net/~jcm/8249451/webrev.02/ On 03/09/2020 05:50, David Holmes wrote: > Hi Jamsheed, > > On 1/09/2020 10:36 pm, Jamsheed C M wrote: >> Hi David, >> >> I reworked the patch, revised webrev here: >> http://cr.openjdk.java.net/~jcm/8249451/webrev.01/ > > Thanks. The new macros and injected field for InternalError look good. > > A couple of minor comments below but overall this looks good to me. > >> In addition I moved UnlockFlagSaver fs(this) to more local scope. >> >> also removed changes done for JDK-8246727, as it will be separately >> handled by the bug. >> >> Testing: injected and tested async exceptions randomly at compilation >> request path and deopt path. > > I noticed in deoptimization.cpp that here: > > 1965 load_class_by_index(constants, unloaded_class_index, THREAD); > > we can now return with a pending async exception and it is unclear > whether the code following this will be able to handle that, or indeed > whether the caller will be able to handle it. Did you specifically > test this site? > Yes, I browsed through the code path; it is equipped to handle it (be it the c2 only UC, or the various deopt variants). The JVMCI aot code too is equipped to handle it (there is forwarding code present at the foreign call exit). > --- > > src/hotspot/share/jvmci/jvmciRuntime.cpp > > The comment at: > > 80 // 1. The pending exception is cleared > > should be updated now that asyncs are not cleared. Done. > > --- > > src/hotspot/share/compiler/tieredThresholdPolicy.* > > The changes from JavaThread* to Thread* look unnecessary for 90% of > the cases, but the overall change seems to be dictated by the few > methods that do use CHECK*. :( No point agonising over this now as I'm > trying to deal with this general problem as a separate RFE - JDK-8252685.
> Thank you and Best regards, Jamsheed > Thanks, > David > ----- > >> Best regards, >> >> Jamsheed >> >> On 24/08/2020 11:06, Jamsheed C M wrote: >>> Hi David, >>> >>> Thank you for the review and feedback. Agree on all of them. I will >>> rework and get back. >>> >>> On 10/08/2020 07:33, David Holmes wrote: >>>> Hi Jamsheed, >>>> >>>> On 6/08/2020 10:07 pm, Jamsheed C M wrote: >>>>> Hi all, >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249451 >>>>> >>>>> webrev: http://cr.openjdk.java.net/~jcm/8249451/webrev.00/ >>>> >>>> Thanks for tackling this messy issue. Overall I like the use of >>>> TRAPS to more clearly document which methods can return with an >>>> exception pending. I think there are some problems with the >>>> proposed changes. I'll start with those comments and then move on >>>> to more general comments. >>>> >>>> src/hotspot/share/utilities/exceptions.cpp >>>> src/hotspot/share/utilities/exceptions.hpp >>>> >>>> I don't think the changes here are correct or safe in general. >>>> >>>> First, adding the new macro and function to only clear non-async >>>> exceptions is fine itself. But naming wise the fact only non-async >>>> exceptions are cleared should be evident, and there is no "check" >>>> involved (in the sense of the existing CHECK_ macros) so I suggest: >>>> >>>> s/CHECK_CLEAR_PENDING_EXCEPTION/CLEAR_PENDING_NONASYNC_EXCEPTIONS/ >>>> s/check_clear_pending_exception/clear_pending_nonasync_exceptions/ >>>> >>> Ok >>>> But changing the existing CHECK_AND_CLEAR macros to now leave async >>>> exceptions pending seems potentially dangerous as calling code may >>>> not be prepared for there to now be a pending exception. 
For >>>> example the use in thread.cpp: >>>> >>>> JDK_Version::set_runtime_name(get_java_runtime_name(THREAD)); >>>> JDK_Version::set_runtime_version(get_java_runtime_version(THREAD)); >>>> >>>> get_java_runtime_name() is currently guaranteed to clear all >>>> exceptions, so all the other code is known to be safe to call. But >>>> that would no longer be true. That said, this is VM initialization >>>> code and an async exception is impossible at this stage. >>>> >>>> I think I would rather see CHECK_AND_CLEAR left as-is, and an >>>> actual CHECK_AND_CLEAR_NONASYNC introduced for those users of >>>> CHECK_AND_CLEAR that can encounter async exceptions and which >>>> should not clear them. >>>> >>>> +   if >>>> (!_pending_exception->is_a(SystemDictionary::ThreadDeath_klass()) && >>>> +       _pending_exception->klass() != >>>> SystemDictionary::InternalError_klass()) { >>>> >>> Ok >>>> Flagging all InternalErrors as async exceptions is probably also >>>> not correct. I don't see a good solution to this at the moment. I >>>> think we would need to introduce a new subclass of InternalError >>>> for the unsafe access error case**. Now it may be that all the >>>> other InternalError usages are "impossible" in the context of where >>>> the new macros are to be used, but that is very difficult to >>>> establish or assert. >>>> >>>> ** Or perhaps we could inject a field that allows the VM to >>>> identify instances related to unsafe access errors ... Ideally of >>>> course these unsafe access errors would be distinct from the async >>>> exception mechanism - something I would still like to pursue. >>>> >>> Ok >>>> --- >>>> >>>> General comments ... >>>> >>>> There is a general change from "JavaThread* thread" to "Thread* >>>> THREAD" (or TRAPS) to allow the use of the CHECK macros. This is >>>> unfortunate because the fact the thread is restricted to being a >>>> JavaThread is no longer evident in the method signatures.
That is a >>>> flaw with the TRAPS/CHECK mechanism unfortunately :( . But as the >>>> methods no longer take a JavaThread* arg, they should assert that >>>> THREAD->is_Java_thread(). I will also look at an RFE to have >>>> as_JavaThread() to avoid the need for separate assertion checks >>>> before casting from "Thread*" to "JavaThread*". >>>> >>> Ok >>>> Note there's no need to use CHECK when the enclosing method is >>>> going to return immediately after the call that contains the CHECK. >>>> It just adds unnecessary checking of the exception state. The use >>>> of TRAPS shows that the methods may return with an exception >>>> pending. I've flagged all such occurrences I spotted below. >>>> >>> Ok >>>> --- >>>> >>>> +   // Only metaspace OOM is expected. no Java code executed. >>>> >>>> Nit: s/no/No >>>> >>>> >>>> src/hotspot/share/compiler/compilationPolicy.cpp >>>> >>>> >>>> 410 method_invocation_event(method, CHECK_NULL); >>>> 489 CompileBroker::compile_method(m, InvocationEntryBci, >>>> comp_level, m, hot_count, CompileTask::Reason_InvocationCount, CHECK); >>>> >>>> Nit: there's no need to use CHECK here. >>>> >>>> --- >>>> >>>> src/hotspot/share/compiler/tieredThresholdPolicy.cpp >>>> >>>> 504 method_invocation_event(method, inlinee, comp_level, nm, >>>> CHECK_NULL); >>>> 570 compile(mh, bci, CompLevel_simple, CHECK); >>>> 581 compile(mh, bci, CompLevel_simple, CHECK); >>>> 595 CompileBroker::compile_method(mh, bci, level, mh, >>>> hot_count, CompileTask::Reason_Tiered, CHECK); >>>> 1062 compile(mh, InvocationEntryBci, next_level, CHECK); >>>> >>>> Nit: there's no need to use CHECK here. >>>> >>>> 814 void TieredThresholdPolicy::create_mdo(const methodHandle& mh, >>>> Thread* THREAD) { >>>> >>>> Thank you for correcting this misuse of the THREAD name on a >>>> JavaThread* type.
>>>> >>>> --- >>>> >>>> src/hotspot/share/interpreter/linkResolver.cpp >>>> >>>> 128 CompilationPolicy::compile_if_required(selected_method, CHECK); >>>> >>>> Nit: there's no need to use CHECK here. >>>> >>>> --- >>>> >>>> src/hotspot/share/jvmci/compilerRuntime.cpp >>>> >>>> 260 CompilationPolicy::policy()->event(emh, mh, >>>> InvocationEntryBci, InvocationEntryBci, CompLevel_aot, cm, CHECK); >>>> 280 nmethod* osr_nm = CompilationPolicy::policy()->event(emh, >>>> mh, branch_bci, target_bci, CompLevel_aot, cm, CHECK); >>>> >>>> Nit: there's no need to use CHECK here. >>>> >>>> --- >>>> >>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>> >>>> 102 // Donot clear probable async exceptions. >>>> >>>> typo: s/Donot/Do not/ >>>> >>>> --- >>>> >>>> src/hotspot/share/runtime/deoptimization.cpp >>>> >>>> 1686 void Deoptimization::load_class_by_index(const >>>> constantPoolHandle& constant_pool, int index) { >>>> >>>> This method should be declared with TRAPS now. >>>> >>>> 1693 // Donot clear probable Async Exceptions. >>>> >>>> typo: s/Donot/Do not/ >>>> >>>> >>> Ok >>>>> testing : mach1-5(links in jbs) >>>> >>>> There is very little existing testing that will actually test the >>>> key changes you have made here. You will need to do direct >>>> fault-injection testing anywhere you now allow async exceptions to >>>> remain, to see if the calling code can tolerate that. It will be >>>> difficult to test thoroughly. >>>> >>> Ok >>>> Thanks again for tackling this difficult problem! >>> >>> Best regards, >>> >>> Jamsheed >>> >>>> >>>> David >>>> ----- >>>> >>>>> >>>>> While working on JDK-8246381 it was noticed that compilation >>>>> request path clears all exceptions(including async) and doesn't >>>>> propagate[1]. >>>>> >>>>> Fix: patch restores the propagation behavior for the probable >>>>> async exceptions. >>>>> >>>>> Compilation >>>>> request path propagate exception as in [2].
MDO and >>>>> MethodCounter doesn't expect any exception other than metaspace >>>>> OOM(added comments). >>>>> >>>>> Deoptimization path doesn't clear probable async exceptions and >>>>> take unpack_exception path for non uncommontraps. >>>>> >>>>> Added java_lang_InternalError to well known classes. >>>>> >>>>> Request for review. >>>>> >>>>> Best Regards, >>>>> >>>>> Jamsheed >>>>> >>>>> [1] w.r.t changes done for JDK-7131259 >>>>> >>>>> [2] >>>>> >>>>> ???? (a) >>>>> ???? -----> >>>>> c1_Runtime1.cpp/interpreterRuntime.cpp/compilerRuntime.cpp >>>>> ?????? | >>>>> ??????? ----- compilationPolicy.cpp/tieredThresholdPolicy.cpp >>>>> ????????? | >>>>> ?????????? ------ compileBroker.cpp >>>>> >>>>> ???? (b) >>>>> ???? Xcomp versions >>>>> ???? ------> compilationPolicy.cpp >>>>> ??????? | >>>>> ???????? ------> compileBroker.cpp >>>>> >>>>> ???? (c) >>>>> >>>>> ???? Direct call to? compile_method in compileBroker.cpp >>>>> >>>>> ???? JVMCI bootstrap, whitebox, replayCompile. >>>>> >>>>> From jamsheed.c.m at oracle.com Thu Sep 3 09:09:12 2020 From: jamsheed.c.m at oracle.com (Jamsheed C M) Date: Thu, 3 Sep 2020 14:39:12 +0530 Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions In-Reply-To: References: <442caa21-ca0a-f6eb-60a5-1e74bf994894@oracle.com> <03df9364-817d-04d6-6434-80be93a66526@oracle.com> <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> Message-ID: <082708b2-7a8a-22e7-d35b-99dbb96d6653@oracle.com> Some edits On 03/09/2020 14:13, Jamsheed C M wrote: > Hi David, > > Thank you for the review and feedback. > > Revised webrev here: http://cr.openjdk.java.net/~jcm/8249451/webrev.02/ > > On 03/09/2020 05:50, David Holmes wrote: >> Hi Jamsheed, >> >> On 1/09/2020 10:36 pm, Jamsheed C M wrote: >>> Hi David, >>> >>> I reworked the patch, revised webrev here: >>> http://cr.openjdk.java.net/~jcm/8249451/webrev.01/ >> >> Thanks. The new macros and injected field for InternalError look good. 
>> >> A couple of minor comments below but overall this looks good to me. >> >>> In addition I moved UnlockFlagSaver fs(this) to more local scope. >>> >>> also removed changes done for JDK-8246727, as it will be separately >>> handled by the bug. >>> >>> Testing: injected and tested async exceptions randomly at >>> compilation request path and deopt path. >> >> I noticed in deoptimization.cpp that here: >> >> 1965?????? load_class_by_index(constants, unloaded_class_index, THREAD); >> >> we can now return with a pending async exception and it is unclear >> whether the code following this will be able to handle that, or >> indeed whether the caller will be able to handle it. Did you >> specifically test this site? >> > Yes, I browsed through the code path, it is equipped(let it be c2 only > UC, or various deopt variant). Below comment is for aot compilation request path. sorry for mixing up > JVMCI aot code too is equipped to handle it( there are forwarding code > present at foreign call exit) > Best regards, Jamsheed >> --- >> >> src/hotspot/share/jvmci/jvmciRuntime.cpp >> >> The comment at: >> >> ? 80 //?? 1. The pending exception is cleared >> >> should be updated now that asyncs are not cleared. > Done. >> >> --- >> >> src/hotspot/share/compiler/tieredThresholdPolicy.* >> >> The changes from JavaThread* to Thread* look unnecessary for 90% of >> the cases, but the overall change seems to be dictated by the few >> methods that do use CHECK*. :( No point agonising over this now as >> I'm trying to deal with this general problem as a separate RFE - >> JDK-8252685. >> > Thank you and Best regards, > > Jamsheed > >> Thanks, >> David >> ----- >> >>> Best regards, >>> >>> Jamsheed >>> >>> On 24/08/2020 11:06, Jamsheed C M wrote: >>>> Hi David, >>>> >>>> Thank you for the review and feedback. Agree on all of them. I will >>>> rework and get back. 
>>>> >>>> On 10/08/2020 07:33, David Holmes wrote: >>>>> Hi Jamsheed, >>>>> >>>>> On 6/08/2020 10:07 pm, Jamsheed C M wrote: >>>>>> Hi all, >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249451 >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~jcm/8249451/webrev.00/ >>>>> >>>>> Thanks for tackling this messy issue. Overall I like the use of >>>>> TRAPS to more clearly document which methods can return with an >>>>> exception pending. I think there are some problems with the >>>>> proposed changes. I'll start with those comments and then move on >>>>> to more general comments. >>>>> >>>>> src/hotspot/share/utilities/exceptions.cpp >>>>> src/hotspot/share/utilities/exceptions.hpp >>>>> >>>>> I don't think the changes here are correct or safe in general. >>>>> >>>>> First, adding the new macro and function to only clear non-async >>>>> exceptions is fine itself. But naming wise the fact only non-async >>>>> exceptions are cleared should be evident, and there is no "check" >>>>> involved (in the sense of the existing CHECK_ macros) so I suggest: >>>>> >>>>> s/CHECK_CLEAR_PENDING_EXCEPTION/CLEAR_PENDING_NONASYNC_EXCEPTIONS/ >>>>> s/check_clear_pending_exception/clear_pending_nonasync_exceptions/ >>>>> >>>> Ok >>>>> But changing the existing CHECK_AND_CLEAR macros to now leave >>>>> async exceptions pending seems potentially dangerous as calling >>>>> code may not be prepared for there to now be a pending exception. >>>>> For example the use in thread.cpp: >>>>> >>>>> ?JDK_Version::set_runtime_name(get_java_runtime_name(THREAD)); >>>>> ?JDK_Version::set_runtime_version(get_java_runtime_version(THREAD)); >>>>> >>>>> get_java_runtime_name() is currently guaranteed to clear all >>>>> exceptions, so all the other code is known to be safe to call. But >>>>> that would no longer be true. That said, this is VM initialization >>>>> code and an async exception is impossible at this stage. 
>>>>> >>>>> I think I would rather see CHECK_AND_CLEAR left as-is, and an >>>>> actual CHECK_AND_CLEAR_NONASYNC introduced for those users of >>>>> CHECK_AND_CLEAR that can encounter async exceptions and which >>>>> should not clear them. >>>>> >>>>> +?? if >>>>> (!_pending_exception->is_a(SystemDictionary::ThreadDeath_klass()) && >>>>> +?????? _pending_exception->klass() != >>>>> SystemDictionary::InternalError_klass()) { >>>>> >>>> Ok >>>>> Flagging all InternalErrors as async exceptions is probably also >>>>> not correct. I don't see a good solution to this at the moment. I >>>>> think we would need to introduce a new subclass of InternalError >>>>> for the unsafe access error case**. Now it may be that all the >>>>> other InternalError usages are "impossible" in the context of >>>>> where the new macros are to be used, but that is very difficult to >>>>> establish or assert. >>>>> >>>>> ** Or perhaps we could inject a field that allows the VM to >>>>> identify instances related to unsafe access errors ... Ideally of >>>>> course these unsafe access errors would be distinct from the async >>>>> exception mechanism - something I would still like to pursue. >>>>> >>>> Ok >>>>> --- >>>>> >>>>> General comments ... >>>>> >>>>> There is a general change from "JavaThread* thread" to "Thread* >>>>> THREAD" (or TRAPS) to allow the use of the CHECK macros. This is >>>>> unfortunate because the fact the thread is restricted to being a >>>>> JavaThread is no longer evident in the method signatures. That is >>>>> a flaw with the TRAPS/CHECK mechanism unfortunately :( . But as >>>>> the methods no longer take a JavaThread* arg, they should assert >>>>> that THREAD->is_Java_thread(). I will also look at an RFE to have >>>>> as_JavaThread() to avoid the need for separate assertion checks >>>>> before casting from "Thread*" to "JavaThread*". 
>>>>> >>>> Ok >>>>> Note there's no need to use CHECK when the enclosing method is >>>>> going to return immediately after the call that contains the >>>>> CHECK. It just adds unnecessary checking of the exception state. >>>>> The use of TRAPS shows that the methods may return with an >>>>> exception pending. I've flagged all such occurrences I spotted below. >>>>> >>>> Ok >>>>> --- >>>>> >>>>> +?? // Only metaspace OOM is expected. no Java code executed. >>>>> >>>>> Nit: s/no/No >>>>> >>>>> >>>>> src/hotspot/share/compiler/compilationPolicy.cpp >>>>> >>>>> >>>>> ?410?????? method_invocation_event(method, CHECK_NULL); >>>>> ?489?????? CompileBroker::compile_method(m, InvocationEntryBci, >>>>> comp_level, m, hot_count, CompileTask::Reason_InvocationCount, >>>>> CHECK); >>>>> >>>>> Nit: there's no need to use CHECK here. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/compiler/tieredThresholdPolicy.cpp >>>>> >>>>> ?504???? method_invocation_event(method, inlinee, comp_level, nm, >>>>> CHECK_NULL); >>>>> ?570???????? compile(mh, bci, CompLevel_simple, CHECK); >>>>> ?581???????? compile(mh, bci, CompLevel_simple, CHECK); >>>>> ?595???? CompileBroker::compile_method(mh, bci, level, mh, >>>>> hot_count, CompileTask::Reason_Tiered, CHECK); >>>>> 1062?????? compile(mh, InvocationEntryBci, next_level, CHECK); >>>>> >>>>> Nit: there's no need to use CHECK here. >>>>> >>>>> 814 void TieredThresholdPolicy::create_mdo(const methodHandle& mh, >>>>> Thread* THREAD) { >>>>> >>>>> Thank you for correcting this misuse of the THREAD name on a >>>>> JavaThread* type. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/interpreter/linkResolver.cpp >>>>> >>>>> ?128 CompilationPolicy::compile_if_required(selected_method, CHECK); >>>>> >>>>> Nit: there's no need to use CHECK here. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/jvmci/compilerRuntime.cpp >>>>> >>>>> ?260???? 
CompilationPolicy::policy()->event(emh, mh, >>>>> InvocationEntryBci, InvocationEntryBci, CompLevel_aot, cm, CHECK); >>>>> ?280???? nmethod* osr_nm = CompilationPolicy::policy()->event(emh, >>>>> mh, branch_bci, target_bci, CompLevel_aot, cm, CHECK); >>>>> >>>>> Nit: there's no need to use CHECK here. >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/jvmci/jvmciRuntime.cpp >>>>> >>>>> ?102???????? // Donot clear probable async exceptions. >>>>> >>>>> typo: s/Donot/Do not/ >>>>> >>>>> --- >>>>> >>>>> src/hotspot/share/runtime/deoptimization.cpp >>>>> >>>>> 1686 void Deoptimization::load_class_by_index(const >>>>> constantPoolHandle& constant_pool, int index) { >>>>> >>>>> This method should be declared with TRAPS now. >>>>> >>>>> 1693???? // Donot clear probable Async Exceptions. >>>>> >>>>> typo: s/Donot/Do not/ >>>>> >>>>> >>>> Ok >>>>>> testing : mach1-5(links in jbs) >>>>> >>>>> There is very little existing testing that will actually test the >>>>> key changes you have made here. You will need to do direct >>>>> fault-injection testing anywhere you now allow async exceptions to >>>>> remain, to see if the calling code can tolerate that. It will be >>>>> difficult to test thoroughly. >>>>> >>>> Ok >>>>> Thanks again for tackling this difficult problem! >>>> >>>> Best regards, >>>> >>>> Jamsheed >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> >>>>>> While working on JDK-8246381 it was noticed that compilation >>>>>> request path clears all exceptions(including async) and doesn't >>>>>> propagate[1]. >>>>>> >>>>>> Fix: patch restores the propagation behavior for the probable >>>>>> async exceptions. >>>>>> >>>>>> Compilation request path propagate exception as in [2]. MDO and >>>>>> MethodCounter doesn't expect any exception other than metaspace >>>>>> OOM(added comments). >>>>>> >>>>>> Deoptimization path doesn't clear probable async exceptions and >>>>>> take unpack_exception path for non uncommontraps. 
>>>>>>
>>>>>> Added java_lang_InternalError to well known classes.
>>>>>>
>>>>>> Request for review.
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> [1] w.r.t changes done for JDK-7131259
>>>>>>
>>>>>> [2]
>>>>>>
>>>>>>      (a)
>>>>>>      ----->
>>>>>> c1_Runtime1.cpp/interpreterRuntime.cpp/compilerRuntime.cpp
>>>>>>        |
>>>>>>         ----- compilationPolicy.cpp/tieredThresholdPolicy.cpp
>>>>>>           |
>>>>>>            ------ compileBroker.cpp
>>>>>>
>>>>>>      (b)
>>>>>>      Xcomp versions
>>>>>>      ------> compilationPolicy.cpp
>>>>>>         |
>>>>>>          ------> compileBroker.cpp
>>>>>>
>>>>>>      (c)
>>>>>>
>>>>>>      Direct call to compile_method in compileBroker.cpp
>>>>>>
>>>>>>      JVMCI bootstrap, whitebox, replayCompile.
>>>>>>
>>>>>>
From sgehwolf at redhat.com Thu Sep 3 09:12:54 2020
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 03 Sep 2020 11:12:54 +0200
Subject: RFR: 8252359 - HotSpot Not Identifying it is Running in a Container
In-Reply-To: <114257D0-5401-4E8C-BD09-87F109D74C5D@oracle.com>
References: <114257D0-5401-4E8C-BD09-87F109D74C5D@oracle.com>
Message-ID:

Hi Bob,

On Wed, 2020-09-02 at 13:54 -0400, Bob Vandette wrote:
> Problem:
>
> Hotspot does not properly detect that it's running in a container on Mac and Windows docker desktop
> based containers.
>
>
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8252359
>
> WEBREV:
> http://cr.openjdk.java.net/~bobv/8252359/webrev.01

This patch looks good to me.

Comments:

test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java

With the adapted changes in there it fails prior to the fix in cgroupSubsystem_linux.cpp and passes after. That's good enough a regression fix for me.

I've also tested it on cgroups v1 (hybrid) and cgroups v2. Works fine.
Thanks, Severin From thomas.stuefe at gmail.com Thu Sep 3 09:19:46 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 3 Sep 2020 11:19:46 +0200 Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace In-Reply-To: References: Message-ID: Hi Richard, Thanks for your review! Please note that code has changed since the first webrev. While that is not a problem - most of your findings are still valid - a large number or renamings took place (see Leo's and Coleen's mails). I will use the new naming throughout my answers (e.g. maximal_word_size -> MaxWordSize). Hope that is not too confusing; if in doubt pls look at the HEAD of the sandbox repo, or just wait until the new webrev is done. All remarks inline: ====== > > You added includes of memory/metaspace.hpp to some files without additional > modifications. How did you chose these files? > > To jvmciCompilerToVM.hpp you added an include of collectedHeap.hpp. Why > did you > do this? > metaspace.hpp is needed for either one of MetaspaceType or MetadataType enum. jvmciCompilerToVM.hpp needs a definition or a forward declaration for CollectedHeap. > ====== src/hotspot/share/memory/metaspace.cpp > > 194 // ... and class spacelist. > 195 VirtualSpaceList* vsl = VirtualSpaceList::vslist_nonclass(); > 196 assert(vsl != NULL, "Sanity"); > 197 vsl->verify(slow); > > In 195 VirtualSpaceList::vslist_class() should be called I suppose. > > Good catch. > You could reuse the local vsl as you did with the local cm. > Any reason you assert vsl not NULL in 196 but not in the non-class case? > > A bit inconsistent, yes. I will remove all these asserts. They are not really useful (if any of those were NULL we would crash in a very obvious manner). > 637 // CCS must be aligned to root chunk size, and be at least the > size of one > 638 // root chunk. 
> 639 adjusted_ccs_size = align_up(adjusted_ccs_size, > reserve_alignment()); > 640 adjusted_ccs_size = MAX2(adjusted_ccs_size, reserve_alignment()); > > Line 640 is redundant, isn't it? > Well, adjusted_ccs_size could have been 0 to begin with. In the greater context I think it cannot be 0 since CompressedClassSpaceSize cannot be set to zero. But that is non-obvious, so I prefer to leave this code in. > > @@ -1274,7 +798,7 @@ > assert(loader_data != NULL, "Should never pass around a NULL > loader_data. " > "ClassLoaderData::the_null_class_loader_data() should have been > used."); > > - MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : > NonClassType; > + Metaspace::MetadataType mdtype = (type == MetaspaceObj::ClassType) ? > Metaspace::ClassType : Metaspace::NonClassType; > > // Try to allocate metadata. > MetaWord* result = > loader_data->metaspace_non_null()->allocate(word_size, mdtype); > > This hunk is not needed. > > Ok > ====== src/hotspot/share/memory/metaspace/binlist.hpp > > 94 // block sizes this structure can keep are limited by > [_min_block_size, _max_block_size) > 95 const static size_t minimal_word_size = smallest_size; > 96 const static size_t maximal_word_size = minimal_word_size + > num_lists; > > _min_block_size/_max_block_size should be > minimal_word_size/maximal_word_size. > > The upper limit 'maximal_word_size' should be inclusive IMHO: > > const static size_t maximal_word_size = minimal_word_size + num_lists - > 1; > > That would better match the meaning of the variable name. Checks in l.148 > and > l.162 should be adapted in case you agree. > > Leo and Coleen wanted this too. The new naming will be consistent and follow hotspot naming rules: (Min|Max)WordSize. 43 // We store node pointer information in these blocks when storing > them. That > 44 // imposes a minimum size to the managed memory blocks. > 45 // See MetaspaceArene::get_raw_allocation_word_size(). 
> > > s/MetaspaceArene::get_raw_allocation_word_size/metaspace::get_raw_word_size_for_requested_word_size/ > > I agree with the comment, but > metaspace::get_raw_word_size_for_requested_word_size() does not seem to > take > this into account. > It does, it uses FreeBlocks::MinWordSize. FreeBlocks consists of a tree and a bin list. The tree is only responsible for larger blocks (larger than what would fit into the bin list). Therefore the lower limit is only determined by the bin list minimum word size. Since this may be not obvious, I'll beef up the comment somewhat. > 86 // blocks with the same size are put in a list with this node > as head. > 89 // word size of node. Note that size cannot be larger than max > metaspace size, > 115 // given a node n, add it to the list starting at head > 123 // given a node list starting at head, remove one node from it > and return it. > > You should begin a sentence consistently with a capital letter (you mostly > do it). > > 123 // given a node list starting at head, remove one node from it > and return it. > 124 // List must contain at least one other node. > 125 static node_t* remove_from_list(node_t* head) { > 126 assert(head->next != NULL, "sanity"); > 127 node_t* n = head->next; > 128 if (n != NULL) { > 129 head->next = n->next; > 130 } > 131 return n; > 132 } > > Line 129 must be executed unconditionally. > Good catch. > I'd prefer a more generic implementation that allows head->next to be > NULL. Maybe even head == NULL. > > I don't think that would be much clearer though. We probably could move remove_from_list() up into remove_block(), but for symmetry reasons this would have to be done for add_to_list() too, and I rather like it this way. > 215 // Given a node n and a node forebear, insert n under forebear > 216 void insert(node_t* forebear, node_t* n) { 217 if (n->size == forebear->size) { > 218 add_to_list(n, forebear); // parent stays NULL in this case. 
> 219 } else { > 220 if (n->size < forebear->size) { > 221 if (forebear->left == NULL) { > 222 set_left_child(forebear, n); > 223 } else { > 224 insert(forebear->left, n); > 225 } > 226 } else { > 227 assert(n->size > forebear->size, "sanity"); > 228 if (forebear->right == NULL) { > 229 set_right_child(forebear, n); > 230 if (_largest_size_added < n->size) { > 231 _largest_size_added = n->size; > 232 } > 233 } else { > 234 insert(forebear->right, n); > 235 } > 236 } > 237 } > 238 } > > This assertion in line 227 is redundant (cannot fail). > That is true for many asserts I add. I use asserts liberally as guard against bit rot and as documentation. I guess this here could be considered superfluous since the setup code is right above. I will remove that assert. > > Leo> There are at least two recursive calls of insert that could be > Leo> tail-called instead (it would be somewhat harder to read, so I am > not > Leo> proposing it). > > I think they _are_ tail-recursions in the current form. > Gcc eliminates them. I checked the release build with gdb: > (disass /s metaspace::FreeBlocks::add_block) > > Recursive tail-calls can be easily replaced with loops. To be save I'd > suggest > to do that or at least add 'return' after each call with a comment that > nothing > must be added between the call and the return too keep it a > tail-recursion. Maybe that's sufficient... on the other hand we don't know > if > every C++ compiler can eliminate the calls and stack overflows when > debugging > would be also irritating. > > 251 return find_closest_fit(n->right, s); > 260 return find_closest_fit(n->left, s); > > More tail-recursion. Same as above. > > I'll rewrite BlockTree to not use recursion. > 257 assert(n->size > s, "Sanity"); > > Assertion is redundant. > > 262 // n is the best fit. > 263 return n; > > In the following example it is not, is it? > > N1:40 > / > / > N2:20 > \ > \ > N3:30 > > find_closest_fit(N1, 30) will return N2 but N3 is the closest fit. 
I think > you > have to search the left tree for a better fit independently of the size of > its > root node. > > Good catch. > 293 if (n->left == NULL && n->right == NULL) { > 294 replace_node_in_parent(n, NULL); > 295 > 296 } else if (n->left == NULL && n->right != NULL) { > 297 replace_node_in_parent(n, n->right); > 298 > 299 } else if (n->left != NULL && n->right == NULL) { > 300 replace_node_in_parent(n, n->left); > 301 > 302 } else { > > Can be simplified to: > > if (n->left == NULL) { > replace_node_in_parent(n, n->right); > } else if (n->right == NULL) { > replace_node_in_parent(n, n->left); > } else { > > Yes, but I'd rather leave the code as it is; I think it's easier to read that way. > 341 // The right child of the successor (if there was one) > replaces the successor at its parent's left child. > > Please add a line break. > > The comments and assertions in remove_node_from_tree() helped to > understand the > logic. Thanks! > > :) > ====== src/hotspot/share/memory/metaspace/blocktree.cpp > > 40 // These asserts prints the tree, then asserts > 41 #define assrt(cond, format, ...) \ > 42 if (!(cond)) { \ > 43 print_tree(tty); \ > 44 assert(cond, format, __VA_ARGS__); \ > 45 } > 46 > 47 // This assert prints the tree, then stops (generic message) > 48 #define assrt0(cond) \ > 49 if (!(cond)) { \ > 50 print_tree(tty); \ > 51 assert(cond, "sanity"); \ > 52 } > > Better wrap into do-while(0) (see definition of vmassert) > Ok. > > 110 verify_node(n->left, left_limit, n->size, vd, lvl + 1); > > Recursive call that isn't a tail call. Prone to stack overflow. Well I > guess you > need a stack to traverse a tree. GrowableArray is a common choice if you > want to > eliminate this recursion. As it is only verification code you might as well > leave it and interpret stack overflow as verification failure. > 118 verify_node(n->right, n->size, right_limit, vd, lvl + 1); > > Tail-recursion can be easily eliminated. See comments on blocktree.hpp > above. 
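The two review points above about the tree search (the closest-fit result in the N1/N2/N3 example, and replacing tail recursion with a loop) can be sketched together as follows. This is only an illustrative model and not the BlockTree code from the webrev: the `Node` struct is a hypothetical stand-in for `node_t`, and the function reuses the name `find_closest_fit` purely for readability.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for the tree node discussed in the review.
struct Node {
    std::size_t size;
    Node* left;   // subtree with smaller sizes
    Node* right;  // subtree with larger sizes
};

// Returns the node with the smallest size >= s, or nullptr if none exists.
// Written as a loop, so it does not depend on the compiler eliminating
// tail calls.
inline Node* find_closest_fit(Node* root, std::size_t s) {
    Node* best = nullptr;
    Node* n = root;
    while (n != nullptr) {
        if (n->size >= s) {
            best = n;              // n fits; a tighter fit can only be to the left
            if (n->size == s) {
                break;             // exact fit, cannot do better
            }
            n = n->left;
        } else {
            n = n->right;          // n is too small; only larger nodes can fit
        }
    }
    return best;
}
```

With the tree from the example (N1:40 with left child N2:20, which has right child N3:30), a request for 30 walks N1, N2, N3 and ends at N3, because every node that fits is remembered as a candidate before the search continues into its left subtree.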
> I'll rewrite this to be non-recursive. ====== src/hotspot/share/memory/metaspace/chunkManager.cpp > > The slow parameter in ChunkManager::verify*() is not used. > > I'll remove all "slow" params from all verifications and recode this to use -XX:VerifyMetaspaceInterval. I think that is easier to use. > ====== src/hotspot/share/memory/metaspace/counter.hpp > > 104 void decrement() { > 105 #ifdef ASSERT > 106 T old = Atomic::load_acquire(&_c); > 107 assert(old >= 1, > 108 "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > 109 #endif > 110 Atomic::dec(&_c); > 111 } > > I think you could use Atomic::add() which returns the old value and make > the assert atomic too: > void decrement() { > T old = Atomic::add(&_c, T(-1)); > #ifdef ASSERT > assert(old >= 1, > "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > #endif > } > > Same for increment(), increment_by(), decrement_by(), ... > > Thought so too but Atomic::add seems to return the new value, not the old, see e.g. atomic_linux_x86.hpp: struct Atomic::PlatformAdd { .... D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) const { return fetch_and_add(dest, add_value, order) + add_value; } }; I also checked the callers, those few places I found which do anything meaningful with the result seem to expect the new value. See e.g. G1BuildCandidateArray::claim_chunk(). It is annoying that these APIs are not documented. I thought this is because these APIs are obvious to everyone but me but looks like they are not. > ====== src/hotspot/share/memory/metaspace/metaspaceArena.cpp > > There's too much vertical white space, I'd think. > > metaspace::get_raw_allocation_word_size() is a duplicate of > metaspace::get_raw_word_size_for_requested_word_size() > metaspace::get_raw_allocation_word_size() is only referenced in comments > and > should be removed. > > Oops. Sure, I'll remove that. byte_size should also depend on BlockTree::minimal_word_size I think. 
> Something like > > if (worde_size > FreeBlocks::maximal_word_size) > byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord); > > FreeBlocks::maximal_word_size needs to be defined for this. > > See above. In addition to what I wrote, BlockTree is an implementation detail of FreeBlocks, so it should not matter here. Thanks Richard. I will work in your feedback and publish a new webrev shortly. Cheers, Thomas From robbin.ehn at oracle.com Thu Sep 3 10:02:49 2020 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 3 Sep 2020 12:02:49 +0200 Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check In-Reply-To: References: Message-ID: Looks good, ship it! Thanks for fixing! /Robbin On 2020-09-02 17:15, Reingruber, Richard wrote: > Hi, > > please help review this fix for a race condition in > JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended > thread to continue executing java for an arbitrary long time (see repro test > attached to bug report). > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8252521 > > The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check() > that checks if the current thread was suspended again after returning from > java_suspend_self() and before restoring the original thread state. The check is > performed after restoring the original state because then we are guaranteed to > see the suspend request issued before the requester observed that target to be > _thread_blocked and executed VM_ThreadSuspend. > > Thanks, Richard. 
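The shape of the fix described in the quoted mail (loop, restore the original state, then re-check for a new suspend request) can be sketched with a small single-threaded model. Everything here is an assumption made for illustration: `ModelThread`, `State`, `suspend_self()` and the `window_hook` callback are hypothetical and only mimic the structure of the description; they are not the HotSpot implementation.

```cpp
#include <atomic>
#include <functional>

// Hypothetical two-state model of a thread's state.
enum class State { InJava, Blocked };

struct ModelThread {
    std::atomic<State> state{State::InJava};
    std::atomic<bool> suspend_requested{false};
    int times_suspended = 0;

    // Stand-in for java_suspend_self(): the real code would block until
    // resumed; the model only records the visit and clears the request.
    void suspend_self() {
        ++times_suspended;
        suspend_requested.store(false);
    }

    // window_hook runs in the window between suspend_self() returning and
    // the original state being restored, which is where a second suspend
    // request could otherwise go unnoticed.
    void suspend_self_with_recheck(const std::function<void()>& window_hook) {
        const State saved = state.load();
        do {
            state.store(State::Blocked);
            suspend_self();
            window_hook();
            state.store(saved);                // restore the original state first
        } while (suspend_requested.load());    // then check for a new request
    }
};
```

If the hook sets `suspend_requested` once, the loop runs a second time instead of letting the thread continue in its original state, which is the behaviour the mail asks for.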
> From jamsheed.c.m at oracle.com Thu Sep 3 10:08:56 2020 From: jamsheed.c.m at oracle.com (Jamsheed C M) Date: Thu, 3 Sep 2020 15:38:56 +0530 Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions In-Reply-To: <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> References: <442caa21-ca0a-f6eb-60a5-1e74bf994894@oracle.com> <03df9364-817d-04d6-6434-80be93a66526@oracle.com> <1cc00e89-467e-f976-56a7-630f8782d61a@oracle.com> Message-ID: Hi David, On 03/09/2020 05:50, David Holmes wrote: > > we can now return with a pending async exception and it is unclear > whether the code following this will be able to handle that, or indeed > whether the caller will be able to handle it. Did you specifically > test this site? Yes i specifically tested this site for C2 UC trap case. It works fine. Best regards, Jamsheed From yumin.qi at oracle.com Thu Sep 3 15:36:28 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Thu, 3 Sep 2020 08:36:28 -0700 Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code Message-ID: Hi, Please review bug: https://bugs.openjdk.java.net/browse/JDK-8252725 webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/ Summary: The work is part of 8247536, which supports archive pre-generated java.lang.invoke classes in CDS. In this patch (thanks to Mandy): 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added to GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm respectively; 2. Move log file parsing work to java.lang.InvokeJLIClassesHelper; 3. Clean up interface APIs since old APIs no longer used with the moving; 4. New API JavaLangInvokeAccess::generateHolderClassesreturns a map of class name, which in internal form as key rather than the jimage entry point, vs class bytes. This makes both JLI and CDS can use the new interface easily. 
CDS will add a new function (in 8247536 patch, only for convenience for converting the map to array) to GenerateJLIClassesHelper to call the new added interface API (generateHolderClasses)to regenerate holder classes during dump time. Thanks Yumin From richard.reingruber at sap.com Thu Sep 3 15:37:13 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 3 Sep 2020 15:37:13 +0000 Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check In-Reply-To: References: Message-ID: Thanks for reviewing. I'll ship it tomorrow. Cheers, Richard. -----Original Message----- From: Robbin Ehn Sent: Donnerstag, 3. September 2020 12:03 To: Reingruber, Richard ; Hotspot dev runtime Subject: Re: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check Looks good, ship it! Thanks for fixing! /Robbin On 2020-09-02 17:15, Reingruber, Richard wrote: > Hi, > > please help review this fix for a race condition in > JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended > thread to continue executing java for an arbitrary long time (see repro test > attached to bug report). > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8252521 > > The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check() > that checks if the current thread was suspended again after returning from > java_suspend_self() and before restoring the original thread state. The check is > performed after restoring the original state because then we are guaranteed to > see the suspend request issued before the requester observed that target to be > _thread_blocked and executed VM_ThreadSuspend. > > Thanks, Richard. 
> From yumin.qi at oracle.com Thu Sep 3 15:41:05 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Thu, 3 Sep 2020 08:41:05 -0700 Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code In-Reply-To: References: Message-ID: <34eeb2ba-4f8a-37ea-861f-e946ae1fd22c@oracle.com> Sorry push "send" too soon: Tests: local build. mach5 tier1-4 have 2 timeouts on build which I think related to lab move(?). Thanks Yumin On 9/3/20 8:36 AM, Yumin Qi wrote: > Hi, Please review > > > bug: https://bugs.openjdk.java.net/browse/JDK-8252725 > > webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/ > > > Summary: The work is part of 8247536, which supports archive pre-generated java.lang.invoke classes in CDS. In this patch (thanks to Mandy): > > 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added to GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm respectively; > > 2. Move log file parsing work to java.lang.InvokeJLIClassesHelper; > > 3. Clean up interface APIs since old APIs no longer used with the moving; > > 4. New API JavaLangInvokeAccess::generateHolderClassesreturns a map of class name, which in internal form as key rather than the jimage entry point, vs class bytes. > > This makes both JLI and CDS can use the new interface easily. CDS will add a new function (in 8247536 patch, only for convenience for converting the map to array) to GenerateJLIClassesHelper to call the new added interface API (generateHolderClasses)to regenerate holder classes during dump time. 
>
>
> Thanks
>
> Yumin
>
>
From mandy.chung at oracle.com Thu Sep 3 16:13:39 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 3 Sep 2020 09:13:39 -0700
Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code
In-Reply-To:
References:
Message-ID: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com>

On 9/3/20 8:36 AM, Yumin Qi wrote:
> Hi, Please review
>
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8252725
>
> webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/
>

Looks good to me. Sundar should also review it.

A few things to mention compared to the proposed patch from 8247536: we no longer log the error case for LF_RESOLVE since it's ignored anyway. As the code is moved to java.lang.invoke, we also clean up the code to use constants and methods defined in LambdaForm and MethodTypeForm and BasicType (rather than duplicating such definitions).

Mandy

>
> Summary: The work is part of 8247536, which supports archive
> pre-generated java.lang.invoke classes in CDS. In this patch (thanks
> to Mandy):
>
> 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added to
> GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm
> respectively;
>
> 2. Move log file parsing work to java.lang.InvokeJLIClassesHelper;
>
> 3. Clean up interface APIs since old APIs no longer used with the moving;
>
> 4. New API JavaLangInvokeAccess::generateHolderClasses returns a map of
> class name, which in internal form as key rather than the jimage entry
> point, vs class bytes.
>
> This makes both JLI and CDS can use the new interface easily. CDS will
> add a new function (in 8247536 patch, only for convenience for
> converting the map to array) to GenerateJLIClassesHelper to call the
> new added interface API (generateHolderClasses) to regenerate holder
> classes during dump time.
> > > Thanks > > Yumin > > From richard.reingruber at sap.com Thu Sep 3 16:56:42 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 3 Sep 2020 16:56:42 +0000 Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace In-Reply-To: References: Message-ID: Hi Thomas, > Thanks for your review! Welcome! :) > > ====== src/hotspot/share/memory/metaspace/counter.hpp > > > > 104 void decrement() { > > 105 #ifdef ASSERT > > 106 T old = Atomic::load_acquire(&_c); > > 107 assert(old >= 1, > > 108 "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > > 109 #endif > > 110 Atomic::dec(&_c); > > 111 } > > > > I think you could use Atomic::add() which returns the old value and make > > the assert atomic too: > > > void decrement() { > > T old = Atomic::add(&_c, T(-1)); > > #ifdef ASSERT > > assert(old >= 1, > > "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > > #endif > > } > > > > Same for increment(), increment_by(), decrement_by(), ... > > > > > Thought so too but Atomic::add seems to return the new value, not the old, > see e.g. atomic_linux_x86.hpp: > struct Atomic::PlatformAdd { > .... > D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) > const { > return fetch_and_add(dest, add_value, order) + add_value; > } > }; > I also checked the callers, those few places I found which do anything > meaningful with the result seem to expect the new value. See > e.g. G1BuildCandidateArray::claim_chunk(). > It is annoying that these APIs are not documented. I thought this is > because these APIs are obvious to everyone but me but looks like they are > not. I agree that a little bit of documentation would help. Well if it is the new value that is returned, you could assert like this: void decrement() { T new_val = Atomic::add(&_c, T(-1)); #ifdef ASSERT assert(new_val != T(-1), "underflow (0-1)"); #endif } I'm not insisting but out of experience I do know that imprecise asserts can be a pain. 
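The point about making the check and the update a single atomic step can be tried outside HotSpot. The following is only a sketch under stated assumptions: std::atomic stands in for HotSpot's Atomic class, AtomicCounter and decrement_checked() are hypothetical names, and the function returns a bool instead of asserting so the underflow case can be observed. Unlike what the thread found for Atomic::add, std::atomic's fetch_sub returns the value held before the subtraction, so the "old >= 1" form of the check applies here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the counter discussed above. std::atomic replaces
// HotSpot's Atomic class so that the sketch compiles on its own.
template <typename T>
class AtomicCounter {
  std::atomic<T> _c;

public:
  AtomicCounter() : _c(0) {}

  T get() const { return _c.load(std::memory_order_relaxed); }

  void increment() { _c.fetch_add(1, std::memory_order_relaxed); }

  // fetch_sub returns the value held immediately before the subtraction, so
  // the underflow check uses exactly the value this thread decremented; there
  // is no separate load that another thread could invalidate in between.
  // Returns false if the counter was already 0 (the subtraction has still
  // happened and, for unsigned T, wrapped around).
  bool decrement_checked() {
    T old = _c.fetch_sub(1, std::memory_order_relaxed);
    return old >= 1;
  }
};
```

In a debug build the returned flag would typically feed an assert; it is returned here only to keep the sketch testable.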
Note also that Atomic::load_acquire(&_c) does not help to get the most recent value of _c. Acquire only prevents subsequent accesses from being reordered before the load_acquire(). In other words: the state that is observed subsequently is at least as recent as the state observed with the load_acquire(), but it can be stale. > > byte_size should also depend on BlockTree::minimal_word_size I think. > > Something like > > > > if (worde_size > FreeBlocks::maximal_word_size) > > byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord); > > > > FreeBlocks::maximal_word_size needs to be defined for this. > > > > > See above. In addition to what I wrote, BlockTree is an implementation > detail of FreeBlocks, so it should not matter here. You could assert (statically) somewhere that FreeBlocks::maximal_word_size >= BlockTree::minimal_word_size. But I'm not insisting. Thanks, Richard. From: Thomas Stüfe Sent: Thursday, 3 September 2020 11:20 To: Reingruber, Richard Cc: Hotspot dev runtime ; Hotspot-Gc-Dev Subject: Re: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace Hi Richard, Thanks for your review! Please note that code has changed since the first webrev. While that is not a problem - most of your findings are still valid - a large number of renamings took place (see Leo's and Coleen's mails). I will use the new naming throughout my answers (e.g. maximal_word_size -> MaxWordSize). Hope that is not too confusing; if in doubt pls look at the HEAD of the sandbox repo, or just wait until the new webrev is done. All remarks inline: ====== You added includes of memory/metaspace.hpp to some files without additional modifications. How did you choose these files? To jvmciCompilerToVM.hpp you added an include of collectedHeap.hpp. Why did you do this? metaspace.hpp is needed for either one of MetaspaceType or MetadataType enum. jvmciCompilerToVM.hpp needs a definition or a forward declaration for CollectedHeap.
====== src/hotspot/share/memory/metaspace.cpp 194 // ... and class spacelist. 195 VirtualSpaceList* vsl = VirtualSpaceList::vslist_nonclass(); 196 assert(vsl != NULL, "Sanity"); 197 vsl->verify(slow); In 195 VirtualSpaceList::vslist_class() should be called I suppose. Good catch. You could reuse the local vsl as you did with the local cm. Any reason you assert vsl not NULL in 196 but not in the non-class case? A bit inconsistent, yes. I will remove all these asserts. They are not really useful (if any of those were NULL we would crash in a very obvious manner). 637 // CCS must be aligned to root chunk size, and be at least the size of one 638 // root chunk. 639 adjusted_ccs_size = align_up(adjusted_ccs_size, reserve_alignment()); 640 adjusted_ccs_size = MAX2(adjusted_ccs_size, reserve_alignment()); Line 640 is redundant, isn't it? Well, adjusted_ccs_size could have been 0 to begin with. In the greater context I think it cannot be 0 since CompressedClassSpaceSize cannot be set to zero. But that is non-obvious, so I prefer to leave this code in. @@ -1274,7 +798,7 @@ assert(loader_data != NULL, "Should never pass around a NULL loader_data. " "ClassLoaderData::the_null_class_loader_data() should have been used."); - MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : NonClassType; + Metaspace::MetadataType mdtype = (type == MetaspaceObj::ClassType) ? Metaspace::ClassType : Metaspace::NonClassType; // Try to allocate metadata. MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype); This hunk is not needed. Ok ====== src/hotspot/share/memory/metaspace/binlist.hpp 94 // block sizes this structure can keep are limited by [_min_block_size, _max_block_size) 95 const static size_t minimal_word_size = smallest_size; 96 const static size_t maximal_word_size = minimal_word_size + num_lists; _min_block_size/_max_block_size should be minimal_word_size/maximal_word_size. 
The upper limit 'maximal_word_size' should be inclusive IMHO: const static size_t maximal_word_size = minimal_word_size + num_lists - 1; That would better match the meaning of the variable name. Checks in l.148 and l.162 should be adapted in case you agree. Leo and Coleen wanted this too. The new naming will be consistent and follow hotspot naming rules: (Min|Max)WordSize. 43 // We store node pointer information in these blocks when storing them. That 44 // imposes a minimum size to the managed memory blocks. 45 // See MetaspaceArene::get_raw_allocation_word_size(). s/MetaspaceArene::get_raw_allocation_word_size/metaspace::get_raw_word_size_for_requested_word_size/ I agree with the comment, but metaspace::get_raw_word_size_for_requested_word_size() does not seem to take this into account. It does, it uses FreeBlocks::MinWordSize. FreeBlocks consists of a tree and a bin list. The tree is only responsible for larger blocks (larger than what would fit into the bin list). Therefore the lower limit is only determined by the bin list minimum word size. Since this may be not obvious, I'll beef up the comment somewhat. 86 // blocks with the same size are put in a list with this node as head. 89 // word size of node. Note that size cannot be larger than max metaspace size, 115 // given a node n, add it to the list starting at head 123 // given a node list starting at head, remove one node from it and return it. You should begin a sentence consistently with a capital letter (you mostly do it). 123 // given a node list starting at head, remove one node from it and return it. 124 // List must contain at least one other node. 125 static node_t* remove_from_list(node_t* head) { 126 assert(head->next != NULL, "sanity"); 127 node_t* n = head->next; 128 if (n != NULL) { 129 head->next = n->next; 130 } 131 return n; 132 } Line 129 must be executed unconditionally. Good catch. I'd prefer a more generic implementation that allows head->next to be NULL. Maybe even head == NULL. 
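For illustration, a more generic remove_from_list() along the lines suggested above might look like the following. This is a minimal sketch, not the code from the webrev: the node_t here is a hypothetical two-field struct (the real one in blocktree.hpp carries more), and returning nullptr for an empty list is an assumption about the desired behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal node type, reduced to the fields the list needs.
struct node_t {
  node_t* next;
  size_t  size;
};

// Unlinks and returns the node following head. Returns nullptr if head is
// nullptr or if head has no successor, instead of asserting.
static node_t* remove_from_list(node_t* head) {
  if (head == nullptr || head->next == nullptr) {
    return nullptr;
  }
  node_t* n = head->next;
  head->next = n->next;  // always relink once n is known to exist
  n->next = nullptr;     // detach the removed node from the list
  return n;
}
```

The relinking is no longer guarded by a second NULL check, which addresses the remark about line 129.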
I don't think that would be much clearer though. We probably could move remove_from_list() up into remove_block(), but for symmetry reasons this would have to be done for add_to_list() too, and I rather like it this way. 215 // Given a node n and a node forebear, insert n under forebear 216 void insert(node_t* forebear, node_t* n) { 217 if (n->size == forebear->size) { 218 add_to_list(n, forebear); // parent stays NULL in this case. 219 } else { 220 if (n->size < forebear->size) { 221 if (forebear->left == NULL) { 222 set_left_child(forebear, n); 223 } else { 224 insert(forebear->left, n); 225 } 226 } else { 227 assert(n->size > forebear->size, "sanity"); 228 if (forebear->right == NULL) { 229 set_right_child(forebear, n); 230 if (_largest_size_added < n->size) { 231 _largest_size_added = n->size; 232 } 233 } else { 234 insert(forebear->right, n); 235 } 236 } 237 } 238 } This assertion in line 227 is redundant (cannot fail). That is true for many asserts I add. I use asserts liberally as guard against bit rot and as documentation. I guess this here could be considered superfluous since the setup code is right above. I will remove that assert. Leo> There are at least two recursive calls of insert that could be Leo> tail-called instead (it would be somewhat harder to read, so I am not Leo> proposing it). I think they _are_ tail-recursions in the current form. Gcc eliminates them. I checked the release build with gdb: (disass /s metaspace::FreeBlocks::add_block) Recursive tail-calls can be easily replaced with loops. To be save I'd suggest to do that or at least add 'return' after each call with a comment that nothing must be added between the call and the return too keep it a tail-recursion. Maybe that's sufficient... on the other hand we don't know if every C++ compiler can eliminate the calls and stack overflows when debugging would be also irritating. 251 return find_closest_fit(n->right, s); 260 return find_closest_fit(n->left, s); More tail-recursion. 
Same as above. I'll rewrite BlockTree to not use recursion. 257 assert(n->size > s, "Sanity"); Assertion is redundant. 262 // n is the best fit. 263 return n; In the following example it is not, is it? N1:40 / / N2:20 \ \ N3:30 find_closest_fit(N1, 30) will return N2 but N3 is the closest fit. I think you have to search the left tree for a better fit independently of the size of its root node. Good catch. 293 if (n->left == NULL && n->right == NULL) { 294 replace_node_in_parent(n, NULL); 295 296 } else if (n->left == NULL && n->right != NULL) { 297 replace_node_in_parent(n, n->right); 298 299 } else if (n->left != NULL && n->right == NULL) { 300 replace_node_in_parent(n, n->left); 301 302 } else { Can be simplified to: if (n->left == NULL) { replace_node_in_parent(n, n->right); } else if (n->right == NULL) { replace_node_in_parent(n, n->left); } else { Yes, but I'd rather leave the code as it is; I think it's easier to read that way. 341 // The right child of the successor (if there was one) replaces the successor at its parent's left child. Please add a line break. The comments and assertions in remove_node_from_tree() helped to understand the logic. Thanks! :) ====== src/hotspot/share/memory/metaspace/blocktree.cpp 40 // These asserts prints the tree, then asserts 41 #define assrt(cond, format, ...) \ 42 if (!(cond)) { \ 43 print_tree(tty); \ 44 assert(cond, format, __VA_ARGS__); \ 45 } 46 47 // This assert prints the tree, then stops (generic message) 48 #define assrt0(cond) \ 49 if (!(cond)) { \ 50 print_tree(tty); \ 51 assert(cond, "sanity"); \ 52 } Better wrap into do-while(0) (see definition of vmassert) Ok. 110 verify_node(n->left, left_limit, n->size, vd, lvl + 1); Recursive call that isn't a tail call. Prone to stack overflow. Well I guess you need a stack to traverse a tree. GrowableArray is a common choice if you want to eliminate this recursion. 
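A recursion-free verification pass with an explicit stack could be sketched as follows. The assumptions are loud ones: std::vector stands in for GrowableArray, node_t is a hypothetical reduced node, verify_tree() and Frame are made-up names, and the limits are treated as exclusive bounds, mirroring the (left_limit, n->size) and (n->size, right_limit) arguments of the recursive calls quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced tree node; the real one also has parent and list links.
struct node_t {
  node_t* left;
  node_t* right;
  size_t  size;
};

// One pending unit of work: a node plus the exclusive size bounds it must obey.
struct Frame {
  const node_t* n;
  size_t lo;
  size_t hi;
};

// Checks the search-tree ordering without recursion. Returns true if every
// node's size lies strictly between the bounds inherited from its ancestors.
bool verify_tree(const node_t* root, size_t lo, size_t hi) {
  std::vector<Frame> stack;  // explicit stack instead of the call stack
  if (root != nullptr) {
    stack.push_back({root, lo, hi});
  }
  while (!stack.empty()) {
    Frame f = stack.back();
    stack.pop_back();
    if (f.n->size <= f.lo || f.n->size >= f.hi) {
      return false;
    }
    if (f.n->left != nullptr) {
      stack.push_back({f.n->left, f.lo, f.n->size});   // left: below n->size
    }
    if (f.n->right != nullptr) {
      stack.push_back({f.n->right, f.n->size, f.hi});  // right: above n->size
    }
  }
  return true;
}
```

Stack depth is then bounded by heap memory rather than by the thread stack, at the cost of a small allocation per verification run.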
As it is only verification code you might as well leave it and interpret stack overflow as verification failure. 118 verify_node(n->right, n->size, right_limit, vd, lvl + 1); Tail-recursion can be easily eliminated. See comments on blocktree.hpp above. I'll rewrite this to be non-recursive. ====== src/hotspot/share/memory/metaspace/chunkManager.cpp The slow parameter in ChunkManager::verify*() is not used. I'll remove all "slow" params from all verifications and recode this to use -XX:VerifyMetaspaceInterval. I think that is easier to use. ====== src/hotspot/share/memory/metaspace/counter.hpp 104 void decrement() { 105 #ifdef ASSERT 106 T old = Atomic::load_acquire(&_c); 107 assert(old >= 1, 108 "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); 109 #endif 110 Atomic::dec(&_c); 111 } I think you could use Atomic::add() which returns the old value and make the assert atomic too: void decrement() { T old = Atomic::add(&_c, T(-1)); #ifdef ASSERT assert(old >= 1, "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); #endif } Same for increment(), increment_by(), decrement_by(), ... Thought so too but Atomic::add seems to return the new value, not the old, see e.g. atomic_linux_x86.hpp: struct Atomic::PlatformAdd { .... D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) const { return fetch_and_add(dest, add_value, order) + add_value; } }; I also checked the callers, those few places I found which do anything meaningful with the result seem to expect the new value. See e.g. G1BuildCandidateArray::claim_chunk(). It is annoying that these APIs are not documented. I thought this is because these APIs are obvious to everyone but me but looks like they are not. ====== src/hotspot/share/memory/metaspace/metaspaceArena.cpp There's too much vertical white space, I'd think. 
metaspace::get_raw_allocation_word_size() is a duplicate of metaspace::get_raw_word_size_for_requested_word_size() metaspace::get_raw_allocation_word_size() is only referenced in comments and should be removed. Oops. Sure, I'll remove that. byte_size should also depend on BlockTree::minimal_word_size I think. Something like if (worde_size > FreeBlocks::maximal_word_size) byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord); FreeBlocks::maximal_word_size needs to be defined for this. See above. In addition to what I wrote, BlockTree is an implementation detail of FreeBlocks, so it should not matter here. Thanks Richard. I will work in your feedback and publish a new webrev shortly. Cheers, Thomas From thomas.stuefe at gmail.com Thu Sep 3 17:06:32 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 3 Sep 2020 19:06:32 +0200 Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace In-Reply-To: References: Message-ID: On Thu, Sep 3, 2020 at 6:56 PM Reingruber, Richard < richard.reingruber at sap.com> wrote: > Hi Thomas, > > > > > Thanks for your review! > > > > Welcome! :) > > > > > > ====== src/hotspot/share/memory/metaspace/counter.hpp > > > > > > > > 104 void decrement() { > > > > 105 #ifdef ASSERT > > > > 106 T old = Atomic::load_acquire(&_c); > > > > 107 assert(old >= 1, > > > > 108 "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > > > > 109 #endif > > > > 110 Atomic::dec(&_c); > > > > 111 } > > > > > > > > I think you could use Atomic::add() which returns the old value and > make > > > > the assert atomic too: > > > > > > > void decrement() { > > > > T old = Atomic::add(&_c, T(-1)); > > > > #ifdef ASSERT > > > > assert(old >= 1, > > > > "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > > > > #endif > > > > } > > > > > > > > Same for increment(), increment_by(), decrement_by(), ... > > > > > > > > > > > Thought so too but Atomic::add seems to return the new value, not the > old, > > > see e.g. 
atomic_linux_x86.hpp: > > > > > struct Atomic::PlatformAdd { > > > .... > > > D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order > order) > > > const { > > > return fetch_and_add(dest, add_value, order) + add_value; > > > } > > > }; > > > > > I also checked the callers, those few places I found which do anything > > > meaningful with the result seem to expect the new value. See > > > e.g. G1BuildCandidateArray::claim_chunk(). > > > > > It is annoying that these APIs are not documented. I thought this is > > > because these APIs are obvious to everyone but me but looks like they are > > > not. > > > > I agree that a little bit of documentation would help. Well if it is the > new > > value that is returned, you could assert like this: > > > > void decrement() { > > T new_val = Atomic::add(&_c, T(-1)); > > #ifdef ASSERT > > assert(new_val != T(-1), "underflow (0-1)"); > > #endif > > } > > > > I'm not insisting but out of experience I do know that imprecise asserts > can be > > a pain. Note also that Atomic::load_acquire(&_c) does not help to get the > most > > recent value of _c. Acquire only prevents subsequent accesses to be > reordered > > with the load_acquire(). In other words: the state that is observed > subsequently > > is at least as recent as the state observed with the load_acquire() but it > can > > be stale. > I'll think about this. You may be right. I originally wanted to use the return value of A::add but shied away because of the bad documentation. Do you think it behaves the same way on all platforms? Well I guess it does since the using code I found is shared. > > > > > byte_size should also depend on BlockTree::minimal_word_size I think. > > > > Something like > > > > > > > > if (worde_size > FreeBlocks::maximal_word_size) > > > > byte_size = MAX2(byte_size, BlockTree::minimal_word_size * > BytesPerWord); > > > > > > > > FreeBlocks::maximal_word_size needs to be defined for this. > > > > > > > > > > > See above. 
In addition to what I wrote, BlockTree is an implementation > > > detail of FreeBlocks, so it should not matter here. > > > > You could assert (statically) somewhere that FreeBlocks::maximal_word_size > >= BlockTree::minimal_word_size. > > But I'm not insisting. > > > Oh sure, I can do that. > Thanks, > > Richard. > > > Thanks, Richard. I'm currently preparing a new webrev and running tests; these changes will have to wait until after that. Cheers, Thomas > *From:* Thomas Stüfe > *Sent:* Thursday, 3 September 2020 11:20 > *To:* Reingruber, Richard > *Cc:* Hotspot dev runtime ; > Hotspot-Gc-Dev > *Subject:* Re: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace > > > > Hi Richard, > > > > Thanks for your review! > > > > Please note that code has changed since the first webrev. While that is > not a problem - most of your findings are still valid - a large number of > renamings took place (see Leo's and Coleen's mails). > > > > I will use the new naming throughout my answers (e.g. maximal_word_size -> > MaxWordSize). Hope that is not too confusing; if in doubt pls look at the > HEAD of the sandbox repo, or just wait until the new webrev is done. > > > > All remarks inline: > > > > > > > > ====== > > You added includes of memory/metaspace.hpp to some files without additional > modifications. How did you choose these files? > > To jvmciCompilerToVM.hpp you added an include of collectedHeap.hpp. Why > did you > do this? > > > > metaspace.hpp is needed for either one of MetaspaceType or MetadataType > enum. jvmciCompilerToVM.hpp needs a definition or a forward declaration for > CollectedHeap. > > > > > ====== src/hotspot/share/memory/metaspace.cpp > > 194 // ... and class spacelist. > 195 VirtualSpaceList* vsl = VirtualSpaceList::vslist_nonclass(); > 196 assert(vsl != NULL, "Sanity"); > 197 vsl->verify(slow); > > In 195 VirtualSpaceList::vslist_class() should be called I suppose. > > > > Good catch.
> > > > You could reuse the local vsl as you did with the local cm. > Any reason you assert vsl not NULL in 196 but not in the non-class case? > > > > A bit inconsistent, yes. I will remove all these asserts. They are not > really useful (if any of those were NULL we would crash in a very obvious > manner). > > > > 637 // CCS must be aligned to root chunk size, and be at least the > size of one > 638 // root chunk. > 639 adjusted_ccs_size = align_up(adjusted_ccs_size, > reserve_alignment()); > 640 adjusted_ccs_size = MAX2(adjusted_ccs_size, reserve_alignment()); > > Line 640 is redundant, isn't it? > > > > Well, adjusted_ccs_size could have been 0 to begin with. In the greater > context I think it cannot be 0 since CompressedClassSpaceSize cannot be set > to zero. But that is non-obvious, so I prefer to leave this code in. > > > > > @@ -1274,7 +798,7 @@ > assert(loader_data != NULL, "Should never pass around a NULL > loader_data. " > "ClassLoaderData::the_null_class_loader_data() should have been > used."); > > - MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : > NonClassType; > + Metaspace::MetadataType mdtype = (type == MetaspaceObj::ClassType) ? > Metaspace::ClassType : Metaspace::NonClassType; > > // Try to allocate metadata. > MetaWord* result = > loader_data->metaspace_non_null()->allocate(word_size, mdtype); > > This hunk is not needed. > > > > Ok > > > > ====== src/hotspot/share/memory/metaspace/binlist.hpp > > 94 // block sizes this structure can keep are limited by > [_min_block_size, _max_block_size) > 95 const static size_t minimal_word_size = smallest_size; > 96 const static size_t maximal_word_size = minimal_word_size + > num_lists; > > _min_block_size/_max_block_size should be > minimal_word_size/maximal_word_size. > > The upper limit 'maximal_word_size' should be inclusive IMHO: > > const static size_t maximal_word_size = minimal_word_size + num_lists - > 1; > > That would better match the meaning of the variable name. 
Checks in l.148 > and > l.162 should be adapted in case you agree. > > > > Leo and Coleen wanted this too. The new naming will be consistent and > follow hotspot naming rules: (Min|Max)WordSize. > > > > > > > > 43 // We store node pointer information in these blocks when storing > them. That > 44 // imposes a minimum size to the managed memory blocks. > 45 // See MetaspaceArene::get_raw_allocation_word_size(). > > > s/MetaspaceArene::get_raw_allocation_word_size/metaspace::get_raw_word_size_for_requested_word_size/ > > I agree with the comment, but > metaspace::get_raw_word_size_for_requested_word_size() does not seem to > take > this into account. > > > > It does, it uses FreeBlocks::MinWordSize. FreeBlocks consists of a tree > and a bin list. The tree is only responsible for larger blocks (larger than > what would fit into the bin list). Therefore the lower limit is only > determined by the bin list minimum word size. > > > > Since this may be not obvious, I'll beef up the comment somewhat. > > > > > 86 // blocks with the same size are put in a list with this node > as head. > 89 // word size of node. Note that size cannot be larger than max > metaspace size, > 115 // given a node n, add it to the list starting at head > 123 // given a node list starting at head, remove one node from it > and return it. > > You should begin a sentence consistently with a capital letter (you mostly > do it). > > 123 // given a node list starting at head, remove one node from it > and return it. > 124 // List must contain at least one other node. > 125 static node_t* remove_from_list(node_t* head) { > 126 assert(head->next != NULL, "sanity"); > 127 node_t* n = head->next; > 128 if (n != NULL) { > 129 head->next = n->next; > 130 } > 131 return n; > 132 } > > Line 129 must be executed unconditionally. > > > > Good catch. > > > > > I'd prefer a more generic implementation that allows head->next to be > NULL. Maybe even head == NULL. 
> > > > I don't think that would be much clearer though. We probably could move > remove_from_list() up into remove_block(), but for symmetry reasons this > would have to be done for add_to_list() too, and I rather like it this way. > > > > > > 215 // Given a node n and a node forebear, insert n under forebear > 216 void insert(node_t* forebear, node_t* n) { > > 217 if (n->size == forebear->size) { > 218 add_to_list(n, forebear); // parent stays NULL in this case. > 219 } else { > 220 if (n->size < forebear->size) { > 221 if (forebear->left == NULL) { > 222 set_left_child(forebear, n); > 223 } else { > 224 insert(forebear->left, n); > 225 } > 226 } else { > 227 assert(n->size > forebear->size, "sanity"); > 228 if (forebear->right == NULL) { > 229 set_right_child(forebear, n); > 230 if (_largest_size_added < n->size) { > 231 _largest_size_added = n->size; > 232 } > 233 } else { > 234 insert(forebear->right, n); > 235 } > 236 } > 237 } > 238 } > > This assertion in line 227 is redundant (cannot fail). > > > > That is true for many asserts I add. I use asserts liberally as guard > against bit rot and as documentation. I guess this here could be considered > superfluous since the setup code is right above. I will remove that assert. > > > > > Leo> There are at least two recursive calls of insert that could be > Leo> tail-called instead (it would be somewhat harder to read, so I am > not > Leo> proposing it). > > I think they _are_ tail-recursions in the current form. > Gcc eliminates them. I checked the release build with gdb: > (disass /s metaspace::FreeBlocks::add_block) > > Recursive tail-calls can be easily replaced with loops. To be save I'd > suggest > to do that or at least add 'return' after each call with a comment that > nothing > must be added between the call and the return too keep it a > tail-recursion. Maybe that's sufficient... 
on the other hand we don't know > if > every C++ compiler can eliminate the calls and stack overflows when > debugging > would be also irritating. > > 251 return find_closest_fit(n->right, s); > 260 return find_closest_fit(n->left, s); > > More tail-recursion. Same as above. > > > > I'll rewrite BlockTree to not use recursion. > > > > 257 assert(n->size > s, "Sanity"); > > Assertion is redundant. > > 262 // n is the best fit. > 263 return n; > > In the following example it is not, is it? > > N1:40 > / > / > N2:20 > \ > \ > N3:30 > > find_closest_fit(N1, 30) will return N2 but N3 is the closest fit. I think > you > have to search the left tree for a better fit independently of the size of > its > root node. > > > > Good catch. > > > > 293 if (n->left == NULL && n->right == NULL) { > 294 replace_node_in_parent(n, NULL); > 295 > 296 } else if (n->left == NULL && n->right != NULL) { > 297 replace_node_in_parent(n, n->right); > 298 > 299 } else if (n->left != NULL && n->right == NULL) { > 300 replace_node_in_parent(n, n->left); > 301 > 302 } else { > > Can be simplified to: > > if (n->left == NULL) { > replace_node_in_parent(n, n->right); > } else if (n->right == NULL) { > replace_node_in_parent(n, n->left); > } else { > > > > Yes, but I'd rather leave the code as it is; I think it's easier to read > that way. > > > > 341 // The right child of the successor (if there was one) > replaces the successor at its parent's left child. > > Please add a line break. > > The comments and assertions in remove_node_from_tree() helped to > understand the > logic. Thanks! > > > > :) > > > > ====== src/hotspot/share/memory/metaspace/blocktree.cpp > > 40 // These asserts prints the tree, then asserts > 41 #define assrt(cond, format, ...) 
\ > 42 if (!(cond)) { \ > 43 print_tree(tty); \ > 44 assert(cond, format, __VA_ARGS__); \ > 45 } > 46 > 47 // This assert prints the tree, then stops (generic message) > 48 #define assrt0(cond) \ > 49 if (!(cond)) { \ > 50 print_tree(tty); \ > 51 assert(cond, "sanity"); \ > 52 } > > Better wrap into do-while(0) (see definition of vmassert) > > > > Ok. > > > > > > > 110 verify_node(n->left, left_limit, n->size, vd, lvl + 1); > > Recursive call that isn't a tail call. Prone to stack overflow. Well I > guess you > need a stack to traverse a tree. GrowableArray is a common choice if you > want to > eliminate this recursion. As it is only verification code you might as well > leave it and interpret stack overflow as verification failure. > > > 118 verify_node(n->right, n->size, right_limit, vd, lvl + 1); > > Tail-recursion can be easily eliminated. See comments on blocktree.hpp > above. > > > > I'll rewrite this to be non-recursive. > > > > ====== src/hotspot/share/memory/metaspace/chunkManager.cpp > > The slow parameter in ChunkManager::verify*() is not used. > > > > I'll remove all "slow" params from all verifications and recode this to > use -XX:VerifyMetaspaceInterval. I think that is easier to use. > > > > ====== src/hotspot/share/memory/metaspace/counter.hpp > > 104 void decrement() { > 105 #ifdef ASSERT > 106 T old = Atomic::load_acquire(&_c); > 107 assert(old >= 1, > 108 "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > 109 #endif > 110 Atomic::dec(&_c); > 111 } > > I think you could use Atomic::add() which returns the old value and make > the assert atomic too: > > void decrement() { > T old = Atomic::add(&_c, T(-1)); > #ifdef ASSERT > assert(old >= 1, > "underflow (" UINT64_FORMAT "-1)", (uint64_t)old); > #endif > } > > Same for increment(), increment_by(), decrement_by(), ... > > > > Thought so too but Atomic::add seems to return the new value, not the old, > see e.g. atomic_linux_x86.hpp: > > > > struct Atomic::PlatformAdd { > .... 
> D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order > order) const { > return fetch_and_add(dest, add_value, order) + add_value; > } > }; > > > > I also checked the callers, those few places I found which do anything > meaningful with the result seem to expect the new value. See > e.g. G1BuildCandidateArray::claim_chunk(). > > > > It is annoying that these APIs are not documented. I thought this is > because these APIs are obvious to everyone but me but looks like they are > not. > > > > ====== src/hotspot/share/memory/metaspace/metaspaceArena.cpp > > There's too much vertical white space, I'd think. > > metaspace::get_raw_allocation_word_size() is a duplicate of > metaspace::get_raw_word_size_for_requested_word_size() > metaspace::get_raw_allocation_word_size() is only referenced in comments > and > should be removed. > > > > Oops. Sure, I'll remove that. > > > > > > > > byte_size should also depend on BlockTree::minimal_word_size I think. > Something like > > if (worde_size > FreeBlocks::maximal_word_size) > byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord); > > FreeBlocks::maximal_word_size needs to be defined for this. > > > > See above. In addition to what I wrote, BlockTree is an implementation > detail of FreeBlocks, so it should not matter here. > > > > Thanks Richard. > > > > I will work in your feedback and publish a new webrev shortly. > > > > Cheers, Thomas > > > From lois.foltan at oracle.com Thu Sep 3 18:45:45 2020 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 3 Sep 2020 14:45:45 -0400 Subject: RFR(L) 8244778 Archive full module graph in CDS In-Reply-To: References: <9e6b0043-65a1-dd97-a3d2-33679c8048d4@oracle.com> <234856fa-1eb1-8514-6786-bce6689afd16@oracle.com> Message-ID: On 8/31/2020 11:31 AM, Ioi Lam wrote: > Hi Lois, > > Thanks for the review. Your comments has led me to discover a couple > of pretty serious issues, which hopefully I have fixed. 
Also the code > is cleaner now, with much fewer control-flow changes than the last > webrev. > > http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/ > > http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03.delta/ > Hi Ioi, Looks very good, thanks for making those changes. One minor comment here then another comment below when you discuss the addition of check_cds_restrictions(). Minor nit in moduleEntry.cpp & packageEntry.cpp when dealing with the ModuleEntry's reads list and a PackageEntry's exports list. The names of the methods to write and read those arrays is somewhat confusing. ModuleEntry::write_archived_entry_array ModuleEntry::read_archived_entry_array At first I thought you were reading/writing an array of archived entries, not the array within an archived entry itself. I was trying to think of a better name. Please consider adding a comment at line #400 & line #417 ahead of those methods in moduleEntry.cpp to indicate that they are used for both reading/writing a ModuleEntry's reads list and a PackageEntry's exports list. > > Please see my comments in-line. > > On 8/25/20 7:58 AM, Lois Foltan wrote: >> >> Hi Ioi, >> >> Changes looks really good. Comments interspersed below. >> >> Thanks, >> Lois >> >> On 8/12/2020 6:06 PM, Ioi Lam wrote: >>> Hi Lois, >>> >>> Thanks for the comments. I have an updated webrev >>> >>> http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02/ >>> >>> http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02.delta/ >>> >>> >>> Here are the general notes on the changes. Replies to your questions >>> are in-line below. >>> >>> (1) Integrated updates in the Java code from Alan Bateman. Here are >>> Alan's >>> notes: >>> >>> The archive of the boot layer is as before, except that >>> archiving is >>> skipped if there are split packages or incubator modules. >>> Incubating >>>
modules aren't resolved by default so we shouldn't have them in the >>> ??? boot layer by default anyway. >>> >>> ??? I've dropped the module finders from the boot layer archive. I've >>> ??? left the IllegalAccessLogger.Builder in the acrhive for now (even >>> ??? though it is not the boot layer). We should be able to remove that >>> ??? once the JEP to disallow illegal access by default is in. >>> >>> ??? Related is that I don't like the archive for the module graph >>> ??? (ArchivedModuleGraph, pre-dates this RFE) including the set of >>> ??? packages to export/open for illegal access as they aren't part >>> ??? of the module graph. I've left it for now but we can also remove >>> ??? that once we disallow illegal access by default (as those sets >>> ??? will be empty). >>> >>> ??? The archive of built-in class loaders is now in one object >>> ??? jdk.internal.loader.ArchivedClassLoaders which I think will be a >>> ??? bit more maintainable. I've also drop the ucp field from the >>> ??? AppClassLoader as the changes to BuiltinClassLoader means is no >>> ??? longer needs to duplicated. >>> >>> ??? There's one remaining issue in ModuleBootstrap.boot where it has >>> fix >>> ??? an app class loader value (ModuleLayer.CLV). Ideally the >>> initialization >>> ??? of the built-in class loaders would do this but we are kinda forced >>> ??? to separate the archiving of the built-in class loaders from the >>> boot >>> ??? layer. I might look at this again some time. >>> >>> >>> (2) Moved code from classLoaderData.cpp -> classLoaderSharedData.cpp >>> >>> (3) Reverted unnecessary changes in JavaClasses::compute_offset >>> >>> (4) Minor clean up to use QuickSort::sort() instead of qsort() >>> >>> (5) Moved the C-side of module initialization for platform/system >>> ? ? loaders to inside java.lang.System::initPhase2(), so this happens >>> ??? at the same time as without full module archiving. >>> >>> (6) Moved the use of Module_lock to so all modules in a class loader >>> ??? 
are restored atomically. See ArchivedClassLoaderData::restore() >>> >>> ??? This fixed a bug where test/jdk/com/sun/jdi/ModulesTest.java >>> ? ? would fail as it sees a partially restored set of modules. >>> >>> >>> >>> On 8/7/20 12:06 PM, Lois Foltan wrote: >>>> Hi Ioi, >>>> >>>> Overall looks promising.? I have some review comments below, but >>>> not a complete review.? I concentrated on the JVM side primarily >>>> how the archived module graph is read in, how the ModuleEntry and >>>> PackageEntry tables are created from the archive for the 3 builtin >>>> loaders and potential timing issues during start up. >>>> >>>> On 7/22/2020 3:36 PM, Ioi Lam wrote: >>>>> https://bugs.openjdk.java.net/browse/JDK-8244778 >>>>> http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v01/ >>>>> >>>>> >>>>> Please review this patch that stores the full module graph in the CDS >>>>> archive heap. This reduces the initialization time of the basic >>>>> JVM by >>>>> about 22%: >>>>> >>>>> $ perf stat -r 100 bin/java -version >>>>> before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%) >>>>> after:? 55,835,815 instructions 0.03109 secs elapsed (+- 0.65%) >>>>> >>>>> [1] Start with ModuleBootstrap.java. The current implementation is >>>>> ??? quite restrictive: the archived module graph is used only when no >>>>> ??? module options are specified. >>>>> >>>>> ??? See ModuleBootstrap.mayUseArchivedBootLayer(). >>>>> >>>>> ??? We can probably support options such as main module and module >>>>> path >>>>> ??? in a future RFE. >>>> >>>> Yes, I noticed the restrictions.? The JBS issue discusses the >>>> possibility of using the dumped module graph in the event that the >>>> value of the options are the same.? I think for this implementation >>>> to be viable and have a positive impact you should consider >>>> comparing at least the options --add-modules, --add-exports and >>>> --add-reads. >>> >>> I agree. 
I want to keep the changes minimal in this RFE, and will >>> add more support for other use cases in follow-on RFEs. Instead of >>> requiring these options to be unspecified, we can relax the >>> restriction such that these options must be the same between archive >>> dump time and run time. >> >> Sounds good! >> >>> >>>> >>>> >>>>> >>>>> [2] In the current JDK implementation, there is no single object >>>>> ??? that represents "the module graph". Most of the information >>>>> ??? is stored in the archive bootLayer object, but a few additional >>>>> ??? restoration operations need to be performed: >>>>> >>>>> ??? + See ModuleBootstrap.getArchivedBootLayer() >>>>> ??? + Some static fields need to be archived/restored in >>>>> ????? Module.java, BuiltinClassLoader.java, ClassLoaders.java >>>>> ????? and BootLoader.java >>>>> >>>>> [3] I ran into a complication with two loader instances of >>>>> ??? PlatformClassLoader and AppClassLoader. They are stored in >>>>> ??? multiple tables inside the module graph (e.g., >>>>> ??? BuiltinClassLoader$LoadedModule) so I cannot easily recreate >>>>> ??? them at runtime. >>>>> >>>>> ??? However, these two loaders contain information specific to the >>>>> ??? dump time VM lifecycle (such as the classes that were loaded >>>>> ??? during CDS dumping) that need to be scrubbed. I couldn't find an >>>>> ??? elegant way of doing this, so I added a private >>>>> "resetArchivedStates" >>>>> ??? method to a few classes. They are called inside >>>>> ??? HeapShared::reset_archived_object_states(). >>>>> >>>>> [4] Related native data structures (PackageEntry and ModuleEntry) >>>>> ??? are also archived. Start with classLoaderData.cpp >>>>> >>>>> Passes mach5 tiers 1-4. I will test with additional tiers. >>>>> >>>>> Thanks >>>>> - Ioi >>>> >>>> classfile/classLoader.cpp >>>> - line #1644 pulling the javabase_moduleEntry() check outside of >>>> the Module_lock maybe problematic if after you check it another >>>> ? 
thread initializes in the time between the check and the >>>> obtaining of the Module_lock on line #1645. >>> >>> Fixed. >> >> That looks good.? I think it is fine that you are checking if >> java.base is defined via a call to javabase_moduleEntry() because you >> are just trying to determine if a ModuleEntry should be created or >> not.? In most cases though using ModuleEntryTable::javabase_defined() >> is the correct way to ensure that both the ModuleEntry for java.base >> has been created and that the ModuleEntry has been injected in the >> corresponding java.lang.Module oop. >> >>> >>>> >>>> classfile/classLoader.hpp >>>> - somewhere in ArchivedClassLoaderData there should be an assert to >>>> make sure that the CLD, whose package entries and module entries >>>> are being archived is not a "_has_class_mirror_holder" CLD for a >>>> weakly defined hidden class.? Those dedicated CLDs should never >>>> have package entries or module entries. >>>> >>> >>> I added ArchivedClassLoaderData::assert_valid() >>> >>>> classfile/moduleEntry.cpp >>>> - line #400, typo "conver" --> "convert" >>>> - line #500, maybe sort if n is greater than 1 instead of 0 (same >>>> comment for packageEntry.cpp line #270) >>>> >>> Fixed >>> >>>> classfile/systemDictionary.cpp >>>> - It looks like the PackageEntry and ModuleEntry tables for the >>>> system and platform class loaders are? added within >>>> SystemDictionary::compute_java_loaders() which is called from >>>> Thread::create_vm() but after the call in Thread::create_vm() to >>>> call_initPhase2 which is when Universe::_module_initialized is set >>>> to true.? Did I read this correctly?? If yes, then I think you have >>>> to be sure that Universe::_module_initialized is not set until the >>>> module graph for the 3 builtin loaders is loaded from the archive. >>>> >>> >>> I moved the code to be called by ModuleBootstrap::boot() so it >>> should now happen inside call_initPhase2. >> >> Yes, that looks good. 
>> >>> >>>> memory/filemap.cpp >>>> - line #237 typo "modue" --> "module" >>>> >>> >>> Fixed >>> >>>> - Since the CDS support for capturing the module graph does not >>>> archive unnamed modules, I assume >>>> Modules::set_bootloader_unnamed_module() is still called. Is this >>>> true or does the unnamed module for the boot loader get archived? >>> The unnamed module for the boot loader is not archived. >>> >>> Modules::set_bootloader_unnamed_module()? wasn't called in my last >>> webrev. Thanks for catching it. >>> >>> I added a call to BootLoader.getUnnamedModule() in >>> ModuleBootstrap::boot()? to trigger of BootLoader, which >>> will call into the VM for Modules::set_bootloader_unnamed_module(). >> >> Looks good. >> >>> >>> >>> >>>> Clarification/timing questions: >>>> >>> >>> Here's an overall problem I am facing: >>> >>> The module graph is archived after the module system has fully >>> started up. This means that for the boot loader, we only know the >>> full set of modules/packages, but we don't know which part is the >>> subset that was initialized during early VM bootstraping (e.g., when >>> ModuleEntryTable::javabase_defined() == false). >>> >>> So the behavior is a bit different during the early bootstrap phase >>> (all the way before we reach ModuleBootstrap::boot()): >>> ClassLoaderData::the_null_class_loader_data()->modules() and >>> ClassLoaderData::the_null_class_loader_data()->packages() are >>> already populated before a single class is loaded. >> >> If this is the case then, at the point when a ModuleEntry is created >> for java.base using the archived module graph, there should be a >> assertion that java_lang_Class::_fixup_module_field_list is NULL. To >> confirm no class has been loaded before java.base is defined. Maybe >> in ClassLoaderDataShared::serialize where you restore the boot >> loader's archived modules? > > Thanks for pointing this out. 
It turned out that in the last webrev > (v02), I had a bug where the module of the primitive classes were not > initialized. Now I have changed the initialization code to behave the > same whether archived full module graph is used or not. The only > differences are: > > [1] ClassLoader::create_javabase(): > ModuleEntryTable::javabase_moduleEntry() may be inited by CDS. > > [2] When archived full module graph is used, > ModuleEntryTable::patch_javabase_entries is called from > Modules::define_archived_modules. > (java_lang_Class::_fixup_module_field_list is used within this call.) > > I also added a new test case: cds/PrimitiveClassMirrors.java Good test, I'm glad you added that! > >> >>> This difference doesn't seem to make a practical difference. E.g., >>> none of our code seems to assume that "before any classes in the >>> java.util package is loaded, >>> ClassLoaderData::the_null_class_loader_data()->packages() must not >>> contain an entry for java.util". >>> >>> I think we have two choices when the archived module graph is used: >>> >>> [1] We require that the state of the module system must, at every >>> step of VM initialization, be identical to that of a VM that doesn't >>> use an archived module graph. >>> >>> [2] We make sure that the VM/JDK bootstrap code can tolerate the >>> existence of module/packages that are added earlier than a VM that >>> doesn't use an archived module graph. >>> >>> I tried doing a version of [1] and found that to be too difficult. >>> [2] seems much simpler and is the approach I am using now. >> >> I think [2] is reasonable. >> >>> >>>> oops/instanceKlass.cpp >>>> - line #2545, comment "TO DO? -- point it to the archived >>>> PackageEntry (JDK-8249262)" >>>> are you thinking that since the module graph is read in ahead of >>>> time that it can be set when an InstanceKlass is created? There is >>>> a point before java.base is defined that InstanceKlass::set_package >>>> takes into account that could be a timing issue. 
>>>> >>>> >>> >>> I think it will work as if another class in the same package has >>> already been defined. >>> >>>> - There are some checks in modules.cpp that are valuable to have >>>> during bootstrapping.? For example, packages that are encountered >>>> during bootstrapping before java.base is defined are put into the >>>> java.base module but then verified after java.base is defined via >>>> verify_javabase_packages.? So at what point in the bootstrapping >>>> process does the boot loader's archived module's become known? >>>> Could classes have been defined to the VM before this point? Then >>>> their packages will have to be verified to be sure that they are >>>> indeed packages within java.base.? Consider looking at other checks >>>> that occur within classfile/modules.cpp as well. >>>> >>> As mentioned above, calling verify_javabase_packages() at run time >>> will fail, as we have loaded all packages for the boot loader, not >>> just those for java.base. >> >> I think not calling verify_javabase_packages works because as you >> stated above no classes have been loaded before java.base is defined >> which is not the same situation as running without the archived >> module graph. >> >> A couple of additional comments: >> >> - ModuleEntry's reads list and PackageEntry's exports list.? We had >> hoped eventually to change these lists from being a c heap allocated >> GrowableArray over to a ResourceHashTable for faster lookup.? Doing >> that as a separate RFE might help CDS archiving not to have to >> archive those lists as an Array? >> > > CDS cannot archive ResourceHashTable either. We have CompactHashtable > which can be archived, but it cannot be modified at run time. 
I think > the export list can be modified at run time with > java.lang.Module::addExports(), so we probably need to do it like: > > - if a Module was archived, check its CompactHashtable first > - if not found, check the ResourceHashTable > > This would make the start-up a little faster (no more copying from the > array lists into the hashtable, but would make the code more > complicated. I need to investigate to see if it's worthwhile. > >> - moduleEntry.cpp and packageEntry.cpp: Both methods >> "compare_module_by_name" and "compare_package_by_name" should have an >> assert if the names are equal.? No ModuleEntryTable or >> PackageEntryTable should ever have 2 same named modules or packages. >> > > I tried adding: > > static int compare_package_by_name(PackageEntry* a, PackageEntry* b) { > ? assert(a != b && a->name() != b->name(), "array should not contain > duplicated entries"); > ? return a->name()->fast_compare(b->name()); > } > > but this caused an assert, because our QuickSort implementation would > sometimes compare the same element! > > #7? 0x00007ffff659238b in QuickSort::partition int (*)(void const*, void const*)> (array=0x80043fcb0, > ??? pivot=248, length=496, comparator=0x7ffff6590a5a > ) > ??? at /jdk2/gil/open/src/hotspot/share/utilities/quickSort.hpp:76 > 76??? ????? for ( ; comparator(array[left_index], pivot_val) < 0; > ++left_index) { > (gdb) p array[left_index] > $1 = (PackageEntry *) 0x7ffff0439d00 > (gdb) p pivot_val > $2 = (PackageEntry *) 0x7ffff0439d00 > (gdb) p pivot > $3 = 248 > (gdb) p left_index > $4 = 248 > > So I ended up with: > > static int compare_package_by_name(PackageEntry* a, PackageEntry* b) { > ? assert(a == b || a->name() != b->name(), "no duplicated names"); > ? return a->name()->fast_compare(b->name()); > } Looks good to me! > >> - Another timing clarification question for me.? 
I assume that the >> module graph is dumped post module initialization when Java has >> completely defined the module graph to the JVM, is this correct? > > Yes > >> My concern is that there could be a window post module initialization >> and pre archiving the module graph where another thread could define >> a new module or add Module readability or add Package exportability.? >> So that the module graph you are dumping is not the same module graph >> at the end of module initialization.? Is this a concern? > > We don't allow arbitrary code to be executed during -Xshare:dump, so > the module graph shouldn't be changed after module initialization has > finished. > > I've added Modules::check_cds_restrictions() to check for this. A question about this because a user's program can define modules post module initialization via ModuleDescriptor.newModule().? See for example, tests within open/test/hotspot/jtreg/runtime/module/AccessCheck.? So all of these tests would trigger check_cds_restrictions() if -Xshare:dump was turned on.? Is that a concern? Thanks, Lois > > I also added asserts to make sure that none of the classes used by the > archived module graph can be modified by JVMTI. All these classes are > loaded in the "early" phase of JVMTI, and we would disable CDS if a > JVMTI agent is registered to modify classes in this phase, so we > should be safe, but the asserts will ensure that. I updated the > ReplaceCriticalClassesForSubgraphs.java test and added a new test > RedefineClassesInModuleGraph.java. > > > Thanks > - Ioi > > >> >> Thanks, >> Lois >> >>> >>> Well, verify_javabase_packages() was called once and it succeeded, >>> but that was during CDS dump time. So we know at least we have >>> verified this once :-) >>> >>> Thanks >>> - Ioi >>> >>>> I may have more review comments as I continue to look at this! 
>>>>
>>>> Thanks,
>>>> Lois
>>>>
>>>
>>
>

From paul.sandoz at oracle.com Thu Sep 3 19:20:10 2020
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Thu, 3 Sep 2020 12:20:10 -0700
Subject: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
In-Reply-To: <5845a59f-3790-906a-0ea4-fc0e862eb103@redhat.com>
References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> <5845a59f-3790-906a-0ea4-fc0e862eb103@redhat.com>
Message-ID: <5F0E7A5C-6FC8-4DE2-8D2B-D7D4AEBB382A@oracle.com>

I also agree, the fix should be backed out.

VarHandles can be used as a replacement for Unsafe. If a VarHandle is used to
access the contents of a byte[] or ByteBuffer [1], then only the set/get methods
are supported for misaligned access; use of all other methods will throw an
IllegalStateException.

Paul.

[1] https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/lang/invoke/MethodHandles.html#byteBufferViewVarHandle(java.lang.Class,java.nio.ByteOrder)

> On Sep 3, 2020, at 1:41 AM, Andrew Haley wrote:
>
> On 03/09/2020 05:11, David Holmes wrote:
>> On 3/09/2020 1:49 pm, Wang Zhuo(Zhuoren) wrote:
>>> David, Thank you very much for the explanation.
>>> The original crash indeed happened in a third party code using Unsafe to
>>> handle unaligned data.
>>> So you mean that it is the third party code's bug, we should not fix it
>>> in JVM, right?
>>
>> Right. Unsafe is not a supported API and is by definition unsafe. If you
>> use it and it crashes then you need to change your code.
>
> I agree. In hindsight, I should probably not have approved 8246051. I'm happy
> that it should be backed out.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From yumin.qi at oracle.com Thu Sep 3 22:11:39 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Thu, 3 Sep 2020 15:11:39 -0700 Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code In-Reply-To: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> References: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> Message-ID: HI, Mandy ? Thanks for review and comment. Yumin On 9/3/20 9:13 AM, Mandy Chung wrote: > > > On 9/3/20 8:36 AM, Yumin Qi wrote: >> Hi, Please review >> >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8252725 >> >> webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/ >> > > Looks good to me.?? Sundar should also review it. > > A few things to mention compared to the proposed patch from 8247536:? we no longer log the error case for LF_RESOLVE since it's ignored anyway.? As the code is moved to java.lang.invoke, we also clean up the code to use constants and methods defined in LambdaForm and MethodTypeForm and BasicType (rather than duplicating such definitions). > > Mandy >> >> Summary: The work is part of 8247536, which supports archive pre-generated java.lang.invoke classes in CDS. In this patch (thanks to Mandy): >> >> 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added to GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm respectively; >> >> 2. Move log file parsing work to java.lang.InvokeJLIClassesHelper; >> >> 3. Clean up interface APIs since old APIs no longer used with the moving; >> >> 4. New API JavaLangInvokeAccess::generateHolderClassesreturns a map of class name, which in internal form as key rather than the jimage entry point, vs class bytes. >> >> This makes both JLI and CDS can use the new interface easily. 
CDS will add a new function (in 8247536 patch, only for convenience for converting the map to array) to GenerateJLIClassesHelper to call the new added interface API (generateHolderClasses)to regenerate holder classes during dump time. >> >> >> Thanks >> >> Yumin >> >> > From sundararajan.athijegannathan at oracle.com Fri Sep 4 01:34:29 2020 From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com) Date: Fri, 4 Sep 2020 07:04:29 +0530 Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code In-Reply-To: References: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> Message-ID: <1face782-cc07-dfad-35c9-3247e556827c@oracle.com> Looks good to me. Few minor comment: * traceFileStream (and even the preexisting mainArgument) is accessed only inside GenerateJLIClassesPlugin. Could be private? -Sundar On 04/09/20 3:41 am, Yumin Qi wrote: > HI, Mandy > > ? Thanks for review and comment. > > > Yumin > > On 9/3/20 9:13 AM, Mandy Chung wrote: >> >> >> On 9/3/20 8:36 AM, Yumin Qi wrote: >>> Hi, Please review >>> >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8252725 >>> >>> webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/ >>> >> >> Looks good to me.?? Sundar should also review it. >> >> A few things to mention compared to the proposed patch from 8247536:? >> we no longer log the error case for LF_RESOLVE since it's ignored >> anyway.? As the code is moved to java.lang.invoke, we also clean up >> the code to use constants and methods defined in LambdaForm and >> MethodTypeForm and BasicType (rather than duplicating such definitions). >> >> Mandy >>> >>> Summary: The work is part of 8247536, which supports archive >>> pre-generated java.lang.invoke classes in CDS. In this patch (thanks >>> to Mandy): >>> >>> 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added >>> to GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm >>> respectively; >>> >>> 2. 
Move log file parsing work to java.lang.InvokeJLIClassesHelper; >>> >>> 3. Clean up interface APIs since old APIs no longer used with the >>> moving; >>> >>> 4. New API JavaLangInvokeAccess::generateHolderClassesreturns a map >>> of class name, which in internal form as key rather than the jimage >>> entry point, vs class bytes. >>> >>> This makes both JLI and CDS can use the new interface easily. CDS >>> will add a new function (in 8247536 patch, only for convenience for >>> converting the map to array) to GenerateJLIClassesHelper to call the >>> new added interface API (generateHolderClasses)to regenerate holder >>> classes during dump time. >>> >>> >>> Thanks >>> >>> Yumin >>> >>> >> From yumin.qi at oracle.com Fri Sep 4 04:37:53 2020 From: yumin.qi at oracle.com (Yumin Qi) Date: Thu, 3 Sep 2020 21:37:53 -0700 Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code In-Reply-To: <1face782-cc07-dfad-35c9-3247e556827c@oracle.com> References: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> <1face782-cc07-dfad-35c9-3247e556827c@oracle.com> Message-ID: <6a4bbb92-7e6a-f037-a2dd-220aba9c6374@oracle.com> HI, Sundar ? Thanks for review. On 9/3/20 6:34 PM, sundararajan.athijegannathan at oracle.com wrote: > Looks good to me. > > Few minor comment: > > * traceFileStream (and even the preexisting mainArgument) is accessed only inside GenerateJLIClassesPlugin. Could be private? > I will fix them before push. (Certainly will build first to verify that). Thanks Yumin > -Sundar > > On 04/09/20 3:41 am, Yumin Qi wrote: >> HI, Mandy >> >> ? Thanks for review and comment. >> >> >> Yumin >> >> On 9/3/20 9:13 AM, Mandy Chung wrote: >>> >>> >>> On 9/3/20 8:36 AM, Yumin Qi wrote: >>>> Hi, Please review >>>> >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8252725 >>>> >>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8252725/webrev-01/ >>>> >>> >>> Looks good to me.?? Sundar should also review it. 
>>> >>> A few things to mention compared to the proposed patch from 8247536:? we no longer log the error case for LF_RESOLVE since it's ignored anyway.? As the code is moved to java.lang.invoke, we also clean up the code to use constants and methods defined in LambdaForm and MethodTypeForm and BasicType (rather than duplicating such definitions). >>> >>> Mandy >>>> >>>> Summary: The work is part of 8247536, which supports archive pre-generated java.lang.invoke classes in CDS. In this patch (thanks to Mandy): >>>> >>>> 1. Two methods for tracing SPECIES_RESOLVE and LF_RESOLVE are added to GenerateJLIClassesHelper: traceSpeciesType and traceLambdaForm respectively; >>>> >>>> 2. Move log file parsing work to java.lang.InvokeJLIClassesHelper; >>>> >>>> 3. Clean up interface APIs since old APIs no longer used with the moving; >>>> >>>> 4. New API JavaLangInvokeAccess::generateHolderClassesreturns a map of class name, which in internal form as key rather than the jimage entry point, vs class bytes. >>>> >>>> This makes both JLI and CDS can use the new interface easily. CDS will add a new function (in 8247536 patch, only for convenience for converting the map to array) to GenerateJLIClassesHelper to call the new added interface API (generateHolderClasses)to regenerate holder classes during dump time. >>>> >>>> >>>> Thanks >>>> >>>> Yumin >>>> >>>> >>> From richard.reingruber at sap.com Fri Sep 4 07:29:47 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 4 Sep 2020 07:29:47 +0000 Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace In-Reply-To: References: , Message-ID: Hi Thomas, > > I agree that a little bit of documentation would help. 
> > Well if it is the new value that is returned, you could assert like this:
> >
> >   void decrement() {
> >     T new_val = Atomic::add(&_c, T(-1));
> > #ifdef ASSERT
> >     assert(new_val != T(-1), "underflow (0-1)");
> > #endif
> >   }
> >
> > I'm not insisting but out of experience I do know that imprecise asserts can be
> > a pain. Note also that Atomic::load_acquire(&_c) does not help to get the most
> > recent value of _c. Acquire only prevents subsequent accesses from being reordered
> > with the load_acquire(). In other words: the state that is observed subsequently
> > is at least as recent as the state observed with the load_acquire() but it can
> > be stale.
>
> I'll think about this. You may be right. I originally wanted to use the
> return value of A::add but shied away because of the bad documentation. Do
> you think it behaves the same way on all platforms?
> Well I guess it does since the using code I found is shared.

Yes, semantics should be the same on all platforms.

> Thanks, Richard. I'm currently preparing a new webrev and running tests;
> these changes will have to wait until after that.

Sure thing!

Cheers, Richard.

---------
From: Thomas Stüfe
Sent: Thursday, September 3, 2020 19:06
To: Reingruber, Richard
Cc: Hotspot dev runtime; Hotspot-Gc-Dev
Subject: Re: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace

On Thu, Sep 3, 2020 at 6:56 PM Reingruber, Richard wrote:

Hi Thomas,

> Thanks for your review!

Welcome! :)

> > ====== src/hotspot/share/memory/metaspace/counter.hpp
> >
> >  104   void decrement() {
> >  105 #ifdef ASSERT
> >  106     T old = Atomic::load_acquire(&_c);
> >  107     assert(old >= 1,
> >  108         "underflow (" UINT64_FORMAT "-1)", (uint64_t)old);
> >  109 #endif
> >  110     Atomic::dec(&_c);
> >  111   }
> >
> > I think you could use Atomic::add() which returns the old value and make
> > the assert atomic too:
> >
> >   void decrement() {
> >     T old = Atomic::add(&_c, T(-1));
> > #ifdef ASSERT
> >     assert(old >= 1,
> >            "underflow (" UINT64_FORMAT "-1)", (uint64_t)old);
> > #endif
> >   }
> >
> > Same for increment(), increment_by(), decrement_by(), ...
> >
>
> Thought so too but Atomic::add seems to return the new value, not the old,
> see e.g. atomic_linux_x86.hpp:

> struct Atomic::PlatformAdd {
>  ....
>   D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) const {
>     return fetch_and_add(dest, add_value, order) + add_value;
>   }
> };

> I also checked the callers, those few places I found which do anything
> meaningful with the result seem to expect the new value. See
> e.g. G1BuildCandidateArray::claim_chunk().

> It is annoying that these APIs are not documented. I thought this is
> because these APIs are obvious to everyone but me but looks like they are
> not.

I agree that a little bit of documentation would help. Well if it is the new
value that is returned, you could assert like this:

  void decrement() {
    T new_val = Atomic::add(&_c, T(-1));
#ifdef ASSERT
    assert(new_val != T(-1), "underflow (0-1)");
#endif
  }

I'm not insisting but out of experience I do know that imprecise asserts can be
a pain. Note also that Atomic::load_acquire(&_c) does not help to get the most
recent value of _c. Acquire only prevents subsequent accesses from being reordered
with the load_acquire(). In other words: the state that is observed subsequently
is at least as recent as the state observed with the load_acquire() but it can
be stale.

I'll think about this. You may be right. I originally wanted to use the return
value of A::add but shied away because of the bad documentation. Do you think it
behaves the same way on all platforms? Well I guess it does since the using code
I found is shared.

> > byte_size should also depend on BlockTree::minimal_word_size I think.
> > Something like > > > > if (worde_size > FreeBlocks::maximal_word_size) > >?? byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord); > > > > FreeBlocks::maximal_word_size needs to be defined for this. > > > > > See above. In addition to what I wrote, BlockTree is an implementation > detail of FreeBlocks, so it should not matter here. ? You could assert (statically) somewhere that FreeBlocks::maximal_word_size >= BlockTree::minimal_word_size. But I'm not insisting. ? Oh sure, I can do that. ? Thanks, Richard. ? Thanks, Richard. I'm currently preparing a new webrev and running tests; these changes will have to wait until after that. Cheers, Thomas ? From: Thomas St?fe Sent: Donnerstag, 3. September 2020 11:20 To: Reingruber, Richard Cc: Hotspot dev runtime ; Hotspot-Gc-Dev Subject: Re: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace ? Hi Richard, ? Thanks for your review! ? Please note that code has changed since the first webrev. While that is not a problem - most of your findings are still valid - a large number or renamings took place (see Leo's and Coleen's mails).? ? I will use the new naming throughout my answers (e.g. maximal_word_size -> MaxWordSize). Hope that is not too confusing; if in doubt pls look at the HEAD of the sandbox repo, or just wait until the new webrev is done. ? All remarks inline: ? ? ====== You added includes of memory/metaspace.hpp to some files without additional modifications. How did you chose these files? To jvmciCompilerToVM.hpp you added an include of collectedHeap.hpp. Why did you do this? ? metaspace.hpp is needed for either one of MetaspaceType or MetadataType enum. jvmciCompilerToVM.hpp?needs a definition or a forward declaration for CollectedHeap. ? ====== src/hotspot/share/memory/metaspace.cpp ?194? ? ? ?// ... and class spacelist. ?195? ? ? ?VirtualSpaceList* vsl = VirtualSpaceList::vslist_nonclass(); ?196? ? ? ?assert(vsl != NULL, "Sanity"); ?197? ? ? 
           vsl->verify(slow);

In 195 VirtualSpaceList::vslist_class() should be called I suppose.

Good catch.

You could reuse the local vsl as you did with the local cm.
Any reason you assert vsl not NULL in 196 but not in the non-class case?

A bit inconsistent, yes. I will remove all these asserts. They are not really useful (if any of those were NULL we would crash in a very obvious manner).

 637     // CCS must be aligned to root chunk size, and be at least the size of one
 638     //  root chunk.
 639     adjusted_ccs_size = align_up(adjusted_ccs_size, reserve_alignment());
 640     adjusted_ccs_size = MAX2(adjusted_ccs_size, reserve_alignment());

Line 640 is redundant, isn't it?

Well, adjusted_ccs_size could have been 0 to begin with. In the greater context I think it cannot be 0 since CompressedClassSpaceSize cannot be set to zero. But that is non-obvious, so I prefer to leave this code in.

@@ -1274,7 +798,7 @@
   assert(loader_data != NULL, "Should never pass around a NULL loader_data. "
         "ClassLoaderData::the_null_class_loader_data() should have been used.");

-  MetadataType mdtype = (type == MetaspaceObj::ClassType) ? ClassType : NonClassType;
+  Metaspace::MetadataType mdtype = (type == MetaspaceObj::ClassType) ? Metaspace::ClassType : Metaspace::NonClassType;

   // Try to allocate metadata.
   MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype);

This hunk is not needed.

Ok

====== src/hotspot/share/memory/metaspace/binlist.hpp

  94      // block sizes this structure can keep are limited by [_min_block_size, _max_block_size)
  95      const static size_t minimal_word_size = smallest_size;
  96      const static size_t maximal_word_size = minimal_word_size + num_lists;

_min_block_size/_max_block_size should be minimal_word_size/maximal_word_size.

The upper limit 'maximal_word_size' should be inclusive IMHO:
const static size_t maximal_word_size = minimal_word_size + num_lists - 1;

That would better match the meaning of the variable name. Checks in l.148 and l.162 should be adapted in case you agree.

Leo and Coleen wanted this too. The new naming will be consistent and follow hotspot naming rules: (Min|Max)WordSize.

  43    // We store node pointer information in these blocks when storing them. That
  44    //  imposes a minimum size to the managed memory blocks.
  45    //  See MetaspaceArene::get_raw_allocation_word_size().

s/MetaspaceArene::get_raw_allocation_word_size/metaspace::get_raw_word_size_for_requested_word_size/

I agree with the comment, but metaspace::get_raw_word_size_for_requested_word_size() does not seem to take this into account.

It does, it uses FreeBlocks::MinWordSize. FreeBlocks consists of a tree and a bin list. The tree is only responsible for larger blocks (larger than what would fit into the bin list). Therefore the lower limit is only determined by the bin list minimum word size.

Since this may be not obvious, I'll beef up the comment somewhat.

  86        // blocks with the same size are put in a list with this node as head.
  89        // word size of node. Note that size cannot be larger than max metaspace size,
 115      // given a node n, add it to the list starting at head
 123      // given a node list starting at head, remove one node from it and return it.

You should begin a sentence consistently with a capital letter (you mostly do it).

 123      // given a node list starting at head, remove one node from it and return it.
 124      // List must contain at least one other node.
 125      static node_t* remove_from_list(node_t* head) {
 126        assert(head->next != NULL, "sanity");
 127        node_t* n = head->next;
 128        if (n != NULL) {
 129          head->next = n->next;
 130        }
 131        return n;
 132      }

Line 129 must be executed unconditionally.

Good catch.
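For illustration only, a minimal sketch of what remove_from_list() could look like with line 129 executed unconditionally. The node_t below is a simplified stand-in (assumption: only next and size matter for this function), and the standard assert replaces HotSpot's assert macro; the actual fix is whatever ends up in the next webrev.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical stand-in for the BlockTree node type.
struct node_t {
  node_t* next;   // next node in the same-size list
  size_t size;    // word size of the block
};

// Given a node list starting at head, remove the node following head and
// return it. List must contain at least one other node.
static node_t* remove_from_list(node_t* head) {
  assert(head->next != nullptr && "sanity");
  node_t* n = head->next;
  head->next = n->next;   // unlink unconditionally; the assert guards the precondition
  return n;
}
```

With the NULL check gone, a violated precondition shows up as an assertion failure in debug builds instead of being silently tolerated.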
I'd prefer a more generic implementation that allows head->next to be NULL. Maybe even head == NULL.

I don't think that would be much clearer though. We probably could move remove_from_list() up into remove_block(), but for symmetry reasons this would have to be done for add_to_list() too, and I rather like it this way.

 215      // Given a node n and a node forebear, insert n under forebear
 216      void insert(node_t* forebear, node_t* n) {
 217        if (n->size == forebear->size) {
 218          add_to_list(n, forebear); // parent stays NULL in this case.
 219        } else {
 220          if (n->size < forebear->size) {
 221            if (forebear->left == NULL) {
 222              set_left_child(forebear, n);
 223            } else {
 224              insert(forebear->left, n);
 225            }
 226          } else {
 227            assert(n->size > forebear->size, "sanity");
 228            if (forebear->right == NULL) {
 229              set_right_child(forebear, n);
 230              if (_largest_size_added < n->size) {
 231                _largest_size_added = n->size;
 232              }
 233            } else {
 234              insert(forebear->right, n);
 235            }
 236          }
 237        }
 238      }

This assertion in line 227 is redundant (cannot fail).

That is true for many asserts I add. I use asserts liberally as guard against bit rot and as documentation. I guess this here could be considered superfluous since the setup code is right above. I will remove that assert.

  Leo> There are at least two recursive calls of insert that could be
  Leo> tail-called instead (it would be somewhat harder to read, so I am not
  Leo> proposing it).

I think they _are_ tail-recursions in the current form.
Gcc eliminates them. I checked the release build with gdb:

(disass /s metaspace::FreeBlocks::add_block)

Recursive tail-calls can be easily replaced with loops.
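As a rough sketch of the loop form, assuming a simplified node_t and leaving out the _largest_size_added bookkeeping (the helpers add_to_list(), set_left_child() and set_right_child() are inlined here in a guessed form, not copied from the webrev):

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical stand-in for the BlockTree node type.
struct node_t {
  node_t* parent = nullptr;
  node_t* left = nullptr;
  node_t* right = nullptr;
  node_t* next = nullptr;   // same-size siblings hang off this list
  size_t size = 0;
};

// Iterative variant of insert(): walk down from forebear until n finds its slot.
static void insert(node_t* forebear, node_t* n) {
  assert(forebear != nullptr && n != nullptr);
  for (;;) {
    if (n->size == forebear->size) {
      // Same size: put n into the list headed by forebear; parent stays NULL.
      n->next = forebear->next;
      forebear->next = n;
      return;
    }
    // Pick the child slot to descend into.
    node_t*& child = (n->size < forebear->size) ? forebear->left
                                                : forebear->right;
    if (child == nullptr) {
      child = n;
      n->parent = forebear;
      return;
    }
    forebear = child;   // takes the place of the recursive call
  }
}
```

The loop does not depend on the compiler eliminating the tail call, so stack depth stays constant in debug builds as well.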
To be safe I'd suggest to do that or at least add 'return' after each call with a comment that nothing must be added between the call and the return to keep it a tail-recursion. Maybe that's sufficient... on the other hand we don't know if every C++ compiler can eliminate the calls and stack overflows when debugging would be also irritating.

 251            return find_closest_fit(n->right, s);
 260            return find_closest_fit(n->left, s);

More tail-recursion. Same as above.

I'll rewrite BlockTree to not use recursion.

 257          assert(n->size > s, "Sanity");

Assertion is redundant.

 262            // n is the best fit.
 263            return n;

In the following example it is not, is it?

          N1:40
           /
          /
       N2:20
          \
           \
         N3:30

find_closest_fit(N1, 30) will return N2 but N3 is the closest fit. I think you have to search the left tree for a better fit independently of the size of its root node.

Good catch.

 293        if (n->left == NULL && n->right == NULL) {
 294          replace_node_in_parent(n, NULL);
 295
 296        } else if (n->left == NULL && n->right != NULL) {
 297          replace_node_in_parent(n, n->right);
 298
 299        } else if (n->left != NULL && n->right == NULL) {
 300          replace_node_in_parent(n, n->left);
 301
 302        } else {

Can be simplified to:

    if (n->left == NULL) {
      replace_node_in_parent(n, n->right);
    } else if (n->right == NULL) {
      replace_node_in_parent(n, n->left);
    } else {

Yes, but I'd rather leave the code as it is; I think it's easier to read that way.

 341            // The right child of the successor (if there was one) replaces the successor at its parent's left child.

Please add a line break.

The comments and assertions in remove_node_from_tree() helped to understand the logic. Thanks!

:)

====== src/hotspot/share/memory/metaspace/blocktree.cpp

  40    // These asserts prints the tree, then asserts
  41
        #define assrt(cond, format, ...) \
  42      if (!(cond)) { \
  43        print_tree(tty); \
  44        assert(cond, format, __VA_ARGS__); \
  45      }
  46
  47      // This assert prints the tree, then stops (generic message)
  48    #define assrt0(cond) \
  49              if (!(cond)) { \
  50                print_tree(tty); \
  51                assert(cond, "sanity"); \
  52              }

Better wrap into do-while(0) (see definition of vmassert)

Ok.

 110        verify_node(n->left, left_limit, n->size, vd, lvl + 1);

Recursive call that isn't a tail call. Prone to stack overflow. Well I guess you need a stack to traverse a tree. GrowableArray is a common choice if you want to eliminate this recursion. As it is only verification code you might as well leave it and interpret stack overflow as verification failure.

 118        verify_node(n->right, n->size, right_limit, vd, lvl + 1);

Tail-recursion can be easily eliminated. See comments on blocktree.hpp above.

I'll rewrite this to be non-recursive.

====== src/hotspot/share/memory/metaspace/chunkManager.cpp

The slow parameter in ChunkManager::verify*() is not used.

I'll remove all "slow" params from all verifications and recode this to use -XX:VerifyMetaspaceInterval. I think that is easier to use.

====== src/hotspot/share/memory/metaspace/counter.hpp

 104   void decrement() {
 105 #ifdef ASSERT
 106     T old = Atomic::load_acquire(&_c);
 107     assert(old >= 1,
 108         "underflow (" UINT64_FORMAT "-1)", (uint64_t)old);
 109 #endif
 110     Atomic::dec(&_c);
 111   }

I think you could use Atomic::add() which returns the old value and make the assert atomic too:

  void decrement() {
    T old = Atomic::add(&_c, T(-1));
#ifdef ASSERT
    assert(old >= 1,
           "underflow (" UINT64_FORMAT "-1)", (uint64_t)old);
#endif
  }

Same for increment(), increment_by(), decrement_by(), ...

Thought so too but Atomic::add seems to return the new value, not the old, see e.g.
atomic_linux_x86.hpp:

struct Atomic::PlatformAdd {
  ....
  D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) const {
    return fetch_and_add(dest, add_value, order) + add_value;
  }
};

I also checked the callers, those few places I found which do anything meaningful with the result seem to expect the new value. See e.g. G1BuildCandidateArray::claim_chunk().

It is annoying that these APIs are not documented. I thought this is because these APIs are obvious to everyone but me but looks like they are not.

====== src/hotspot/share/memory/metaspace/metaspaceArena.cpp

There's too much vertical white space, I'd think.

metaspace::get_raw_allocation_word_size() is a duplicate of metaspace::get_raw_word_size_for_requested_word_size()
metaspace::get_raw_allocation_word_size() is only referenced in comments and should be removed.

Oops. Sure, I'll remove that.

byte_size should also depend on BlockTree::minimal_word_size I think. Something like

if (word_size > FreeBlocks::maximal_word_size)
  byte_size = MAX2(byte_size, BlockTree::minimal_word_size * BytesPerWord);

FreeBlocks::maximal_word_size needs to be defined for this.

See above. In addition to what I wrote, BlockTree is an implementation detail of FreeBlocks, so it should not matter here.

Thanks Richard.

I will work in your feedback and publish a new webrev shortly.

Cheers, Thomas

From Alan.Bateman at oracle.com Fri Sep 4 09:10:07 2020
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 4 Sep 2020 10:10:07 +0100
Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code
In-Reply-To: <6a4bbb92-7e6a-f037-a2dd-220aba9c6374@oracle.com>
References: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> <1face782-cc07-dfad-35c9-3247e556827c@oracle.com> <6a4bbb92-7e6a-f037-a2dd-220aba9c6374@oracle.com>
Message-ID: <0eab9ecd-50f7-f91d-06af-e526b193826f@oracle.com>

On 04/09/2020 05:37, Yumin Qi wrote:
> HI, Sundar
>
>   Thanks for review.
>
> On 9/3/20 6:34 PM, sundararajan.athijegannathan at oracle.com wrote:
>> Looks good to me.
>>
>> Few minor comment:
>>
>> * traceFileStream (and even the preexisting mainArgument) is accessed
>> only inside GenerateJLIClassesPlugin. Could be private?
>>
> I will fix them before push. (Certainly will build first to verify that).

I went through the refactoring of the jlink plugin and I think it looks good too.

-Alan.

From igor.ignatyev at oracle.com Fri Sep 4 16:18:53 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 4 Sep 2020 09:18:53 -0700
Subject: RFR(S) : 8252004 : remove usage of PropertyResolvingWrapper in vmTestbase/nsk/sysdict
In-Reply-To:
References: <76F1769B-C665-47B9-9D6F-88CC0AB974DB@oracle.com>
Message-ID: <3906EA7C-5FF8-4D8A-8443-286197BFBA5A@oracle.com>

can I get an LGTM from someone w/ a Reviewer status before hg repo became read-only?

-- Igor

> On Aug 31, 2020, at 2:50 PM, Igor Ignatyev wrote:
>
> Hi Gerard,
>
> thanks for your review.
>
> 8219408 is to update tests residing in test/jdk, hotspot tests are going to be updated by 8219140's sub-tasks, which include 8252004 and 8252002. I am not updating all hotspot tests by one patch, b/c in some cases, the tests need small modification to work w/o PropertyResolvingWrapper and splitting the work by test groups make it easier to both work on and review such patches.
>
> Cheers,
> -- Igor
>
>> On Aug 31, 2020, at 2:12 PM, gerard ziemski wrote:
>>
>> hi Igor,
>>
>> Looks fine as far as I can tell, however, other tests are waiting till JDK-8219408 to re-enable allowSmartActionArgs - why don't we need to wait for that here as well?
>>
>> I noticed that there are other issues filed to address the same issue in other tests, of which there are many (i.e. JDK-8252002) - they will be following the same pattern as this one?
>>
>>
>> cheers
>>
>>
>> On 8/31/20 2:02 PM, Igor Ignatyev wrote:
>>> ping?
>>>
>>> -- Igor
>>>
>>>> On Aug 21, 2020, at 10:46 AM, Igor Ignatyev wrote:
>>>>
>>>> http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00/
>>>>> 22 lines changed: 0 ins; 22 del; 0 mod;
>>>>
>>>> Hi all,
>>>>
>>>> could you please review this small patch which removes usage of PropertyResolvingWrapper from nsk/sysdict tests and reenables allowSmartActionArgs?
>>>>
>>>> background from the main bug:
>>>>> CODETOOLS-7902352 added support of using ${property} in action directive, so PropertyResolvingWrapper isn't needed anymore and can be removed.
>>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252004
>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00
>>>> testing: :vmTestbase_nsk_sysdict
>>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>
>

From daniel.daugherty at oracle.com Fri Sep 4 16:43:03 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 4 Sep 2020 12:43:03 -0400
Subject: RFR(S) : 8252004 : remove usage of PropertyResolvingWrapper in vmTestbase/nsk/sysdict
In-Reply-To: <3906EA7C-5FF8-4D8A-8443-286197BFBA5A@oracle.com>
References: <76F1769B-C665-47B9-9D6F-88CC0AB974DB@oracle.com> <3906EA7C-5FF8-4D8A-8443-286197BFBA5A@oracle.com>
Message-ID: <4f20108c-d311-ff62-cf44-86c7134215a5@oracle.com>

Thumbs up. Reviewed the patch since that was easy to scroll through.

Dan

On 9/4/20 12:18 PM, Igor Ignatyev wrote:
> can I get an LGTM from someone w/ a Reviewer status before hg repo became read-only?
>
> -- Igor
>
>> On Aug 31, 2020, at 2:50 PM, Igor Ignatyev wrote:
>>
>> Hi Gerard,
>>
>> thanks for your review.
>>
>> 8219408 is to update tests residing in test/jdk, hotspot tests are going to be updated by 8219140's sub-tasks, which include 8252004 and 8252002. I am not updating all hotspot tests by one patch, b/c in some cases, the tests need small modification to work w/o PropertyResolvingWrapper and splitting the work by test groups make it easier to both work on and review such patches.
>>
>> Cheers,
>> -- Igor
>>
>>> On Aug 31, 2020, at 2:12 PM, gerard ziemski wrote:
>>>
>>> hi Igor,
>>>
>>> Looks fine as far as I can tell, however, other tests are waiting till JDK-8219408 to re-enable allowSmartActionArgs - why don't we need to wait for that here as well?
>>>
>>> I noticed that there are other issues filed to address the same issue in other tests, of which there are many (i.e. JDK-8252002) - they will be following the same pattern as this one?
>>>
>>>
>>> cheers
>>>
>>>
>>> On 8/31/20 2:02 PM, Igor Ignatyev wrote:
>>>> ping?
>>>>
>>>> -- Igor
>>>>
>>>>> On Aug 21, 2020, at 10:46 AM, Igor Ignatyev wrote:
>>>>>
>>>>> http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00/
>>>>>> 22 lines changed: 0 ins; 22 del; 0 mod;
>>>>> Hi all,
>>>>>
>>>>> could you please review this small patch which removes usage of PropertyResolvingWrapper from nsk/sysdict tests and reenables allowSmartActionArgs?
>>>>>
>>>>> background from the main bug:
>>>>>> CODETOOLS-7902352 added support of using ${property} in action directive, so PropertyResolvingWrapper isn't needed anymore and can be removed.
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252004
>>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00
>>>>> testing: :vmTestbase_nsk_sysdict
>>>>>
>>>>> Thanks,
>>>>> -- Igor
>>>>>

From yumin.qi at oracle.com Fri Sep 4 16:49:18 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Fri, 4 Sep 2020 09:49:18 -0700
Subject: RFR: 8252725: Refactor jlink GenerateJLIClassesPlugin code
In-Reply-To: <0eab9ecd-50f7-f91d-06af-e526b193826f@oracle.com>
References: <4be47e30-aa69-7db0-ff04-f2d379fb8b38@oracle.com> <1face782-cc07-dfad-35c9-3247e556827c@oracle.com> <6a4bbb92-7e6a-f037-a2dd-220aba9c6374@oracle.com> <0eab9ecd-50f7-f91d-06af-e526b193826f@oracle.com>
Message-ID: <4c134304-9ccd-b1b0-1f27-979b5191d3e1@oracle.com>

Hi, Alan

  Thanks. Pushed before saw your email, could not credit you on reviewers.
Thanks
Yumin

On 9/4/20 2:10 AM, Alan Bateman wrote:
> On 04/09/2020 05:37, Yumin Qi wrote:
>> HI, Sundar
>>
>>   Thanks for review.
>>
>> On 9/3/20 6:34 PM, sundararajan.athijegannathan at oracle.com wrote:
>>> Looks good to me.
>>>
>>> Few minor comment:
>>>
>>> * traceFileStream (and even the preexisting mainArgument) is accessed only inside GenerateJLIClassesPlugin. Could be private?
>>>
>> I will fix them before push. (Certainly will build first to verify that).
> I went through the refactoring of the jlink plugin and I think it looks good too.
>
> -Alan.

From daniel.daugherty at oracle.com Fri Sep 4 17:15:36 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 4 Sep 2020 13:15:36 -0400
Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check
In-Reply-To:
References:
Message-ID: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com>

Richard,

Sorry for the late review. I know you have already pushed the fix.

src/hotspot/share/runtime/thread.cpp
    L2625:   } while (is_external_suspend());
        In the other places where we are checking for a racing suspend
        request, we check the is_external_suspend() condition while
        holding the SR_lock or we call is_external_suspend_with_lock().

        Here's the usual example I point people at:

        L2049: void JavaThread::exit(bool destroy_vm, ExitType exit_type) {

        L2121:     while (true) {
        L2122:       {
        L2123:         MutexLocker ml(SR_lock(), Mutex::_no_safepoint_check_flag);
        L2124:         if (!is_external_suspend()) {

        The JVM/TI SuspendThread() and JVM_SuspendThread() entry points
        call set_external_suspend() while holding the SR_lock so the
        only way to be sure you haven't lost the race is to hold the
        SR_lock while you're checking the flag yourself.

        Have I missed something in my analysis?
Dan

On 9/2/20 11:15 AM, Reingruber, Richard wrote:
> Hi,
>
> please help review this fix for a race condition in
> JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended
> thread to continue executing java for an arbitrary long time (see repro test
> attached to bug report).
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8252521
>
> The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check()
> that checks if the current thread was suspended again after returning from
> java_suspend_self() and before restoring the original thread state. The check is
> performed after restoring the original state because then we are guaranteed to
> see the suspend request issued before the requester observed that target to be
> _thread_blocked and executed VM_ThreadSuspend.
>
> Thanks, Richard.

From igor.ignatyev at oracle.com Fri Sep 4 17:37:10 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 4 Sep 2020 10:37:10 -0700
Subject: RFR(S) : 8252004 : remove usage of PropertyResolvingWrapper in vmTestbase/nsk/sysdict
In-Reply-To: <4f20108c-d311-ff62-cf44-86c7134215a5@oracle.com>
References: <76F1769B-C665-47B9-9D6F-88CC0AB974DB@oracle.com> <3906EA7C-5FF8-4D8A-8443-286197BFBA5A@oracle.com> <4f20108c-d311-ff62-cf44-86c7134215a5@oracle.com>
Message-ID: <01DB0CEA-3ABD-4C66-88E4-651216C60260@oracle.com>

thanks Dan, pushed.

-- Igor

> On Sep 4, 2020, at 9:43 AM, Daniel D. Daugherty wrote:
>
> Thumbs up. Reviewed the patch since that was easy to scroll through.
>
> Dan
>
>
> On 9/4/20 12:18 PM, Igor Ignatyev wrote:
>> can I get an LGTM from someone w/ a Reviewer status before hg repo became read-only?
>>
>> -- Igor
>>
>>> On Aug 31, 2020, at 2:50 PM, Igor Ignatyev wrote:
>>>
>>> Hi Gerard,
>>>
>>> thanks for your review.
>>>
>>> 8219408 is to update tests residing in test/jdk, hotspot tests are going to be updated by 8219140's sub-tasks, which include 8252004 and 8252002. I am not updating all hotspot tests by one patch, b/c in some cases, the tests need small modification to work w/o PropertyResolvingWrapper and splitting the work by test groups make it easier to both work on and review such patches.
>>>
>>> Cheers,
>>> -- Igor
>>>
>>>> On Aug 31, 2020, at 2:12 PM, gerard ziemski wrote:
>>>>
>>>> hi Igor,
>>>>
>>>> Looks fine as far as I can tell, however, other tests are waiting till JDK-8219408 to re-enable allowSmartActionArgs - why don't we need to wait for that here as well?
>>>>
>>>> I noticed that there are other issues filed to address the same issue in other tests, of which there are many (i.e. JDK-8252002) - they will be following the same pattern as this one?
>>>>
>>>>
>>>> cheers
>>>>
>>>>
>>>> On 8/31/20 2:02 PM, Igor Ignatyev wrote:
>>>>> ping?
>>>>>
>>>>> -- Igor
>>>>>
>>>>>> On Aug 21, 2020, at 10:46 AM, Igor Ignatyev wrote:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00/
>>>>>>> 22 lines changed: 0 ins; 22 del; 0 mod;
>>>>>> Hi all,
>>>>>>
>>>>>> could you please review this small patch which removes usage of PropertyResolvingWrapper from nsk/sysdict tests and reenables allowSmartActionArgs?
>>>>>>
>>>>>> background from the main bug:
>>>>>>> CODETOOLS-7902352 added support of using ${property} in action directive, so PropertyResolvingWrapper isn't needed anymore and can be removed.
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8252004
>>>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8252004/webrev.00
>>>>>> testing: :vmTestbase_nsk_sysdict
>>>>>>
>>>>>> Thanks,
>>>>>> -- Igor
>>>>>>
>

From thomas.stuefe at gmail.com Sat Sep 5 08:47:51 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 5 Sep 2020 10:47:51 +0200
Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two)
Message-ID:

Hi all,

This is Round Two of the review for JEP 387 "Elastic Metaspace".

Please find the first round of reviews here:
https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html

I did massage in all feedback - plus some smaller changes from me - and here is the new version:

Full Webrev:
http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/

Patch applies cleanly atop of "8247352: improve error messages for sealed classes and records"

Parts:
Core: http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-core/webrev/
Test: http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-test/webrev/
Misc: http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-misc/webrev/

Delta webrev (somewhat useless due to many file renamings):
http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/delta.webrev/webrev/

List of fine granular commits since last webrev:
http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/changes_since_last_review.txt

Review guide (updated; added FAQ):
http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/guide/review-guide.pdf
http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/guide/review-guide.html

JEP: https://openjdk.java.net/jeps/387

------

What changed?

1) Aesthetics

Some of you requested style changes so there are a lot:

- Most files in memory/metaspace/ now follow a common theme with a common prefix ("ms").
  The only exception is metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow up (see JDK-8251342).
- Files in gtest/metaspace have also been partly renamed to be more consistent
- Did rename enums, members and structures to adhere to hotspot C++ style guide
- Provided constructors for most structures
- Did use placement new where appropriate
- Switched to enum class where possible. Note though that enum classes do not allow (easily) to iterate over its values, therefore I cannot use it for MetaspaceType, MetadataType.
- Removed metaspaceEnum.cpp/hpp and moved all those helper functions back to metaspace.cpp/hpp. I also renamed them back to their original names (e.g. metaspace::is_class() -> Metaspace::is_class_space_allocation()) to reduce the total patch size. These renamings will show up in the delta diff though.
- gtest: Did remove all includes from metaspaceTestCommon.hpp and moved them to the individual cpp files as Coleen suggested
- Fixed include guard names and copyrights
- Whitespace- and empty-line-cleanup

http://hg.openjdk.java.net/jdk/sandbox/rev/018c370bbbd5 Rename static constants according to naming laws
http://hg.openjdk.java.net/jdk/sandbox/rev/d509e2c607c7 MetaspaceReporter::ReportFlag: rename and make enum class
http://hg.openjdk.java.net/jdk/sandbox/rev/b029ae20f238 Metachunk::state_t rename and enum class
http://hg.openjdk.java.net/jdk/sandbox/rev/8e285bdfcca1 Correct formatting (tabs)
http://hg.openjdk.java.net/jdk/sandbox/rev/a14ba4a2fb8c Remove superfluous spaces
http://hg.openjdk.java.net/jdk/sandbox/rev/223c19deea53 Add constructors to structs
http://hg.openjdk.java.net/jdk/sandbox/rev/0bf8e94f9e09 Rename struct members to follow naming laws
http://hg.openjdk.java.net/jdk/sandbox/rev/29513e1f6d49 Remove unused struct BitCounterClosure
http://hg.openjdk.java.net/jdk/sandbox/rev/7facf1afd137 Wholesale rename all structs to adhere to naming laws
http://hg.openjdk.java.net/jdk/sandbox/rev/d09f635f3a07 Wholesale file renamings <<<
That's the big rename change <<<
http://hg.openjdk.java.net/jdk/sandbox/rev/a56e69aba3b1 Fix include guard names in gtest headers
http://hg.openjdk.java.net/jdk/sandbox/rev/788f6454226f Fix copyrights in new gtest sources
http://hg.openjdk.java.net/jdk/sandbox/rev/bcb4bbc71cc9 Remove metaspaceEnums.cpp,hpp
http://hg.openjdk.java.net/jdk/sandbox/rev/a3d33ce1b9db Rename FreeBlocks::_v to _blocks
http://hg.openjdk.java.net/jdk/sandbox/rev/fe5aa08178d3 Fix include guard names
http://hg.openjdk.java.net/jdk/sandbox/rev/1e89f4dfd8a7 Remove empty lines
http://hg.openjdk.java.net/jdk/sandbox/rev/f7bf12bdbfe4 Gtests: consistent naming of context instances
http://hg.openjdk.java.net/jdk/sandbox/rev/551226c61470 Beautify comment for MetaspaceArena

2) Changes to BlockTree and friends

Both Leo and Richard reviewed the free block management in detail. Changes:

- Made all BlockTree functions non-recursive.
- Did remove offending typedefs (BinListXX etc)
- Richard found two bugs (BlockTree::find_closest_fit, BlockTree::remove_from_list); I fixed those and wrote regression gtests (see test_blocktree.cpp)
- In BlockTree::insert, renamed "forebear" argument to "insertion_point"
- Did consistently rename "get_block" to "remove_block" in all classes
- Tweaked gtests for BlockTree and BinList to make them clearer
- Did remove the "splinter threshold" logic in FreeBlocks; that logic was used to determine when chopping off remainder space from a returned block was worth the effort (see FreeBlocks::remove_block()). But it is always worth doing, so no need for this threshold.
- Did remove, from BlockTree, the "largest block added" logic since Leo convinced me it was less useful than I thought.

http://hg.openjdk.java.net/jdk/sandbox/rev/78a34af45cb8 make BlockTree::print_tree non-recursive
http://hg.openjdk.java.net/jdk/sandbox/rev/00246785a20e Grooming BlockTree (and various unrelated fixes)
http://hg.openjdk.java.net/jdk/sandbox/rev/24ed785d5c51 Simplify, comment BlockTree_basic_siblings test
http://hg.openjdk.java.net/jdk/sandbox/rev/e1df0dbc7cc9 Fix BlockTree::find_closest_fit and add test
http://hg.openjdk.java.net/jdk/sandbox/rev/0284f4705973 Rename Forebear
http://hg.openjdk.java.net/jdk/sandbox/rev/4340648bd624 Remove BinList8, BinList16, BinList64, SmallBlocksType typedef
http://hg.openjdk.java.net/jdk/sandbox/rev/7c74250c35fd Clarify BlockTree gtest
http://hg.openjdk.java.net/jdk/sandbox/rev/4680ad0ae1db Remove largest-block-added optimization from BlockTree
http://hg.openjdk.java.net/jdk/sandbox/rev/255d1c34356f FreeBlocks, BinList, BlockTree: code grooming
http://hg.openjdk.java.net/jdk/sandbox/rev/e07c51de3056 Remove unused splinter_threshold from FreeBlocks

3) Coleen asked me to remove the gtest for Metaspace reporting, and instead to provide a jtreg test. I then found that I had written bearish such a test already, and just expanded it a bit. I also re-enabled it for ZGC, for some reason it had been disabled.

http://hg.openjdk.java.net/jdk/sandbox/rev/abba106ec3c0 Extend cases for PrintMetaspaceDcmd

4) Did some code grooming for allocation guards, nothing major, and added a new gtest, using the newly discovered assertion test feature :), to test that we notice overwriters in metaspace:

http://hg.openjdk.java.net/jdk/sandbox/rev/07089d3e4be4 Reform allocation guard coding and add gtest

5) I did beef up the gtests for chunk enlargement, and made them easier to read:

http://hg.openjdk.java.net/jdk/sandbox/rev/8fe76f9ad8ee Improve/extend testing for chunk enlargement

6) Did remove a number of dead code sections

http://hg.openjdk.java.net/jdk/sandbox/rev/29513e1f6d49 Remove unused struct BitCounterClosure
http://hg.openjdk.java.net/jdk/sandbox/rev/50bfadc554cd Remove dead code in metaspaceContext.cpp
http://hg.openjdk.java.net/jdk/sandbox/rev/f00879c58720 Remove dead gtest files
http://hg.openjdk.java.net/jdk/sandbox/rev/074a374e9c5c Remove a dead portion of code
http://hg.openjdk.java.net/jdk/sandbox/rev/b10cf8275580 Remove setting which had been effectively unused.

7) I removed the "slow" parameter from all "::verify()" methods since it was not that useful. Expensive code section I place now inside a SOMETIMES clause which executes the designated code sometimes, at regular intervals, controlled via VerifyMetaspaceInterval (e.g. -XX:VerifyMetaspaceInterval=1 will execute the code always). That way we don't pay the full performance loss for these tests but still run them occasionally.
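
To make the idea in 7) concrete, here is a rough, self-contained sketch of an interval-gated check. This is not the HotSpot SOMETIMES macro; SOMETIMES_SKETCH, verify_interval and run_checks are hypothetical names invented for this illustration, and the real macro may count and gate differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the value of -XX:VerifyMetaspaceInterval.
static int verify_interval = 4;

// Execute `code` on every verify_interval-th pass through this call site.
// Each expansion gets its own counter because of the block-local static.
#define SOMETIMES_SKETCH(code)                        \
  do {                                                \
    static int counter_ = 0;                          \
    if (verify_interval > 0 &&                        \
        ++counter_ % verify_interval == 0) {          \
      code;                                           \
    }                                                 \
  } while (0)

// Example caller: counts how often the "expensive" section actually ran.
static int run_checks(int iterations) {
  int executed = 0;
  for (int i = 0; i < iterations; i++) {
    SOMETIMES_SKETCH(executed++);
  }
  return executed;
}
```

With an interval of 1 the gated section runs on every pass, which matches the behaviour described above for -XX:VerifyMetaspaceInterval=1.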

http://hg.openjdk.java.net/jdk/sandbox/rev/f18d566c5875 Rework xxx::verify() functions

8) While testing on 32bit I found an error in destruction of MetaspaceTestContext where I forgot to unmap the ReservedSpace for the space-provided-from-outside case (which simulates ccs):

http://hg.openjdk.java.net/jdk/sandbox/rev/d4f358b658a5 clean up spaces for non-expandable test contexts

-----

Follow up items:

I refrained from doing too large changes to not disturb the review process, and to keep the patch stable. Changes which are not strictly necessary but maybe a good idea I collect in follow up issues to be done once this patch is upstream:

https://bugs.openjdk.java.net/browse/JDK-8251342 "Rework JFR metaspace free chunk statistics after JEP 387"
https://bugs.openjdk.java.net/browse/JDK-8251392 "Brush up and consolidate Metaspace statistics after JEP 387"
https://bugs.openjdk.java.net/browse/JDK-8252014 "Find a better place for counter utility classes after JEP387"
https://bugs.openjdk.java.net/browse/JDK-8252132 "Investigate MetaspaceArena locking after JEP387"
https://bugs.openjdk.java.net/browse/JDK-8252187 "Optimize freeblocks storage in MetaspaceArena after JEP387"
https://bugs.openjdk.java.net/browse/JDK-8252189 "Clarify meaning of OOM texts for out-of-metaspace errors"

-----

Thanks a lot for your review work! I know it is hard, and it is very appreciated. I hope we now got all across-the-board style changes done, so the next delta diff will be easier to read.

Also, feel free to contact me in case you have quick questions, or for a quick zoom meeting should that be easier. Please note that starting today I will have vacation but should be back by mid September.
Cheers, Thomas

From david.holmes at oracle.com Sun Sep 6 22:36:08 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 7 Sep 2020 08:36:08 +1000
Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check
In-Reply-To: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com>
References: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com>
Message-ID: <5b3c2e59-cc83-cb33-4dee-de7d1ea2ddd3@oracle.com>

Hi Dan,

On 5/09/2020 3:15 am, Daniel D. Daugherty wrote:
> Richard,
>
> Sorry for the late review. I know you have already pushed the fix.
>
> src/hotspot/share/runtime/thread.cpp
>     L2625:   } while (is_external_suspend());
>         In the other places where we are checking for a racing suspend
>         request, we check the is_external_suspend() condition while
>         holding the SR_lock or we call is_external_suspend_with_lock().
>
>         Here's the usual example I point people at:
>
>         L2049: void JavaThread::exit(bool destroy_vm, ExitType exit_type) {
>
>         L2121:     while (true) {
>         L2122:       {
>         L2123:         MutexLocker ml(SR_lock(), Mutex::_no_safepoint_check_flag);
>         L2124:         if (!is_external_suspend()) {
>
>         The JVM/TI SuspendThread() and JVM_SuspendThread() entry points
>         call set_external_suspend() while holding the SR_lock so the
>         only way to be sure you haven't lost the race is to hold the
>         SR_lock while you're checking the flag yourself.
>
>         Have I missed something in my analysis?

Holding the SR_lock while checking is_external_suspend doesn't really achieve anything by itself - a racing suspend can come just before the check or just after it. The SR_lock primarily ensures correct synchronization for the actual suspension (when it waits on the SR_lock) and resumption - and as per the exit code, it ensures there is no thread termination race by setting is_exiting under the lock.
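To make the point about checking under the lock concrete, here is a small single-threaded sketch. It is not HotSpot code: ToyThread, check_under_lock() and request_suspend() are invented names, and a std::mutex merely stands in for the SR_lock. The interleaving of the two sides is replayed by hand so that the outcome is deterministic.

```cpp
#include <cassert>
#include <mutex>

// Toy model only; names and layout are assumptions for illustration.
struct ToyThread {
  std::mutex sr_lock;             // stand-in for SR_lock
  bool external_suspend = false;  // stand-in for the external suspend flag

  // Target side: read the flag while holding the lock.
  bool check_under_lock() {
    std::lock_guard<std::mutex> guard(sr_lock);
    return external_suspend;
  }

  // Suspender side: set the flag while holding the lock.
  void request_suspend() {
    std::lock_guard<std::mutex> guard(sr_lock);
    external_suspend = true;
  }
};

// Replays one possible interleaving: the target checks first and the
// suspender sets the flag immediately afterwards. Returns true if the
// target's answer is already stale by the time it would act on it.
inline bool check_is_stale_after_late_request() {
  ToyThread t;
  bool seen = t.check_under_lock();  // target: "no suspend pending"
  t.request_suspend();               // suspender: arrives just after the check
  return !seen && t.external_suspend;
}
```

In this toy model the lock makes each individual access well ordered, but it says nothing about whether the request lands before or after the check, so the check alone cannot close the window.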
What we are racing with in this changeset are the java_suspend()/JvmtiSuspendControl::suspend() calls which don't hold the SR_lock. The race we have to avoid is the race where another thread completes the safepoint/handshake operation which is supposed to ensure the target is suspended, and the loop checking is_external_suspend() achieves that. You could argue that without the lock we may see a stale value for is_external_suspend() but that is not possible when a safepoint/handshake has been issued as we have full memory synchronization between threads in that case. Put another way, when the target thread returns from the safepoint/handshake and sees is_external_suspend() then it knows there is another suspend request in progress, and it will honour it (and any racing resume is handled in java_suspend_self()). If it doesn't see is_external_suspend() set then any racing suspend request can't have initiated the safepoint/handshake yet and so that suspend request will be seen the next time the target does a safepoint/handshake poll. Cheers, David ----- > Dan > > > > On 9/2/20 11:15 AM, Reingruber, Richard wrote: >> Hi, >> >> please help review this fix for a race condition in >> JavaThread::java_suspend_self_with_safepoint_check() that allows a >> suspended >> thread to continue executing java for an arbitrary long time (see >> repro test >> attached to bug report). >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ >> Bug:??? https://bugs.openjdk.java.net/browse/JDK-8252521 >> >> The fix is to add a do-while-loop to >> java_suspend_self_with_safepoint_check() >> that checks if the current thread was suspended again after returning >> from >> java_suspend_self() and before restoring the original thread state. >> The check is >> performed after restoring the original state because then we are >> guaranteed to >> see the suspend request issued before the requester observed that >> target to be >> _thread_blocked and executed VM_ThreadSuspend. 
>> >> Thanks, Richard. > From david.holmes at oracle.com Sun Sep 6 22:58:59 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Sep 2020 08:58:59 +1000 Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check In-Reply-To: <5b3c2e59-cc83-cb33-4dee-de7d1ea2ddd3@oracle.com> References: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com> <5b3c2e59-cc83-cb33-4dee-de7d1ea2ddd3@oracle.com> Message-ID: Clarification ... On 7/09/2020 8:36 am, David Holmes wrote: > Hi Dan, > > On 5/09/2020 3:15 am, Daniel D. Daugherty wrote: >> Richard, >> >> Sorry for the late review. I know you have already pushed the fix. >> >> src/hotspot/share/runtime/thread.cpp >> ???? L2625: ? } while (is_external_suspend()); >> ???????? In the other places where we are checking for a racing suspend >> ???????? request, we check the is_external_suspend() condition while >> ???????? holding the SR_lock or we call is_external_suspend_with_lock(). >> >> ???????? Here's the usual example I point people at: >> >> ???????? L2049: void JavaThread::exit(bool destroy_vm, ExitType >> exit_type) { >> >> ???????? L2121: ??? while (true) { >> ???????? L2122: ????? { >> ???????? L2123:? ?????? MutexLocker ml(SR_lock(), >> Mutex::_no_safepoint_check_flag); >> ???????? L2124: ??????? if (!is_external_suspend()) { >> >> ???????? The JVM/TI SuspendThread() and JVM_SuspendThread() entry points >> ???????? call set_external_suspend() while holding the SR_lock so the >> ???????? only way to be sure you haven't lost the race is to hold the >> ???????? SR_lock while you're checking the flag yourself. >> >> ???????? Have I missed something in my analysis? > > Holding the SR_lock while checking is_external_suspend doesn't really > achieve anything by itself - a racing suspend can come just before the > check or just after it. By "racing suspend" I mean the part that calls set_external_suspend(). 
David ----- The SR_lock primarily ensures correct > synchronization for the actual suspension (when it waits on the SR_lock) > and resumption - and as per the exit code, it ensures there is no thread > termination race by setting is_exiting under the lock. > > What we are racing with in this changeset are the > java_suspend()/JvmtiSuspendControl::suspend() calls which don't hold the > SR_lock. The race we have to avoid is the race where another thread > completes the safepoint/handshake operation which is supposed to ensure > the target is suspended, and the loop checking is_external_suspend() > achieves that. > > You could argue that without the lock we may see a stale value for > is_external_suspend() but that is not possible when a > safepoint/handshake has been issued as we have full memory > synchronization between threads in that case. > > Put another way, when the target thread returns from the > safepoint/handshake and sees is_external_suspend() then it knows there > is another suspend request in progress, and it will honour it (and any > racing resume is handled in java_suspend_self()). If it doesn't see > is_external_suspend() set then any racing suspend request can't have > initiated the safepoint/handshake yet and so that suspend request will > be seen the next time the target does a safepoint/handshake poll. > > Cheers, > David > ----- > >> Dan >> >> >> >> On 9/2/20 11:15 AM, Reingruber, Richard wrote: >>> Hi, >>> >>> please help review this fix for a race condition in >>> JavaThread::java_suspend_self_with_safepoint_check() that allows a >>> suspended >>> thread to continue executing java for an arbitrary long time (see >>> repro test >>> attached to bug report). >>> >>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/ >>> Bug:??? 
https://bugs.openjdk.java.net/browse/JDK-8252521
>>>
>>> The fix is to add a do-while-loop to
>>> java_suspend_self_with_safepoint_check()
>>> that checks if the current thread was suspended again after returning
>>> from java_suspend_self() and before restoring the original thread state.
>>> The check is performed after restoring the original state because then
>>> we are guaranteed to see the suspend request issued before the requester
>>> observed that target to be _thread_blocked and executed VM_ThreadSuspend.
>>>
>>> Thanks, Richard.
>>

From richard.reingruber at sap.com Mon Sep 7 08:28:11 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 7 Sep 2020 08:28:11 +0000
Subject: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check
In-Reply-To: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com>
References: <46a793f3-ded4-26ec-351a-359ec30f646e@oracle.com>
Message-ID:

Hi Dan,

> Sorry for the late review. I know you have already pushed the fix.

Thanks for taking the time! And sorry for being a bit quick with the push. I wanted it to happen before the switch to git.

Like David, I think the VM_ThreadSuspend on the suspender side and restoring the saved thread state with a fence on the suspendee side are sufficient synchronization to avoid the issue.

2613 void JavaThread::java_suspend_self_with_safepoint_check() {
2614   assert(this == Thread::current(), "invariant");
2615   JavaThreadState state = thread_state();
2616
2617   do {
2618     set_thread_state(_thread_blocked);
2619     java_suspend_self();
2620     // The current thread could have been suspended again. We have to check for
2621     // suspend after restoring the saved state. Without this the current thread
2622     // might return to _thread_in_Java and execute bytecodes for an arbitrary
2623     // long time.
2624     set_thread_state_fence(state);
2625   } while (is_external_suspend());
2626
2627   // Since we are not using a regular thread-state transition helper here,
2628   // we must manually emit the instruction barrier after leaving a safe state.
2629   OrderAccess::cross_modify_fence();
2630   if (state != _thread_in_native) {
2631     SafepointMechanism::block_if_requested(this);
2632   }
2633 }

The issue was that (A) a java thread T1 returns from a call T2.suspend() and then observes (B) T2 is still running. With the fix either (A) or (B) does not happen.

(A) means that T1 has set _external_suspend for T2 and then executed VM_ThreadSuspend and (B) means that it did so before T2 reached the safepoint in 2631.

The fence in line 2624 (*) guarantees that the check in line 2625 will see a suspend request if the associated safepoint was established before the state change in line 2624 was executed. So if the check 2625 returns false this means that the VM_ThreadSuspend has not begun and T1 cannot yet return from T2.suspend(). So either not (A) or not (B).

Thanks, Richard.

(*) together with the safepoint sync in T1

-----Original Message-----
From: Daniel D. Daugherty
Sent: Freitag, 4. September 2020 19:16
To: Reingruber, Richard ; Hotspot dev runtime
Subject: Re: RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check

Richard,

Sorry for the late review. I know you have already pushed the fix.

src/hotspot/share/runtime/thread.cpp
    L2625:   } while (is_external_suspend());
        In the other places where we are checking for a racing suspend
        request, we check the is_external_suspend() condition while
        holding the SR_lock or we call is_external_suspend_with_lock().

        Here's the usual example I point people at:

        L2049: void JavaThread::exit(bool destroy_vm, ExitType exit_type) {

        L2121:     while (true) {
        L2122:       {
        L2123:         MutexLocker ml(SR_lock(), Mutex::_no_safepoint_check_flag);
        L2124:
if (!is_external_suspend()) {

        The JVM/TI SuspendThread() and JVM_SuspendThread() entry points
        call set_external_suspend() while holding the SR_lock so the
        only way to be sure you haven't lost the race is to hold the
        SR_lock while you're checking the flag yourself.

        Have I missed something in my analysis?

Dan


On 9/2/20 11:15 AM, Reingruber, Richard wrote:
> Hi,
>
> please help review this fix for a race condition in
> JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended
> thread to continue executing java for an arbitrary long time (see repro test
> attached to bug report).
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8252521
>
> The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check()
> that checks if the current thread was suspended again after returning from
> java_suspend_self() and before restoring the original thread state. The check is
> performed after restoring the original state because then we are guaranteed to
> see the suspend request issued before the requester observed that target to be
> _thread_blocked and executed VM_ThreadSuspend.
>
> Thanks, Richard.

From martin.doerr at sap.com Mon Sep 7 10:23:36 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 7 Sep 2020 10:23:36 +0000
Subject: Fatal errors when running JCK tests with JDK15/16 debug build
In-Reply-To:
References: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com>
Message-ID:

Hi Leonid,

the errors were observed in many more vm/jvmti/Get... tests like the following ones:
vm/jvmti/GetAllThreads/gath001/gath00101/gath00101.html
vm/jvmti/GetAvailableProcessors/gaps001/gaps00101/gaps00101.html
vm/jvmti/GetClassModifiers/gcmo001/gcmo00102/gcmo00102.html
vm/jvmti/GetClassMethods/gcmt001/gcmt00102/gcmt00102.html
vm/jvmti/GetBytecodes/gbyc001/gbyc00102/gbyc00102.html
vm/jvmti/GetCapabilities/gcap001/gcap00101/gcap00101.html
vm/jvmti/GetClassLoader/gclo001/gclo00101/gclo00101.html

We run them with fastdbg builds every night and we have seen the errors almost every day.

Best regards,
Martin


> -----Original Message-----
> From: David Holmes
> Sent: Dienstag, 1. September 2020 07:07
> To: leonid.kuskov at oracle.com; Doerr, Martin ;
> serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build
>
> Hi Leonid,
>
> On 1/09/2020 10:42 am, leonid.kuskov at oracle.com wrote:
> > Hi,
> >
> > It's a known issue that was reported by Arno Zeller
> > (arno.zeller at sap.com) in the middle of June. The test
> > jvmti/GetAllStackTraces/gast001/gast00105/gast00105.html failed with the
> > same stack trace despite the fix ( JCK-7022500 lprintf in
> > jvmti/support.c is not MT-Safe) Please file a JCK's issue with details
> > to reproduce the failure.
>
> Interesting.
The fix is supposed to make things thread-safe by using a > RawMonitor to ensure only one thread can use lprintf at a time. I missed > that in my initial analysis. But something is going wrong. > > Thanks, > David > > > Thanks, > > Leonid > > > > On 8/31/20 3:37 PM, David Holmes wrote: > > > >> On 1/09/2020 3:00 am, Doerr, Martin wrote: > >>> Hi David, > >>> > >>> thanks for analyzing it. We need to exclude the test for now. > >> > >> Can you file a JCK bug? I can file one on our internal JCK Jira but > >> I'm not sure what the right process is in this case. > >> > >> Thanks, > >> David > >> > >>> Best regards, > >>> Martin > >>> > >>> > >>>> -----Original Message----- > >>>> From: David Holmes > >>>> Sent: Montag, 31. August 2020 04:34 > >>>> To: Doerr, Martin ; serviceability- > >>>> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > >>>> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug > >>>> build > >>>> > >>>> Hi Martin, > >>>> > >>>> On 29/08/2020 3:53 am, Doerr, Martin wrote: > >>>>> Hi, > >>>>> > >>>>> we have seen the following fatal error more than 50 times since > >>>>> 2020-05-25 in various JCK tests vm/jvmti. > >>>>> > >>>>> fatal error: String conversion failure: [check] ExitLock destroyed > >>>>> > >>>>> --> ?? [check] ExitLock exited > >>>>> > >>>>> (followed by garbage output) > >>>>> > >>>>> 8166358: Re-enable String verification in > >>>>> java_lang_String::create_from_str() > >>>>> > >>>>> was pushed at that date which introduced the call to fatal. > >>>>> > >>>>> Stack (example from linuxppc64le, but also observed on x86 and > >>>>> aarch64): > >>>>> V? [libjvm.so+0xee242c] java_lang_String::create_from_str(char > const*, > >>>>> Thread*) [clone .part.158]+0x51c > >>>>> V? [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char > >>>>> const*, Thread*)+0x40 > >>>>> V? [libjvm.so+0x1026a30]? jni_NewStringUTF+0x1e0 > >>>>> C? [libjckjvmti.so+0x3ce4c]? logWrite+0x5c > >>>>> C? [libjckjvmti.so+0x3cd20]? 
lprintf+0x170 > >>>>> C? [libjckjvmti.so+0x485b8]? gast00104_agent_proc+0x254 > >>>>> V? [libjvm.so+0x1218f0c] > JvmtiAgentThread::call_start_function()+0x24c > >>>>> V? [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c > >>>>> V? [libjvm.so+0x19418a0]? Thread::call_run()+0x160 > >>>>> V? [libjvm.so+0x15c9d0c]? thread_native_entry(Thread*)+0x18c > >>>>> C? [libpthread.so.0+0x9b48]? start_thread+0x108 > >>>>> > >>>>> (Problem could have been there before but without this fatal > message.) > >>>>> > >>>>> The messages are generated by: > >>>>> > >>>>> tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c > >>>>> > >>>>> This looks like a race condition. The message changes while the VM > >>>>> creates a String object from it. Has anybody seen this before? > >>>> > >>>> No but ... > >>>> > >>>>> Is it a test problem? I'm not familiar with the lprintf calls in > >>>>> the test. > >>>> > >>>> ... the lprintf is part of the JCK support library (support.c if you > >>>> have access to sources) and it uses a static buffer for the log > >>>> messages > >>>> and so it not thread-safe. This test creates a thread and both it and > >>>> the main thread call lprintf concurrently. > >>>> > >>>> So this is a JCK test/test-library bug that appears to be exposed by > >>>> the > >>>> changes made in 8166358. > >>>> > >>>> Cheers, > >>>> David > >>>> ----- > >>>> > >>>>> Best regards, > >>>>> > >>>>> Martin > >>>>> From avoitylov at openjdk.java.net Mon Sep 7 11:29:29 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Mon, 7 Sep 2020 11:29:29 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port Message-ID: continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > The download side of using JNI in these tests is that it complicates the > setup a bit for those that run jtreg directly and/or just build the JDK > and not the test libraries. 
You could reduce this burden a bit by > limiting the load library/isMusl check to Linux only, meaning isMusl > would not be called on other platforms. > > The alternative you suggest above might indeed be better. I assume you > don't mean splitting the tests but rather just adding a second @test > description so that the vm.musl case runs the test with a system > property that allows the test know the expected load library path behavior. I have updated the PR to split the two tests in multiple @test s. > The updated comment in java_md.c in this looks good. A minor comment on > Platform.isBusybox is Files.isSymbolicLink returning true implies that > the link exists so no need to check for exists too. Also the > if-then-else style for the new class in ProcessBuilder/Basic.java is > inconsistent with the rest of the test so it stands out. Thank you, these changes are done in the updated PR. > Given the repo transition this weekend then I assume you'll create a PR > for the final review at least. Also I see JEP 386 hasn't been targeted > yet but I assume Boris, as owner, will propose-to-target and wait for it > to be targeted before it is integrated. Yes. How can this be best accomplished with the new git workflow? - we can continue the review process till the end and I will request the integration to happen only after the JEP is targeted. I guess this step is now done by typing "slash integrate" in a comment. - we can pause the review process now until the JEP is targeted. In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. 
------------- Commit messages: - JDK-8247589: Implementation of Alpine Linux/x64 Port Changes: https://git.openjdk.java.net/jdk/pull/49/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8247589 Stats: 403 lines in 30 files changed: 346 ins; 17 del; 40 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From alanb at openjdk.java.net Mon Sep 7 12:20:40 2020 From: alanb at openjdk.java.net (Alan Bateman) Date: Mon, 7 Sep 2020 12:20:40 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port In-Reply-To: References: Message-ID: On Mon, 7 Sep 2020 11:23:28 GMT, Aleksei Voitylov wrote: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. 
>
> Thank you, these changes are done in the updated PR.
>
>> Given the repo transition this weekend then I assume you'll create a PR
>> for the final review at least. Also I see JEP 386 hasn't been targeted
>> yet but I assume Boris, as owner, will propose-to-target and wait for it
>> to be targeted before it is integrated.
>
> Yes. How can this be best accomplished with the new git workflow?
> - we can continue the review process till the end and I will request the integration to happen only after the JEP is
> targeted. I guess this step is now done by typing "slash integrate" in a comment.
> - we can pause the review process now until the JEP is targeted.
>
> In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here.

Marked as reviewed by alanb (Reviewer).

This change was in review on core-libs-dev and other mailing lists before the switch to skara/git. The issues that I brought up have been addressed in the PR and I don't have any further comments.

-------------

PR: https://git.openjdk.java.net/jdk/pull/49

From david.holmes at oracle.com Mon Sep 7 13:29:17 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 7 Sep 2020 23:29:17 +1000
Subject: [16][RFR][s]:8252835:Revert fix for JDK-8246051
In-Reply-To: <743d73d5-36cc-4a1e-bc13-96bb27c8ab8e.zhuoren.wz@alibaba-inc.com>
References: <743d73d5-36cc-4a1e-bc13-96bb27c8ab8e.zhuoren.wz@alibaba-inc.com>
Message-ID: <989bb8a4-f33a-5a10-5b65-d9684e79a2e2@oracle.com>

Hi Zhuoren,

This needs to be done as a Pull Request (PR) now that we have transitioned to git/GitHub.

Thanks,
David

On 7/09/2020 9:49 pm, Wang Zhuo(Zhuoren) wrote:
> As discussed before, this patch is to revert
> JDK-8246051 (SIGBUS by unaligned Unsafe compare_and_swap). Please review.
> JDK bug: https://bugs.openjdk.java.net/browse/JDK-8252835 > Patch: http://cr.openjdk.java.net/~wzhuo/8252835/webrev.00/ > > > Regards, > Zhuoren > From richard.reingruber at sap.com Mon Sep 7 14:09:11 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Mon, 7 Sep 2020 14:09:11 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> Message-ID: Hi, I would like to close the review of this change. It has received a lot of helpful feedback during the process and 2 full Reviews. Thanks everybody! I'm planning to push it this week on Thursday as solution for JBS items: https://bugs.openjdk.java.net/browse/JDK-8227745 https://bugs.openjdk.java.net/browse/JDK-8233915 Version to be pushed: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ Hope to get my GIT/Skara setup going until then... :) Thanks, Richard. -----Original Message----- From: hotspot-compiler-dev On Behalf Of Reingruber, Richard Sent: Mittwoch, 2. September 2020 23:27 To: Robbin Ehn ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Robin, > On 2020-09-02 15:48, Reingruber, Richard wrote: > > Hi Robbin, > > > > // taking the discussion back to the mailing lists > > > > > I still don't understand why you don't deoptimize the objects inside the > > > handshake/safepoint instead? > So for handshakes using asynch handshake and allowing blocking inside > would fix that. (future fix, I'm working on that now) Just to make it clear: I'm not fond of the extra suspension mechanism currently used for JDK-8227745 either. I want to get rid of it and I will work on it. Asynch handshakes (JDK-8238761) could be a replacement for it. 
At least I think they can be used to suspend the target thread. > For safepoint, since we have suspended all threads, ~'safepointed them' > with a JavaThread, you _could_ just execute the action directly (e.g. > skipping VM_HeapWalkOperation safepoint) since they are suppose to be > safely suspended until the destructor of EB, no? Yes, this should be possible. This would be an advanced change though. I would like EscapeBarriers to be a no-op and fall back to current implementation, if C2-EscapeAnalysis/Graal are disabled. > So I suggest future work to instead just execute the safepoint with the > requesting JT instead of having a this special safepoiting mechanism. > Since you are missing above functionality I see why you went this way. > If you need to push it, it's fine by me. We will work on further improvements. Top of the list would be eliminating the extra suspend mechanism. The implementation has matured for more than 12 months now [1]. It's been tested extensively at SAP over that time and passed also extended testing at Oracle kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated extensive feedback from a number of OpenJDK Reviewers (including you, thanks!). Based on that I reckon we're good to push the change as enhancement (JDK-8227745) and bug fix (JDK-8233915). > Thanks for explaining once again :) Pleasure :) Thanks, Richard. [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html -----Original Message----- From: Robbin Ehn Sent: Mittwoch, 2. 
September 2020 16:54 To: Reingruber, Richard ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, On 2020-09-02 15:48, Reingruber, Richard wrote: > Hi Robbin, > > // taking the discussion back to the mailing lists > > > I still don't understand why you don't deoptimize the objects inside the > > handshake/safepoint instead? So for handshakes using asynch handshake and allowing blocking inside would fix that. (future fix, I'm working on that now) For safepoint, since we have suspended all threads, ~'safepointed them' with a JavaThread, you _could_ just execute the action directly (e.g. skipping VM_HeapWalkOperation safepoint) since they are suppose to be safely suspended until the destructor of EB, no? So I suggest future work to instead just execute the safepoint with the requesting JT instead of having a this special safepoiting mechanism. Since you are missing above functionality I see why you went this way. If you need to push it, it's fine by me. Thanks for explaining once again :) /Robbin > > This is unfortunately not possible. Deoptimizing objects includes reallocating > scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This > cannot be done at a safepoint or handshake. > > 1. The vm thread is not allowed to allocate on the java heap > See for instance assertions in ParallelScavengeHeap::mem_allocate() > https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKya9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$ > > This is not easy to change, I suppose, because it will be difficult to gc if > necessary. > > 2. Using a direct handshake would not work either. The problem there is again > gc. 
Let J be the JavaThread that is executing the direct handshake. The vm
> would deadlock if the vm thread waits for J to execute the closure of a
> handshake-all and J waits for the vm thread to execute a gc vm operation.
> Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594
>
> Cheers, Richard.
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Wednesday, 2 September 2020 13:56
> To: Reingruber, Richard
> Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi,
>
> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?
>
> E.g.
>
> In JvmtiEnv::GetOwnedMonitorInfo you should only need to execute the code
> from:
> eb.deoptimize_objects(MaxJavaStackTraceDepth) before looping over the
> stack, so:
>
> void
> GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
>   assert(target->is_Java_thread(), "just checking");
>   JavaThread *jt = (JavaThread *)target;
>
>   if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
> +   if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) {
>       _result =
>         ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt,
>                                                   _owned_monitors_list);
>     } else {
>       _result = JVMTI_ERROR_OUT_OF_MEMORY;
>     }
>   }
> }
>
> Why try to 'suspend' the thread first?
>
>
> When we de-optimize all threads why not just in the following safepoint?
> E.g.
> VM_HeapWalkOperation::doit() {
> + EscapeBarrier::deoptimize_objects_all_threads();
>   ...
> }
>
> Thanks, Robbin
>
>

From jiefu at openjdk.java.net Mon Sep 7 23:59:28 2020
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 7 Sep 2020 23:59:28 GMT
Subject: RFR: 8252887: Zero VM is broken after JDK-8252661
Message-ID:

Hi all,

JBS: https://bugs.openjdk.java.net/browse/JDK-8252887

Zero VM is broken because 'block_if_requested' is not a member of 'SafepointMechanism'.
The reason is that 'block_if_requested' has been replaced by 'process_if_requested' after JDK-8252661.

The fix just replaces 'block_if_requested' with 'process_if_requested'.

Thanks.
Best regards,
Jie

8252887: Zero VM is broken after JDK-8252661

-------------

Commit messages:
 - 8252887: Zero VM is broken after JDK-8252661

Changes: https://git.openjdk.java.net/jdk/pull/64/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=64&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8252887
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/64.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/64/head:pull/64

PR: https://git.openjdk.java.net/jdk/pull/64

From dholmes at openjdk.java.net Tue Sep 8 00:17:44 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 8 Sep 2020 00:17:44 GMT
Subject: RFR: 8252887: Zero VM is broken after JDK-8252661
In-Reply-To:
References:
Message-ID:

On Mon, 7 Sep 2020 23:54:06 GMT, Jie Fu wrote:

> Hi all,
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8252887
>
> Zero VM is broken due to 'block_if_requested' is not a member of 'SafepointMechanism'.
> The reason is that 'block_if_requested' has been replaced by 'process_if_requested' after JDK-8252661.
>
> The fix just replaces 'block_if_requested' with 'process_if_requested'.
>
> Thanks.
> Best regards,
> Jie
>
> 8252887: Zero VM is broken after JDK-8252661

Looks good and trivial.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/64

From iklam at openjdk.java.net Tue Sep 8 01:36:27 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 8 Sep 2020 01:36:27 GMT
Subject: RFR: 8250563 Add KVHashtable::add_if_absent
Message-ID:

Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after ResourceHashtable::put_if_absent from JDK-8244733).
- I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp - I also fixed a type in the comments in resourceHashtable.hpp Running mach5 tiers1/2. ------------- Commit messages: - 8250563 Add KVHashtable::add_if_absent Changes: https://git.openjdk.java.net/jdk/pull/66/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=66&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8250563 Stats: 33 lines in 4 files changed: 21 ins; 3 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/66.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/66/head:pull/66 PR: https://git.openjdk.java.net/jdk/pull/66 From jiefu at openjdk.java.net Tue Sep 8 02:38:41 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 8 Sep 2020 02:38:41 GMT Subject: RFR: 8252887: Zero VM is broken after JDK-8252661 In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 00:14:59 GMT, David Holmes wrote: >> Hi all, >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8252887 >> >> Zero VM is broken due to 'block_if_requested' is not a member of 'SafepointMechanism'. >> The reason is that 'block_if_requested' has been replaced by 'process_if_requested' after JDK-8252661. >> >> The fix just replaces 'block_if_requested' with 'process_if_requested'. >> >> Thanks. >> Best regards, >> Jie >> >> >> >> >> 8252887: Zero VM is broken after JDK-8252661 > > Looks good and trivial. Thanks David for your review. 
------------- PR: https://git.openjdk.java.net/jdk/pull/64 From jiefu at openjdk.java.net Tue Sep 8 02:38:42 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 8 Sep 2020 02:38:42 GMT Subject: Integrated: 8252887: Zero VM is broken after JDK-8252661 In-Reply-To: References: Message-ID: <1_VaXtqHV7hzF2Utz6VPYXSurW2_VnHWJPAaKkhpV1Y=.7f5b60de-1f08-4500-9be7-b3805855ec8f@github.com> On Mon, 7 Sep 2020 23:54:06 GMT, Jie Fu wrote: > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8252887 > > Zero VM is broken due to 'block_if_requested' is not a member of 'SafepointMechanism'. > The reason is that 'block_if_requested' has been replaced by 'process_if_requested' after JDK-8252661. > > The fix just replaces 'block_if_requested' with 'process_if_requested'. > > Thanks. > Best regards, > Jie > > > > > 8252887: Zero VM is broken after JDK-8252661 This pull request has now been integrated. Changeset: 891886b6 Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/891886b6 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8252887: Zero VM is broken after JDK-8252661 Zero VM is broken due to 'block_if_requested' is not a member of 'SafepointMechanism'. Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/64 From iklam at openjdk.java.net Tue Sep 8 07:05:30 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 07:05:30 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v2] In-Reply-To: References: Message-ID: > Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after > ResourceHashtable::put_if_absent from JDK-8244733). > - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp > - I also fixed a type in the comments in resourceHashtable.hpp > > Running mach5 tiers1/2. Ioi Lam has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  8250563: Add KVHashtable::add_if_absent

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/66/files
  - new: https://git.openjdk.java.net/jdk/pull/66/files/f41faaad..5249b6be

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=66&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=66&range=00-01

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/66.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/66/head:pull/66

PR: https://git.openjdk.java.net/jdk/pull/66

From shade at openjdk.java.net Tue Sep 8 07:08:43 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Sep 2020 07:08:43 GMT
Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 8 Sep 2020 07:05:30 GMT, Ioi Lam wrote:

>> Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after
>> ResourceHashtable::put_if_absent from JDK-8244733).
>> - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp
>> - I also fixed a type in the comments in resourceHashtable.hpp
>>
>> Running mach5 tiers1/2.
>
> Ioi Lam has refreshed the contents of this pull request, and previous commits have been removed. The incremental views
> will show differences compared to the previous content of the PR.

Otherwise looks good, modulo the single signature question in review.

src/hotspot/share/utilities/hashtable.hpp line 327:

> 325:   // pointer to the value.
> 326:   // *p_created is true if entry was created, false if entry pre-existed.
> 327:   V* add_if_absent(K const& key, V const& value, bool* p_created) {

Does it really need `const &` here? It looks inconsistent with `add(K,V)` and `lookup(K)` in the same class.
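
For readers who have not seen the pattern, the contract described in the quoted comment (return a pointer to the value, report through `*p_created` whether the entry was new) can be sketched in standalone C++. This is only an illustration under stated assumptions: `SimpleTable` is a hypothetical name, it is backed by `std::unordered_map` rather than HotSpot's `KVHashtable`, and it takes key and value by value, which is the style the review question points towards.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a key/value table. This is NOT HotSpot's
// KVHashtable; it only mirrors the add_if_absent contract from the comment.
template <typename K, typename V>
class SimpleTable {
  std::unordered_map<K, V> _map;

 public:
  // Adds (key, value) only if key is not present yet.
  // Returns a pointer to the stored value.
  // *p_created is true if the entry was created, false if it pre-existed.
  V* add_if_absent(K key, V value, bool* p_created) {
    // emplace leaves the map untouched when the key already exists
    auto result = _map.emplace(key, value);
    *p_created = result.second;
    return &result.first->second;
  }
};

// Small usage example: the second call finds the existing entry.
void add_if_absent_usage() {
  SimpleTable<std::string, int> table;
  bool created = false;

  int* first = table.add_if_absent("a", 1, &created);
  assert(created && *first == 1);

  int* second = table.add_if_absent("a", 2, &created);
  assert(!created);          // entry pre-existed
  assert(second == first);   // same stored value is returned
  assert(*second == 1);      // the old value is kept, 2 is ignored
}
```

The out-parameter lets a caller distinguish "inserted" from "found" with a single lookup, instead of calling a lookup function followed by an add function.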
------------- Changes requested by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/66 From iklam at openjdk.java.net Tue Sep 8 07:21:43 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 07:21:43 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v2] In-Reply-To: References: Message-ID: <4HO71oYmsOY3WyHFYKD8X1T87BzqiZcqiVq6qUcw050=.04fce34e-eb51-4c13-acb1-a00d0c5634a6@github.com> On Tue, 8 Sep 2020 07:05:12 GMT, Aleksey Shipilev wrote: >> Ioi Lam has refreshed the contents of this pull request, and previous commits have been removed. The incremental views >> will show differences compared to the previous content of the PR. The pull request contains one new commit since the >> last revision: >> 8250563: Add KVHashtable::add_if_absent > > src/hotspot/share/utilities/hashtable.hpp line 327: > >> 325: // pointer to the value. >> 326: // *p_created is true if entry was created, false if entry pre-existed. >> 327: V* add_if_absent(K const& key, V const& value, bool* p_created) { > > Does it really need `const &` here? It looks inconsistent with `add(K,V)` and `lookup(K)` in the same class. I'll remove the `const&`. I copied that from resourceHash.hpp but it doesn't fit in with the rest of hashtable.hpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/66 From erikj at openjdk.java.net Tue Sep 8 13:04:24 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 8 Sep 2020 13:04:24 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port In-Reply-To: References: Message-ID: On Mon, 7 Sep 2020 11:23:28 GMT, Aleksei Voitylov wrote: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The download side of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries. 
You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Build changes look ok. ------------- Marked as reviewed by erikj (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/49 From iklam at openjdk.java.net Tue Sep 8 15:27:49 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 15:27:49 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v3] In-Reply-To: References: Message-ID: > Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after > ResourceHashtable::put_if_absent from JDK-8244733). > - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp > - I also fixed a type in the comments in resourceHashtable.hpp > > Running mach5 tiers1/2. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: (a) remove const& from add_if_absent; (b) use EQUALS template param instead of == ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/66/files - new: https://git.openjdk.java.net/jdk/pull/66/files/5249b6be..ca5000a5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=66&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=66&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/66.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/66/head:pull/66 PR: https://git.openjdk.java.net/jdk/pull/66 From ccheung at openjdk.java.net Tue Sep 8 15:27:49 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 8 Sep 2020 15:27:49 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v3] In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 07:06:27 GMT, Aleksey Shipilev wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> (a) remove const& from add_if_absent; (b) use EQUALS template param instead of == > > Otherwise looks good, modulo the single signature question in review. Looks good. 
------------- PR: https://git.openjdk.java.net/jdk/pull/66 From ccheung at openjdk.java.net Tue Sep 8 15:33:31 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 8 Sep 2020 15:33:31 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v3] In-Reply-To: References: Message-ID: <_oCiBgy__fHLgVh3lj8thOMnMcdVfYqdUipGeD2aAds=.2e421f78-8bbd-4bdf-b4de-e318d8923bdd@github.com> On Tue, 8 Sep 2020 15:27:49 GMT, Ioi Lam wrote: >> Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after >> ResourceHashtable::put_if_absent from JDK-8244733). >> - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp >> - I also fixed a type in the comments in resourceHashtable.hpp >> >> Running mach5 tiers1/2. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > (a) remove const& from add_if_absent; (b) use EQUALS template param instead of == Marked as reviewed by ccheung (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/66 From coleenp at openjdk.java.net Tue Sep 8 15:48:41 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 8 Sep 2020 15:48:41 GMT Subject: RFR: 8250563: Add KVHashtable::add_if_absent [v3] In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 15:27:49 GMT, Ioi Lam wrote: >> Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after >> ResourceHashtable::put_if_absent from JDK-8244733). >> - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp >> - I also fixed a type in the comments in resourceHashtable.hpp >> >> Running mach5 tiers1/2. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > (a) remove const& from add_if_absent; (b) use EQUALS template param instead of == Marked as reviewed by coleenp (Reviewer). 
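
The v3 commit note "(b) use EQUALS template param instead of ==" can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: `TinyTable`, `default_equals` and `cstring_equals` are hypothetical names, the table uses a linear search over a `std::list`, and none of it is the actual hashtable.hpp code. It only shows why comparing keys through a template parameter matters when `==` is not the right notion of equality, for example with C-string keys.

```cpp
#include <cassert>
#include <cstring>
#include <list>
#include <utility>

// Default comparison: falls back to operator==.
template <typename K>
bool default_equals(K const& a, K const& b) { return a == b; }

// Comparison for C strings: compares contents, not pointer values.
inline bool cstring_equals(const char* const& a, const char* const& b) {
  return std::strcmp(a, b) == 0;
}

// Hypothetical tiny table (linear search), parameterized by EQUALS.
template <typename K, typename V,
          bool (*EQUALS)(K const&, K const&) = default_equals<K> >
class TinyTable {
  std::list<std::pair<K, V> > _entries;  // list keeps element addresses stable

 public:
  V* lookup(K key) {
    for (auto& e : _entries) {
      if (EQUALS(e.first, key)) return &e.second;  // not e.first == key
    }
    return nullptr;
  }

  V* add_if_absent(K key, V value, bool* p_created) {
    V* existing = lookup(key);
    *p_created = (existing == nullptr);
    if (existing != nullptr) return existing;
    _entries.emplace_back(key, value);
    return &_entries.back().second;
  }
};

// Two buffers with equal contents but different addresses.
void equals_param_usage() {
  char a[] = "key";
  char b[] = "key";
  bool created = false;

  TinyTable<const char*, int, cstring_equals> by_content;
  by_content.add_if_absent(a, 1, &created);
  by_content.add_if_absent(b, 2, &created);
  assert(!created);  // same contents, so the entry already existed

  TinyTable<const char*, int> by_pointer;  // default: pointer comparison
  by_pointer.add_if_absent(a, 1, &created);
  by_pointer.add_if_absent(b, 2, &created);
  assert(created);   // different addresses, so a second entry is added
}
```

With a hard-coded `==`, the second table's behaviour would be the only one available, regardless of what the table's user asked for.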
------------- PR: https://git.openjdk.java.net/jdk/pull/66 From daniel.daugherty at oracle.com Tue Sep 8 16:15:36 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 8 Sep 2020 12:15:36 -0400 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> Message-ID: <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> Hi Richard, I haven't seen a review from anyone on the Serviceability team and I think you should get a review from them since JVM/TI is involved. Perhaps I missed it... Dan On 9/7/20 10:09 AM, Reingruber, Richard wrote: > Hi, > > I would like to close the review of this change. > > It has received a lot of helpful feedback during the process and 2 full > Reviews. Thanks everybody! > > I'm planning to push it this week on Thursday as solution for JBS items: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > Version to be pushed: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Hope to get my GIT/Skara setup going until then... :) > > Thanks, Richard. > > -----Original Message----- > From: hotspot-compiler-dev On Behalf Of Reingruber, Richard > Sent: Mittwoch, 2. September 2020 23:27 > To: Robbin Ehn ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi Robin, > >> On 2020-09-02 15:48, Reingruber, Richard wrote: >>> Hi Robbin, >>> >>> // taking the discussion back to the mailing lists >>> >>> > I still don't understand why you don't deoptimize the objects inside the >>> > handshake/safepoint instead? >> So for handshakes using asynch handshake and allowing blocking inside >> would fix that. 
(future fix, I'm working on that now) > Just to make it clear: I'm not fond of the extra suspension mechanism currently > used for JDK-8227745 either. I want to get rid of it and I will work on it. Asynch > handshakes (JDK-8238761) could be a replacement for it. At least I think they > can be used to suspend the target thread. > >> For safepoint, since we have suspended all threads, ~'safepointed them' >> with a JavaThread, you _could_ just execute the action directly (e.g. >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be >> safely suspended until the destructor of EB, no? > Yes, this should be possible. This would be an advanced change though. I would > like EscapeBarriers to be a no-op and fall back to current implementation, if > C2-EscapeAnalysis/Graal are disabled. > >> So I suggest future work to instead just execute the safepoint with the >> requesting JT instead of having a this special safepoiting mechanism. >> Since you are missing above functionality I see why you went this way. >> If you need to push it, it's fine by me. > We will work on further improvements. Top of the list would > be eliminating the extra suspend mechanism. > > The implementation has matured for more than 12 months now [1]. It's been tested > extensively at SAP over that time and passed also extended testing at Oracle > kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated > extensive feedback from a number of OpenJDK Reviewers (including you, > thanks!). Based on that I reckon we're good to push the change as enhancement > (JDK-8227745) and bug fix (JDK-8233915). > >> Thanks for explaining once again :) > Pleasure :) > > Thanks, Richard. > > [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html > > -----Original Message----- > From: Robbin Ehn > Sent: Mittwoch, 2. 
September 2020 16:54
> To: Reingruber, Richard; serviceability-dev; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi Richard,
>
> On 2020-09-02 15:48, Reingruber, Richard wrote:
>> Hi Robbin,
>>
>> // taking the discussion back to the mailing lists
>>
>> > I still don't understand why you don't deoptimize the objects inside the
>> > handshake/safepoint instead?
> So for handshakes, using asynch handshake and allowing blocking inside
> would fix that. (future fix, I'm working on that now)
>
> For safepoint, since we have suspended all threads, ~'safepointed them'
> with a JavaThread, you _could_ just execute the action directly (e.g.
> skipping VM_HeapWalkOperation safepoint) since they are supposed to be
> safely suspended until the destructor of EB, no?
>
> So I suggest future work to instead just execute the safepoint with the
> requesting JT instead of having this special safepointing mechanism.
>
> Since you are missing above functionality I see why you went this way.
> If you need to push it, it's fine by me.
>
> Thanks for explaining once again :)
>
> /Robbin
>
>> This is unfortunately not possible. Deoptimizing objects includes reallocating
>> scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
>> cannot be done at a safepoint or handshake.
>>
>> 1. The vm thread is not allowed to allocate on the java heap
>> See for instance assertions in ParallelScavengeHeap::mem_allocate()
>> https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258
>>
>> This is not easy to change, I suppose, because it will be difficult to gc if
>> necessary.
>>
>> 2. Using a direct handshake would not work either.
The problem there is again >> gc. Let J be the JavaThread that is executing the direct handshake. The vm >> would deadlock if the vm thread waits for J to execute the closure of a >> handshake-all and J waits for the vm thread to execute a gc vm operation. >> Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594 >> >> Cheers, Richard. >> >> -----Original Message----- >> From: Robbin Ehn >> Sent: Mittwoch, 2. September 2020 13:56 >> To: Reingruber, Richard >> Cc: Lindenmaier, Goetz ; Vladimir Kozlov ; David Holmes >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents >> >> Hi, >> >> I still don't understand why you don't deoptimize the objects inside the >> handshake/safepoint instead? >> >> E.g. >> >> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code >> from: >> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the >> stack, so: >> >> void >> GetOwnedMonitorInfoClosure::do_thread(Thread *target) { >> assert(target->is_Java_thread(), "just checking"); >> JavaThread *jt = (JavaThread *)target; >> >> if (!jt->is_exiting() && (jt->threadObj() != NULL)) { >> + if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) { >> _result = >> ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, >> _owned_monitors_list); >> } else { >> _result = JVMTI_ERROR_OUT_OF_MEMORY; >> } >> } >> } >> >> Why try 'suspend' the thread first? >> >> >> When we de-optimize all threads why not just in the following safepoint? >> E.g. >> VM_HeapWalkOperation::doit() { >> + EscapeBarrier::deoptimize_objects_all_threads(); >> ... 
>> } >> >> Thanks, Robbin >> >> From richard.reingruber at sap.com Tue Sep 8 16:45:15 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 8 Sep 2020 16:45:15 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> Message-ID: Hi Dan, I'd be very happy about a review from somebody on the Serviceability team. I have asked for reviews many times (kindly I hope). And the change is for review for more than a year now. According to [1] I'd think all requirements to push are met already. But maybe I missed something? After renaming of methods in SafepointMechanism the change needs to be rebased (already done). I'll publish a pull request as soon as possible. Thanks, Richard. [1] https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change -----Original Message----- From: Daniel D. Daugherty Sent: Dienstag, 8. September 2020 18:16 To: Reingruber, Richard ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, I haven't seen a review from anyone on the Serviceability team and I think you should get a review from them since JVM/TI is involved. Perhaps I missed it... Dan On 9/7/20 10:09 AM, Reingruber, Richard wrote: > Hi, > > I would like to close the review of this change. > > It has received a lot of helpful feedback during the process and 2 full > Reviews. Thanks everybody! 
> > I'm planning to push it this week on Thursday as solution for JBS items: > > https://bugs.openjdk.java.net/browse/JDK-8227745 > https://bugs.openjdk.java.net/browse/JDK-8233915 > > Version to be pushed: > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > Hope to get my GIT/Skara setup going until then... :) > > Thanks, Richard. > > -----Original Message----- > From: hotspot-compiler-dev On Behalf Of Reingruber, Richard > Sent: Mittwoch, 2. September 2020 23:27 > To: Robbin Ehn ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi Robin, > >> On 2020-09-02 15:48, Reingruber, Richard wrote: >>> Hi Robbin, >>> >>> // taking the discussion back to the mailing lists >>> >>> > I still don't understand why you don't deoptimize the objects inside the >>> > handshake/safepoint instead? >> So for handshakes using asynch handshake and allowing blocking inside >> would fix that. (future fix, I'm working on that now) > Just to make it clear: I'm not fond of the extra suspension mechanism currently > used for JDK-8227745 either. I want to get rid of it and I will work on it. Asynch > handshakes (JDK-8238761) could be a replacement for it. At least I think they > can be used to suspend the target thread. > >> For safepoint, since we have suspended all threads, ~'safepointed them' >> with a JavaThread, you _could_ just execute the action directly (e.g. >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be >> safely suspended until the destructor of EB, no? > Yes, this should be possible. This would be an advanced change though. I would > like EscapeBarriers to be a no-op and fall back to current implementation, if > C2-EscapeAnalysis/Graal are disabled. 
> >> So I suggest future work to instead just execute the safepoint with the >> requesting JT instead of having a this special safepoiting mechanism. >> Since you are missing above functionality I see why you went this way. >> If you need to push it, it's fine by me. > We will work on further improvements. Top of the list would > be eliminating the extra suspend mechanism. > > The implementation has matured for more than 12 months now [1]. It's been tested > extensively at SAP over that time and passed also extended testing at Oracle > kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated > extensive feedback from a number of OpenJDK Reviewers (including you, > thanks!). Based on that I reckon we're good to push the change as enhancement > (JDK-8227745) and bug fix (JDK-8233915). > >> Thanks for explaining once again :) > Pleasure :) > > Thanks, Richard. > > [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html > > -----Original Message----- > From: Robbin Ehn > Sent: Mittwoch, 2. September 2020 16:54 > To: Reingruber, Richard ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi Richard, > > On 2020-09-02 15:48, Reingruber, Richard wrote: >> Hi Robbin, >> >> // taking the discussion back to the mailing lists >> >> > I still don't understand why you don't deoptimize the objects inside the >> > handshake/safepoint instead? > So for handshakes using asynch handshake and allowing blocking inside > would fix that. (future fix, I'm working on that now) > > For safepoint, since we have suspended all threads, ~'safepointed them' > with a JavaThread, you _could_ just execute the action directly (e.g. > skipping VM_HeapWalkOperation safepoint) since they are suppose to be > safely suspended until the destructor of EB, no? 
>
> So I suggest future work to instead just execute the safepoint with the
> requesting JT instead of having this special safepointing mechanism.
>
> Since you are missing above functionality I see why you went this way.
> If you need to push it, it's fine by me.
>
> Thanks for explaining once again :)
>
> /Robbin
>
>> This is unfortunately not possible. Deoptimizing objects includes reallocating
>> scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
>> cannot be done at a safepoint or handshake.
>>
>> 1. The vm thread is not allowed to allocate on the java heap
>> See for instance assertions in ParallelScavengeHeap::mem_allocate()
>> https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258
>>
>> This is not easy to change, I suppose, because it will be difficult to gc if
>> necessary.
>>
>> 2. Using a direct handshake would not work either. The problem there is again
>> gc. Let J be the JavaThread that is executing the direct handshake. The vm
>> would deadlock if the vm thread waits for J to execute the closure of a
>> handshake-all and J waits for the vm thread to execute a gc vm operation.
>> Patricio Chilano made me aware of this: https://bugs.openjdk.java.net/browse/JDK-8230594
>>
>> Cheers, Richard.
>>
>> -----Original Message-----
>> From: Robbin Ehn
>> Sent: Wednesday, 2 September 2020 13:56
>> To: Reingruber, Richard
>> Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi,
>>
>> I still don't understand why you don't deoptimize the objects inside the
>> handshake/safepoint instead?
>>
>> E.g.
>> >> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code >> from: >> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the >> stack, so: >> >> void >> GetOwnedMonitorInfoClosure::do_thread(Thread *target) { >> assert(target->is_Java_thread(), "just checking"); >> JavaThread *jt = (JavaThread *)target; >> >> if (!jt->is_exiting() && (jt->threadObj() != NULL)) { >> + if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) { >> _result = >> ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, >> _owned_monitors_list); >> } else { >> _result = JVMTI_ERROR_OUT_OF_MEMORY; >> } >> } >> } >> >> Why try 'suspend' the thread first? >> >> >> When we de-optimize all threads why not just in the following safepoint? >> E.g. >> VM_HeapWalkOperation::doit() { >> + EscapeBarrier::deoptimize_objects_all_threads(); >> ... >> } >> >> Thanks, Robbin >> >> From martin.thompson at oracle.com Tue Sep 8 16:54:37 2020 From: martin.thompson at oracle.com (Marty Thompson) Date: Tue, 8 Sep 2020 09:54:37 -0700 (PDT) Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> Message-ID: Hello Richard, It would be good if Serguei Spitsyn could review before this is pushed. Serguei is out this week. Can you wait until Serguei is back in the office the week of Sept 14? Regards, Marty > -----Original Message----- > From: Reingruber, Richard > Sent: Tuesday, September 8, 2020 9:45 AM > To: Daniel Daugherty ; serviceability-dev > ; hotspot-compiler- > dev at openjdk.java.net; Hotspot dev runtime dev at openjdk.java.net> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Dan, > > I'd be very happy about a review from somebody on the Serviceability team. 
> I have asked for reviews many times (kindly I hope). And the change is for > review for more than a year now. > > According to [1] I'd think all requirements to push are met already. But > maybe I missed something? > > After renaming of methods in SafepointMechanism the change needs to be > rebased (already done). I'll publish a pull request as soon as possible. > > Thanks, Richard. > > [1] > https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change > > -----Original Message----- > From: Daniel D. Daugherty > Sent: Dienstag, 8. September 2020 18:16 > To: Reingruber, Richard ; serviceability-dev > ; hotspot-compiler- > dev at openjdk.java.net; Hotspot dev runtime dev at openjdk.java.net> > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Richard, > > I haven't seen a review from anyone on the Serviceability team and I think > you should get a review from them since JVM/TI is involved. > Perhaps I missed it... > > Dan > > > On 9/7/20 10:09 AM, Reingruber, Richard wrote: > > Hi, > > > > I would like to close the review of this change. > > > > It has received a lot of helpful feedback during the process and 2 > > full Reviews. Thanks everybody! > > > > I'm planning to push it this week on Thursday as solution for JBS items: > > > > https://bugs.openjdk.java.net/browse/JDK-8227745 > > https://bugs.openjdk.java.net/browse/JDK-8233915 > > > > Version to be pushed: > > > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > > > Hope to get my GIT/Skara setup going until then... :) > > > > Thanks, Richard. > > > > -----Original Message----- > > From: hotspot-compiler-dev > > On Behalf Of Reingruber, > > Richard > > Sent: Mittwoch, 2. 
September 2020 23:27 > > To: Robbin Ehn ; serviceability-dev > > ; > > hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > > > > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for > > Better Performance in the Presence of JVMTI Agents > > > > Hi Robin, > > > >> On 2020-09-02 15:48, Reingruber, Richard wrote: > >>> Hi Robbin, > >>> > >>> // taking the discussion back to the mailing lists > >>> > >>> > I still don't understand why you don't deoptimize the objects inside > the > >>> > handshake/safepoint instead? > >> So for handshakes using asynch handshake and allowing blocking inside > >> would fix that. (future fix, I'm working on that now) > > Just to make it clear: I'm not fond of the extra suspension mechanism > > currently used for JDK-8227745 either. I want to get rid of it and I > > will work on it. Asynch handshakes (JDK-8238761) could be a > > replacement for it. At least I think they can be used to suspend the target > thread. > > > >> For safepoint, since we have suspended all threads, ~'safepointed them' > >> with a JavaThread, you _could_ just execute the action directly (e.g. > >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be > >> safely suspended until the destructor of EB, no? > > Yes, this should be possible. This would be an advanced change though. > > I would like EscapeBarriers to be a no-op and fall back to current > > implementation, if C2-EscapeAnalysis/Graal are disabled. > > > >> So I suggest future work to instead just execute the safepoint with > >> the requesting JT instead of having a this special safepoiting mechanism. > >> Since you are missing above functionality I see why you went this way. > >> If you need to push it, it's fine by me. > > We will work on further improvements. Top of the list would be > > eliminating the extra suspend mechanism. > > > > The implementation has matured for more than 12 months now [1]. 
It's > > been tested extensively at SAP over that time and passed also extended > > testing at Oracle kindly conducted by Vladimir Kozlov. We've got two > > full Reviews and incorporated extensive feedback from a number of > > OpenJDK Reviewers (including you, thanks!). Based on that I reckon > > we're good to push the change as enhancement > > (JDK-8227745) and bug fix (JDK-8233915). > > > >> Thanks for explaining once again :) > > Pleasure :) > > > > Thanks, Richard. > > > > [1] > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/02 > > 8729.html > > > > -----Original Message----- > > From: Robbin Ehn > > Sent: Mittwoch, 2. September 2020 16:54 > > To: Reingruber, Richard ; > > serviceability-dev ; > > hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > > > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > Performance in the Presence of JVMTI Agents > > > > Hi Richard, > > > > On 2020-09-02 15:48, Reingruber, Richard wrote: > >> Hi Robbin, > >> > >> // taking the discussion back to the mailing lists > >> > >> > I still don't understand why you don't deoptimize the objects inside > the > >> > handshake/safepoint instead? > > So for handshakes using asynch handshake and allowing blocking inside > > would fix that. (future fix, I'm working on that now) > > > > For safepoint, since we have suspended all threads, ~'safepointed them' > > with a JavaThread, you _could_ just execute the action directly (e.g. > > skipping VM_HeapWalkOperation safepoint) since they are suppose to be > > safely suspended until the destructor of EB, no? > > > > So I suggest future work to instead just execute the safepoint with > > the requesting JT instead of having a this special safepoiting mechanism. > > > > Since you are missing above functionality I see why you went this way. > > If you need to push it, it's fine by me. > > > > Thanks for explaining once again :) > > > > /Robbin > > > >> This is unfortunately not possible. 
Deoptimizing objects includes > >> reallocating scalar replaced objects, i.e. calling > >> Deoptimization::realloc_objects(). This cannot be done at a safepoint or > handshake. > >> > >> 1. The vm thread is not allowed to allocate on the java heap > >> See for instance assertions in ParallelScavengeHeap::mem_allocate() > >> > >> > https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e > >> > 045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/par > >> > allelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKy > a > >> 9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$ > >> > >> This is not easy to change, I suppose, because it will be difficult to gc if > >> necessary. > >> > >> 2. Using a direct handshake would not work either. The problem there is > again > >> gc. Let J be the JavaThread that is executing the direct handshake. The > vm > >> would deadlock if the vm thread waits for J to execute the closure of a > >> handshake-all and J waits for the vm thread to execute a gc vm > operation. > >> Patricio Chilano made me aware of this: > >> https://bugs.openjdk.java.net/browse/JDK-8230594 > >> > >> Cheers, Richard. > >> > >> -----Original Message----- > >> From: Robbin Ehn > >> Sent: Mittwoch, 2. September 2020 13:56 > >> To: Reingruber, Richard > >> Cc: Lindenmaier, Goetz ; Vladimir Kozlov > >> ; David Holmes > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >> Performance in the Presence of JVMTI Agents > >> > >> Hi, > >> > >> I still don't understand why you don't deoptimize the objects inside > >> the handshake/safepoint instead? > >> > >> E.g. 
> >> > >> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the > >> code > >> from: > >> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over > >> the stack, so: > >> > >> void > >> GetOwnedMonitorInfoClosure::do_thread(Thread *target) { > >> assert(target->is_Java_thread(), "just checking"); > >> JavaThread *jt = (JavaThread *)target; > >> > >> if (!jt->is_exiting() && (jt->threadObj() != NULL)) { > >> + if (EscapeBarrier::deoptimize_objects(jt, > >> + MaxJavaStackTraceDepth)) { > >> _result = > >> ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, > >> _owned_monitors_list); > >> } else { > >> _result = JVMTI_ERROR_OUT_OF_MEMORY; > >> } > >> } > >> } > >> > >> Why try 'suspend' the thread first? > >> > >> > >> When we de-optimize all threads why not just in the following safepoint? > >> E.g. > >> VM_HeapWalkOperation::doit() { > >> + EscapeBarrier::deoptimize_objects_all_threads(); > >> ... > >> } > >> > >> Thanks, Robbin > >> > >> > From richard.reingruber at sap.com Tue Sep 8 17:02:29 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 8 Sep 2020 17:02:29 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> Message-ID: Hello Marty, Sure. I'd be happy if Serguei could review the change. Thanks, Richard. -----Original Message----- From: Marty Thompson Sent: Dienstag, 8. September 2020 18:55 To: Reingruber, Richard ; Daniel Daugherty ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hello Richard, It would be good if Serguei Spitsyn could review before this is pushed. Serguei is out this week. 
Can you wait until Serguei is back in the office the week of Sept 14? Regards, Marty > -----Original Message----- > From: Reingruber, Richard > Sent: Tuesday, September 8, 2020 9:45 AM > To: Daniel Daugherty ; serviceability-dev > ; hotspot-compiler- > dev at openjdk.java.net; Hotspot dev runtime dev at openjdk.java.net> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Dan, > > I'd be very happy about a review from somebody on the Serviceability team. > I have asked for reviews many times (kindly I hope). And the change is for > review for more than a year now. > > According to [1] I'd think all requirements to push are met already. But > maybe I missed something? > > After renaming of methods in SafepointMechanism the change needs to be > rebased (already done). I'll publish a pull request as soon as possible. > > Thanks, Richard. > > [1] > https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change > > -----Original Message----- > From: Daniel D. Daugherty > Sent: Dienstag, 8. September 2020 18:16 > To: Reingruber, Richard ; serviceability-dev > ; hotspot-compiler- > dev at openjdk.java.net; Hotspot dev runtime dev at openjdk.java.net> > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > Hi Richard, > > I haven't seen a review from anyone on the Serviceability team and I think > you should get a review from them since JVM/TI is involved. > Perhaps I missed it... > > Dan > > > On 9/7/20 10:09 AM, Reingruber, Richard wrote: > > Hi, > > > > I would like to close the review of this change. > > > > It has received a lot of helpful feedback during the process and 2 > > full Reviews. Thanks everybody! 
> > > > I'm planning to push it this week on Thursday as solution for JBS items: > > > > https://bugs.openjdk.java.net/browse/JDK-8227745 > > https://bugs.openjdk.java.net/browse/JDK-8233915 > > > > Version to be pushed: > > > > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ > > > > Hope to get my GIT/Skara setup going until then... :) > > > > Thanks, Richard. > > > > -----Original Message----- > > From: hotspot-compiler-dev > > On Behalf Of Reingruber, > > Richard > > Sent: Mittwoch, 2. September 2020 23:27 > > To: Robbin Ehn ; serviceability-dev > > ; > > hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > > > > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for > > Better Performance in the Presence of JVMTI Agents > > > > Hi Robin, > > > >> On 2020-09-02 15:48, Reingruber, Richard wrote: > >>> Hi Robbin, > >>> > >>> // taking the discussion back to the mailing lists > >>> > >>> > I still don't understand why you don't deoptimize the objects inside > the > >>> > handshake/safepoint instead? > >> So for handshakes using asynch handshake and allowing blocking inside > >> would fix that. (future fix, I'm working on that now) > > Just to make it clear: I'm not fond of the extra suspension mechanism > > currently used for JDK-8227745 either. I want to get rid of it and I > > will work on it. Asynch handshakes (JDK-8238761) could be a > > replacement for it. At least I think they can be used to suspend the target > thread. > > > >> For safepoint, since we have suspended all threads, ~'safepointed them' > >> with a JavaThread, you _could_ just execute the action directly (e.g. > >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be > >> safely suspended until the destructor of EB, no? > > Yes, this should be possible. This would be an advanced change though. > > I would like EscapeBarriers to be a no-op and fall back to current > > implementation, if C2-EscapeAnalysis/Graal are disabled. 
> > > >> So I suggest future work to instead just execute the safepoint with > >> the requesting JT instead of having a this special safepoiting mechanism. > >> Since you are missing above functionality I see why you went this way. > >> If you need to push it, it's fine by me. > > We will work on further improvements. Top of the list would be > > eliminating the extra suspend mechanism. > > > > The implementation has matured for more than 12 months now [1]. It's > > been tested extensively at SAP over that time and passed also extended > > testing at Oracle kindly conducted by Vladimir Kozlov. We've got two > > full Reviews and incorporated extensive feedback from a number of > > OpenJDK Reviewers (including you, thanks!). Based on that I reckon > > we're good to push the change as enhancement > > (JDK-8227745) and bug fix (JDK-8233915). > > > >> Thanks for explaining once again :) > > Pleasure :) > > > > Thanks, Richard. > > > > [1] > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/02 > > 8729.html > > > > -----Original Message----- > > From: Robbin Ehn > > Sent: Mittwoch, 2. September 2020 16:54 > > To: Reingruber, Richard ; > > serviceability-dev ; > > hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > > > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > > Performance in the Presence of JVMTI Agents > > > > Hi Richard, > > > > On 2020-09-02 15:48, Reingruber, Richard wrote: > >> Hi Robbin, > >> > >> // taking the discussion back to the mailing lists > >> > >> > I still don't understand why you don't deoptimize the objects inside > the > >> > handshake/safepoint instead? > > So for handshakes using asynch handshake and allowing blocking inside > > would fix that. (future fix, I'm working on that now) > > > > For safepoint, since we have suspended all threads, ~'safepointed them' > > with a JavaThread, you _could_ just execute the action directly (e.g. 
> > skipping VM_HeapWalkOperation safepoint) since they are suppose to be > > safely suspended until the destructor of EB, no? > > > > So I suggest future work to instead just execute the safepoint with > > the requesting JT instead of having a this special safepoiting mechanism. > > > > Since you are missing above functionality I see why you went this way. > > If you need to push it, it's fine by me. > > > > Thanks for explaining once again :) > > > > /Robbin > > > >> This is unfortunately not possible. Deoptimizing objects includes > >> reallocating scalar replaced objects, i.e. calling > >> Deoptimization::realloc_objects(). This cannot be done at a safepoint or > handshake. > >> > >> 1. The vm thread is not allowed to allocate on the java heap > >> See for instance assertions in ParallelScavengeHeap::mem_allocate() > >> > >> > https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e > >> > 045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/par > >> > allelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKy > a > >> 9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$ > >> > >> This is not easy to change, I suppose, because it will be difficult to gc if > >> necessary. > >> > >> 2. Using a direct handshake would not work either. The problem there is > again > >> gc. Let J be the JavaThread that is executing the direct handshake. The > vm > >> would deadlock if the vm thread waits for J to execute the closure of a > >> handshake-all and J waits for the vm thread to execute a gc vm > operation. > >> Patricio Chilano made me aware of this: > >> https://bugs.openjdk.java.net/browse/JDK-8230594 > >> > >> Cheers, Richard. > >> > >> -----Original Message----- > >> From: Robbin Ehn > >> Sent: Mittwoch, 2. 
September 2020 13:56 > >> To: Reingruber, Richard > >> Cc: Lindenmaier, Goetz ; Vladimir Kozlov > >> ; David Holmes > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >> Performance in the Presence of JVMTI Agents > >> > >> Hi, > >> > >> I still don't understand why you don't deoptimize the objects inside > >> the handshake/safepoint instead? > >> > >> E.g. > >> > >> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the > >> code > >> from: > >> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over > >> the stack, so: > >> > >> void > >> GetOwnedMonitorInfoClosure::do_thread(Thread *target) { > >> assert(target->is_Java_thread(), "just checking"); > >> JavaThread *jt = (JavaThread *)target; > >> > >> if (!jt->is_exiting() && (jt->threadObj() != NULL)) { > >> + if (EscapeBarrier::deoptimize_objects(jt, > >> + MaxJavaStackTraceDepth)) { > >> _result = > >> ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, > >> _owned_monitors_list); > >> } else { > >> _result = JVMTI_ERROR_OUT_OF_MEMORY; > >> } > >> } > >> } > >> > >> Why try 'suspend' the thread first? > >> > >> > >> When we de-optimize all threads why not just in the following safepoint? > >> E.g. > >> VM_HeapWalkOperation::doit() { > >> + EscapeBarrier::deoptimize_objects_all_threads(); > >> ... > >> } > >> > >> Thanks, Robbin > >> > >> > From iklam at openjdk.java.net Tue Sep 8 18:30:25 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 18:30:25 GMT Subject: Integrated: 8250563: Add KVHashtable::add_if_absent In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 01:31:21 GMT, Ioi Lam wrote: > Please review this XS change. I added a new **KVHashtable::add_if_absent** function (modeled after > ResourceHashtable::put_if_absent from JDK-8244733). > - I used "add" instead of "put" to be consistent with the naming convention in utility/hashtable.hpp > - I also fixed a typo in the comments in resourceHashtable.hpp > > Running mach5 tiers1/2.
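For readers unfamiliar with the "add if absent" idiom described in the request above, here is a minimal sketch of the semantics. It is an illustration only: `SimpleTable` is a hypothetical stand-in built on `std::unordered_map`, not HotSpot's `KVHashtable`, and the signature shown is an assumption modeled on the description, not the code from the changeset.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a key/value table; NOT HotSpot's KVHashtable.
template <typename K, typename V>
class SimpleTable {
 public:
  // Adds (key, value) only if key is not yet present.
  // Returns a pointer to the value stored for key (new or pre-existing);
  // *created reports whether a new entry was added by this call.
  V* add_if_absent(const K& key, const V& value, bool* created) {
    auto result = _map.emplace(key, value);  // leaves an existing entry untouched
    *created = result.second;
    return &result.first->second;
  }

  std::size_t size() const { return _map.size(); }

 private:
  std::unordered_map<K, V> _map;
};
```

The point of the idiom is that the caller does a single lookup instead of a separate "lookup, then add" pair, and learns from `created` whether it owns a freshly added entry.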
This pull request has now been integrated. Changeset: 001e51d9 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/001e51d9 Stats: 34 lines in 4 files changed: 3 ins; 21 del; 10 mod 8250563: Add KVHashtable::add_if_absent Reviewed-by: ccheung, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/66 From ccheung at openjdk.java.net Tue Sep 8 18:49:15 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 8 Sep 2020 18:49:15 GMT Subject: RFR: 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles Message-ID: A simple fix to remove two unused import statements and a line of unneeded code from the tests in appcds/dynamicArchive/methodHandles. Tested locally on linux-x64. Running mach5 tiers 1 and 2 tests. ------------- Commit messages: - 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles Changes: https://git.openjdk.java.net/jdk/pull/83/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=83&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8249625 Stats: 35 lines in 7 files changed: 0 ins; 35 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/83.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/83/head:pull/83 PR: https://git.openjdk.java.net/jdk/pull/83 From iklam at openjdk.java.net Tue Sep 8 18:59:59 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 18:59:59 GMT Subject: RFR: 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 18:40:16 GMT, Calvin Cheung wrote: > A simple fix to remove two unused import statements and a line of unneeded code from the tests in > appcds/dynamicArchive/methodHandles. > Tested locally on linux-x64. > Running mach5 tiers 1 and 2 tests. Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/83 From iklam at openjdk.java.net Tue Sep 8 19:00:00 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 8 Sep 2020 19:00:00 GMT Subject: RFR: 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 18:56:51 GMT, Ioi Lam wrote: >> A simple fix to remove two unused import statements and a line of unneeded code from the tests in >> appcds/dynamicArchive/methodHandles. >> Tested locally on linux-x64. >> Running mach5 tiers 1 and 2 tests. > > Marked as reviewed by iklam (Reviewer). Looks good. This is a trivial change. ------------- PR: https://git.openjdk.java.net/jdk/pull/83 From ccheung at openjdk.java.net Tue Sep 8 19:03:32 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 8 Sep 2020 19:03:32 GMT Subject: Integrated: 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 18:40:16 GMT, Calvin Cheung wrote: > A simple fix to remove two unused import statements and a line of unneeded code from the tests in > appcds/dynamicArchive/methodHandles. > Tested locally on linux-x64. > Running mach5 tiers 1 and 2 tests. This pull request has now been integrated. 
Changeset: e20004d7 Author: Calvin Cheung URL: https://git.openjdk.java.net/jdk/commit/e20004d7 Stats: 35 lines in 7 files changed: 0 ins; 35 del; 0 mod 8249625: cleanup unused SkippedException in the tests under cds/appcds/dynamicArchive/methodHandles Reviewed-by: iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/83 From erikj at openjdk.java.net Tue Sep 8 19:27:08 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 8 Sep 2020 19:27:08 GMT Subject: RFR: 8244778: Archive full module graph in CDS In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Tue, 8 Sep 2020 15:59:33 GMT, Ioi Lam wrote: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/80 From coleen.phillimore at oracle.com Tue Sep 8 21:41:19 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 8 Sep 2020 17:41:19 -0400 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: Message-ID: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com> On 9/5/20 4:47 AM, Thomas Stüfe wrote: > Hi all, > > This is Round Two of the review for JEP 387 "Elastic Metaspace".
> Please find the first round of reviews here: > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > > I worked in all feedback - plus some smaller changes from me - > and here is the new version: > > Full Webrev: > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/ > Patch applies cleanly atop of "8247352: improve error messages for > sealed classes and records" > > Parts: > Core: > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-core/webrev/ > Test: > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-test/webrev/ > Misc: > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-misc/webrev/ > > Delta webrev (somewhat useless due to many file renamings): > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/delta.webrev/webrev/ > > List of fine granular commits since last webrev: > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/changes_since_last_review.txt > > Review guide (updated; added FAQ): > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/guide/review-guide.pdf > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/guide/review-guide.html > > JEP: https://openjdk.java.net/jeps/387 > > ------ > > What changed? > > 1) Aesthetics > > Some of you requested style changes so there are a lot: > > - Most files in memory/metaspace/ now follow a common theme with a > common prefix ("ms"). The only exception is > metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow > up (see JDK-8251342). I really don't like the ms prefix on all the files at all. I didn't think there were any conflicting names in the metaspace directory but if there were, they could be renamed. Now the class names don't match the file names!
The other thing about sort of generic names in the metaspace directory, even if they don't match other names in the system, like binList.hpp or blockTree.hpp, is that they can provide a hint to people before they write similar code. These could be a basis for generalization into the utilities directory if possible. http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html For example in this case, the class name is MetaspaceContext which is a much better name than MSContext. Sorry I didn't object to this sooner on the thread. More scattered comments: http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/gc/shared/genCollectedHeap.cpp.udiff.html Why is MetaspaceSizesSnapshot included? http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msAllocationGuard.hpp.html This file defines a class called Prefix. Maybe this could be Metaprefix to follow the (somewhat overused) Metaword naming convention. http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msArena.cpp.html The renaming of SpaceManager to MetaspaceArena is one of my favorite things about this change. http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msChunklevel.hpp.html I find this namespace nesting and naming inconsistent. Can you just make this ChunkLevel an AllStatic class? And if it's too generic, it could be MetachunkLevel instead. http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msPrintMetaspaceInfoKlassClosure.hpp.html http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msPrintCLDMetaspaceInfoClosure.hpp.html Can you just include this code into the files that use it?
The closure seems pretty specific to the caller (4 fewer files). Besides the naming, I don't see anything major anymore that would prevent me from hitting approved in 'git' when you send your pull request. thanks, Coleen > - Files in gtest/metaspace have also been partly renamed to be more > consistent > - Did rename enums, members and structures to adhere to hotspot C++ > style guide > - Provided constructors for most structures > - Did use placement new where appropriate > - Switched to enum class where possible. Note though that enum classes > do not (easily) allow iterating over their values, therefore I cannot > use them for MetaspaceType, MetadataType. > - Removed metaspaceEnum.cpp/hpp and moved all those helper functions > back to metaspace.cpp/hpp. I also renamed them back to their original > names (e.g. metaspace::is_class() -> > Metaspace::is_class_space_allocation()) to reduce the total patch > size. These renamings will show up in the delta diff though. > - gtest: Did remove all includes from metaspaceTestCommon.hpp and > moved them to the individual cpp files as Coleen suggested > - Fixed include guard names and copyrights > - Whitespace- and empty-line-cleanup > > http://hg.openjdk.java.net/jdk/sandbox/rev/018c370bbbd5 Rename static > constants according to naming laws > http://hg.openjdk.java.net/jdk/sandbox/rev/d509e2c607c7 > MetaspaceReporter::ReportFlag: rename and make enum class > http://hg.openjdk.java.net/jdk/sandbox/rev/b029ae20f238 > Metachunk::state_t rename and enum class > http://hg.openjdk.java.net/jdk/sandbox/rev/8e285bdfcca1 Correct > formatting (tabs) > http://hg.openjdk.java.net/jdk/sandbox/rev/a14ba4a2fb8c Remove > superfluous spaces > http://hg.openjdk.java.net/jdk/sandbox/rev/223c19deea53 Add > constructors to structs > http://hg.openjdk.java.net/jdk/sandbox/rev/0bf8e94f9e09 Rename struct > members to follow naming laws > http://hg.openjdk.java.net/jdk/sandbox/rev/29513e1f6d49 Remove unused > struct BitCounterClosure >
http://hg.openjdk.java.net/jdk/sandbox/rev/7facf1afd137 Wholesale > rename all structs to adhere to naming laws > http://hg.openjdk.java.net/jdk/sandbox/rev/d09f635f3a07 Wholesale > file renamings <<< That's the big rename change <<< > http://hg.openjdk.java.net/jdk/sandbox/rev/a56e69aba3b1 Fix include > guard names in gtest headers > http://hg.openjdk.java.net/jdk/sandbox/rev/788f6454226f Fix > copyrights in new gtest sources > http://hg.openjdk.java.net/jdk/sandbox/rev/bcb4bbc71cc9 Remove > metaspaceEnums.cpp,hpp > http://hg.openjdk.java.net/jdk/sandbox/rev/a3d33ce1b9db Rename > FreeBlocks::_v to _blocks > http://hg.openjdk.java.net/jdk/sandbox/rev/fe5aa08178d3 Fix include > guard names > http://hg.openjdk.java.net/jdk/sandbox/rev/1e89f4dfd8a7 Remove empty > lines > http://hg.openjdk.java.net/jdk/sandbox/rev/f7bf12bdbfe4 Gtests: > consistent naming of context instances > http://hg.openjdk.java.net/jdk/sandbox/rev/551226c61470 Beautify > comment for MetaspaceArena > > > 2) Changes to BlockTree and friends > > Both Leo and Richard reviewed the free block management in detail. > Changes: > > - Made all BlockTree functions non-recursive. > - Did remove offending typedefs (BinListXX etc) > - Richard found two bugs (BlockTree::find_closest_fit, > BlockTree::remove_from_list); I fixed those and wrote regression > gtests (see test_blocktree.cpp) > - In BlockTree::insert, renamed "forebear" argument to "insertion_point" > - Did consistently rename "get_block" to "remove_block" in all classes > - Tweaked gtests for BlockTree and BinList to make them clearer > - Did remove the "splinter threshold" logic in FreeBlocks; that logic > was used to determine when chopping off remainder space from a > returned block was worth the effort (see FreeBlocks::remove_block()). > But it is always worth doing, so no need for this threshold. > - Did remove, from BlockTree, the "largest block added" logic since > Leo convinced me it was less useful than I thought.
> > http://hg.openjdk.java.net/jdk/sandbox/rev/78a34af45cb8 make > BlockTree::print_tree non-recursive > http://hg.openjdk.java.net/jdk/sandbox/rev/00246785a20e Grooming > BlockTree (and various unrelated fixes) > http://hg.openjdk.java.net/jdk/sandbox/rev/24ed785d5c51 Simplify, > comment BlockTree_basic_siblings test > http://hg.openjdk.java.net/jdk/sandbox/rev/e1df0dbc7cc9 Fix > BlockTree::find_closest_fit and add test > http://hg.openjdk.java.net/jdk/sandbox/rev/0284f4705973 Rename Forebear > http://hg.openjdk.java.net/jdk/sandbox/rev/4340648bd624 Remove > BinList8, BinList16, BinList64, SmallBlocksType typedef > http://hg.openjdk.java.net/jdk/sandbox/rev/7c74250c35fd Clarify > BlockTree gtest > http://hg.openjdk.java.net/jdk/sandbox/rev/4680ad0ae1db Remove > largest-block-added optimization from BlockTree > http://hg.openjdk.java.net/jdk/sandbox/rev/255d1c34356f FreeBlocks, > BinList, BlockTree: code grooming > http://hg.openjdk.java.net/jdk/sandbox/rev/e07c51de3056 Remove unused > splinter_threshold from FreeBlocks > > 3) Coleen asked me to remove the gtest for Metaspace reporting, and > instead to provide a jtreg test. I then found that I had > written such a test already, and just expanded it a bit. I > also re-enabled it for ZGC; for some reason it had been disabled.
> > http://hg.openjdk.java.net/jdk/sandbox/rev/abba106ec3c0 Extend cases > for PrintMetaspaceDcmd > > 4) Did some code grooming for allocation guards, nothing major, and > added a new gtest, using the newly discovered assertion test feature > :), to test that we notice overwriters in metaspace: > > http://hg.openjdk.java.net/jdk/sandbox/rev/07089d3e4be4 Reform > allocation guard coding and add gtest > > 5) I beefed up the gtests for chunk enlargement, and made them > easier to read: > > http://hg.openjdk.java.net/jdk/sandbox/rev/8fe76f9ad8ee > Improve/extend testing for chunk enlargement > > 6) Removed a number of dead code sections > > http://hg.openjdk.java.net/jdk/sandbox/rev/29513e1f6d49 Remove unused > struct BitCounterClosure > http://hg.openjdk.java.net/jdk/sandbox/rev/50bfadc554cd Remove dead > code in metaspaceContext.cpp > http://hg.openjdk.java.net/jdk/sandbox/rev/f00879c58720 Remove dead > gtest files > http://hg.openjdk.java.net/jdk/sandbox/rev/074a374e9c5c Remove a dead > portion of code > http://hg.openjdk.java.net/jdk/sandbox/rev/b10cf8275580 Remove > setting which had been effectively unused. > > 7) I removed the "slow" parameter from all "::verify()" methods since > it was not that useful. Expensive code sections I now place inside a > SOMETIMES clause, which executes the designated code only occasionally, at > regular intervals controlled via VerifyMetaspaceInterval (e.g. > -XX:VerifyMetaspaceInterval=1 will always execute the code). That way > we don't pay the full performance cost for these tests but still run > them occasionally. > > http://hg.openjdk.java.net/jdk/sandbox/rev/f18d566c5875 Rework > xxx::verify() functions > > 8) While testing on 32-bit I found an error in the destruction of > MetaspaceTestContext
 where I forgot to unmap the ReservedSpace for > the space-provided-from-outside case (which simulates ccs): > http://hg.openjdk.java.net/jdk/sandbox/rev/d4f358b658a5 clean up > spaces for non-expandable test contexts > > ----- > > Follow up items: > > I refrained from making overly large changes so as not to disturb the review > process, and to keep the patch stable. Changes that are not strictly > necessary but may be a good idea I collect in follow-up issues, to be > done once this patch is upstream: > > https://bugs.openjdk.java.net/browse/JDK-8251342 "Rework JFR metaspace > free chunk statistics after JEP 387" > https://bugs.openjdk.java.net/browse/JDK-8251392 "Brush up and > consolidate Metaspace statistics after JEP 387" > https://bugs.openjdk.java.net/browse/JDK-8252014 "Find a better place > for counter utility classes after JEP387" > https://bugs.openjdk.java.net/browse/JDK-8252132 "Investigate > MetaspaceArena locking after JEP387" > https://bugs.openjdk.java.net/browse/JDK-8252187 "Optimize freeblocks > storage in MetaspaceArena after JEP387" > https://bugs.openjdk.java.net/browse/JDK-8252189 "Clarify meaning of > OOM texts for out-of-metaspace errors" > > ----- > > Thanks a lot for your review work! I know it is hard, and it is very much > appreciated. I hope we have now got all across-the-board style changes > done, so the next delta diff will be easier to read. > > Also, feel free to contact me in case you have quick questions, or for > a quick zoom meeting should that be easier. Please note that starting > today I will be on vacation but should be back by mid September.
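Item 7 above describes running expensive verification code only at regular intervals. A minimal sketch of that idea follows, assuming a simple per-site call counter. The names IntervalGate and sometimes are hypothetical, and the global VerifyMetaspaceInterval here is just a plain variable standing in for the VM flag; the actual SOMETIMES clause in the patch may be implemented quite differently.

```cpp
#include <cassert>

// Stand-in for the -XX:VerifyMetaspaceInterval flag (assumption: a plain int).
static int VerifyMetaspaceInterval = 4;

// Hypothetical helper: lets every interval-th call through.
class IntervalGate {
  int _counter = 0;
 public:
  bool should_run(int interval) {
    if (interval <= 0) {
      return false;  // treat non-positive intervals as "never" in this sketch
    }
    if (++_counter >= interval) {
      _counter = 0;
      return true;
    }
    return false;
  }
};

// Runs expensive_check only on every VerifyMetaspaceInterval-th invocation.
template <typename F>
void sometimes(IntervalGate& gate, F&& expensive_check) {
  if (gate.should_run(VerifyMetaspaceInterval)) {
    expensive_check();
  }
}
```

With an interval of 1 the gate opens on every call, which matches the behavior described above for -XX:VerifyMetaspaceInterval=1.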
> > Cheers, Thomas > > > > > From david.holmes at oracle.com Tue Sep 8 22:29:03 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 9 Sep 2020 08:29:03 +1000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> Message-ID: <9c1f2053-2055-42a0-1fd6-94793c0ff2e2@oracle.com> Hi Richard, I suspect this one fell off the radar due to the extended review period. The actual review started last December (there was prior discussion IIRC) and only seemed to get partial reviews. I only looked at some parts. Robbin may have given things a deeper look, but seemed focused on the handshake aspects. Vladimir said he would do a full review but I can't find it. Eventually Martin and Goetz took over reviewing and everyone else dropped off. :( As this covers a number of areas it really does need "approval" from each area (and yes the hotspot wiki should reflect this). I will try to take another look while we await Serguei's return (and I never did follow up on the problem I had with the nested lock elimination handling. :( ). Meanwhile this will need to be converted to a PR in any case. Thanks, David On 9/09/2020 3:02 am, Reingruber, Richard wrote: > Hello Marty, > > Sure. I'd be happy if Serguei could review the change. > > Thanks, Richard. > > -----Original Message----- > From: Marty Thompson > Sent: Dienstag, 8. September 2020 18:55 > To: Reingruber, Richard ; Daniel Daugherty ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hello Richard, > > It would be good if Serguei Spitsyn could review before this is pushed. Serguei is out this week. 
Can you wait until Serguei is back in the office the week of Sept 14? > > Regards, > > Marty > >> -----Original Message----- >> From: Reingruber, Richard >> Sent: Tuesday, September 8, 2020 9:45 AM >> To: Daniel Daugherty ; serviceability-dev >> ; hotspot-compiler- >> dev at openjdk.java.net; Hotspot dev runtime > dev at openjdk.java.net> >> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >> >> Hi Dan, >> >> I'd be very happy about a review from somebody on the Serviceability team. >> I have asked for reviews many times (kindly I hope). And the change is for >> review for more than a year now. >> >> According to [1] I'd think all requirements to push are met already. But >> maybe I missed something? >> >> After renaming of methods in SafepointMechanism the change needs to be >> rebased (already done). I'll publish a pull request as soon as possible. >> >> Thanks, Richard. >> >> [1] >> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change >> >> -----Original Message----- >> From: Daniel D. Daugherty >> Sent: Dienstag, 8. September 2020 18:16 >> To: Reingruber, Richard ; serviceability-dev >> ; hotspot-compiler- >> dev at openjdk.java.net; Hotspot dev runtime > dev at openjdk.java.net> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >> >> Hi Richard, >> >> I haven't seen a review from anyone on the Serviceability team and I think >> you should get a review from them since JVM/TI is involved. >> Perhaps I missed it... >> >> Dan >> >> >> On 9/7/20 10:09 AM, Reingruber, Richard wrote: >>> Hi, >>> >>> I would like to close the review of this change. >>> >>> It has received a lot of helpful feedback during the process and 2 >>> full Reviews. Thanks everybody! 
>>> >>> I'm planning to push it this week on Thursday as the solution for JBS items: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>> https://bugs.openjdk.java.net/browse/JDK-8233915 >>> >>> Version to be pushed: >>> >>> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >>> >>> Hope to get my GIT/Skara setup going by then... :) >>> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: hotspot-compiler-dev >>> On Behalf Of Reingruber, >>> Richard >>> Sent: Mittwoch, 2. September 2020 23:27 >>> To: Robbin Ehn ; serviceability-dev >>> ; >>> hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime >>> >>> Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for >>> Better Performance in the Presence of JVMTI Agents >>> >>> Hi Robbin, >>> >>>> On 2020-09-02 15:48, Reingruber, Richard wrote: >>>>> Hi Robbin, >>>>> >>>>> // taking the discussion back to the mailing lists >>>>> >>>>> > I still don't understand why you don't deoptimize the objects inside the >>>>> > handshake/safepoint instead? >>>> So for handshakes using asynch handshake and allowing blocking inside >>>> would fix that. (future fix, I'm working on that now) >>> Just to make it clear: I'm not fond of the extra suspension mechanism >>> currently used for JDK-8227745 either. I want to get rid of it and I >>> will work on it. Asynch handshakes (JDK-8238761) could be a >>> replacement for it. At least I think they can be used to suspend the target thread. >>> >>>> For safepoint, since we have suspended all threads, ~'safepointed them' >>>> with a JavaThread, you _could_ just execute the action directly (e.g. >>>> skipping VM_HeapWalkOperation safepoint) since they are supposed to be >>>> safely suspended until the destructor of EB, no? >>> Yes, this should be possible. This would be an advanced change though. >>> I would like EscapeBarriers to be a no-op and fall back to current >>> implementation, if C2-EscapeAnalysis/Graal are disabled.
>>> >>>> So I suggest future work to instead just execute the safepoint with >>>> the requesting JT instead of having this special safepointing mechanism. >>>> Since you are missing above functionality I see why you went this way. >>>> If you need to push it, it's fine by me. >>> We will work on further improvements. Top of the list would be >>> eliminating the extra suspend mechanism. >>> >>> The implementation has matured for more than 12 months now [1]. It's >>> been tested extensively at SAP over that time and also passed extended >>> testing at Oracle kindly conducted by Vladimir Kozlov. We've got two >>> full Reviews and incorporated extensive feedback from a number of >>> OpenJDK Reviewers (including you, thanks!). Based on that I reckon >>> we're good to push the change as enhancement >>> (JDK-8227745) and bug fix (JDK-8233915). >>> >>>> Thanks for explaining once again :) >>> Pleasure :) >>> >>> Thanks, Richard. >>> >>> [1] >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html >>> >>> -----Original Message----- >>> From: Robbin Ehn >>> Sent: Mittwoch, 2. September 2020 16:54 >>> To: Reingruber, Richard ; >>> serviceability-dev ; >>> hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime >>> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> On 2020-09-02 15:48, Reingruber, Richard wrote: >>>> Hi Robbin, >>>> >>>> // taking the discussion back to the mailing lists >>>> >>>> > I still don't understand why you don't deoptimize the objects inside the >>>> > handshake/safepoint instead? >>> So for handshakes using asynch handshake and allowing blocking inside >>> would fix that. (future fix, I'm working on that now) >>> >>> For safepoint, since we have suspended all threads, ~'safepointed them' >>> with a JavaThread, you _could_ just execute the action directly (e.g.
>>> skipping VM_HeapWalkOperation safepoint) since they are supposed to be >>> safely suspended until the destructor of EB, no? >>> >>> So I suggest future work to instead just execute the safepoint with >>> the requesting JT instead of having this special safepointing mechanism. >>> >>> Since you are missing above functionality I see why you went this way. >>> If you need to push it, it's fine by me. >>> >>> Thanks for explaining once again :) >>> >>> /Robbin >>> >>>> This is unfortunately not possible. Deoptimizing objects includes >>>> reallocating scalar replaced objects, i.e. calling >>>> Deoptimization::realloc_objects(). This cannot be done at a safepoint or handshake. >>>> >>>> 1. The vm thread is not allowed to allocate on the java heap >>>> See for instance assertions in ParallelScavengeHeap::mem_allocate() >>>> >>>> https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258 >>>> >>>> This is not easy to change, I suppose, because it will be difficult to gc if >>>> necessary. >>>> >>>> 2. Using a direct handshake would not work either. The problem there is again >>>> gc. Let J be the JavaThread that is executing the direct handshake. The vm >>>> would deadlock if the vm thread waits for J to execute the closure of a >>>> handshake-all and J waits for the vm thread to execute a gc vm operation. >>>> Patricio Chilano made me aware of this: >>>> https://bugs.openjdk.java.net/browse/JDK-8230594 >>>> >>>> Cheers, Richard. >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn >>>> Sent: Mittwoch, 2.
 September 2020 13:56 >>>> To: Reingruber, Richard >>>> Cc: Lindenmaier, Goetz ; Vladimir Kozlov >>>> ; David Holmes >> >>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>> Performance in the Presence of JVMTI Agents >>>> >>>> Hi, >>>> >>>> I still don't understand why you don't deoptimize the objects inside >>>> the handshake/safepoint instead? >>>> >>>> E.g. >>>> >>>> JvmtiEnv::GetOwnedMonitorInfo you should only need to execute the >>>> code >>>> from: >>>> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over >>>> the stack, so: >>>> >>>> void >>>> GetOwnedMonitorInfoClosure::do_thread(Thread *target) { >>>> assert(target->is_Java_thread(), "just checking"); >>>> JavaThread *jt = (JavaThread *)target; >>>> >>>> if (!jt->is_exiting() && (jt->threadObj() != NULL)) { >>>> + if (EscapeBarrier::deoptimize_objects(jt, >>>> + MaxJavaStackTraceDepth)) { >>>> _result = >>>> ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, >>>> _owned_monitors_list); >>>> } else { >>>> _result = JVMTI_ERROR_OUT_OF_MEMORY; >>>> } >>>> } >>>> } >>>> >>>> Why try to 'suspend' the thread first? >>>> >>>> >>>> When we de-optimize all threads why not just in the following safepoint? >>>> E.g. >>>> VM_HeapWalkOperation::doit() { >>>> + EscapeBarrier::deoptimize_objects_all_threads(); >>>> ... >>>> } >>>> >>>> Thanks, Robbin >>>> >>>> >> From dholmes at openjdk.java.net Wed Sep 9 00:11:23 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Sep 2020 00:11:23 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port In-Reply-To: References: Message-ID: On Mon, 7 Sep 2020 11:23:28 GMT, Aleksei Voitylov wrote: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The downside of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries.
You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. Attempting to use the GitHub UI for further review. If this doesn't work out well I will revert to direct email. 
make/autoconf/platform.m4 line 536: > 534: AC_SUBST(HOTSPOT_$1_CPU_DEFINE) > 535: > 536: if test "x$OPENJDK_$1_LIBC" = "xmusl"; then I'm not clear why we only check for musl when setting the HOTSPOT_$1_LIBC variable src/hotspot/os/linux/os_linux.cpp line 624: > 622: // confstr() from musl libc returns EINVAL for > 623: // _CS_GNU_LIBC_VERSION and _CS_GNU_LIBPTHREAD_VERSION > 624: os::Linux::set_libc_version("unknown"); This should be "musl - unknown" as we don't know an exact version but we do know that it is musl. src/hotspot/os/linux/os_linux.cpp line 625: > 623: // _CS_GNU_LIBC_VERSION and _CS_GNU_LIBPTHREAD_VERSION > 624: os::Linux::set_libc_version("unknown"); > 625: os::Linux::set_libpthread_version("unknown"); This should be "musl - unknown" as we don't know an exact version but we do know that it is musl. src/hotspot/share/runtime/abstract_vm_version.cpp line 263: > 261: #define LIBC_STR "-" XSTR(LIBC) > 262: #else > 263: #define LIBC_STR "" Again I'm not clear why we do nothing in the non-musl case? Shouldn't we be reporting glibc or musl? src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c line 284: > 282: // To improve portability across platforms and avoid conflicts > 283: // between GNU and XSI versions of strerror_r, plain strerror is used. > 284: // It's safe because this code is not used in any multithreaded environment. I still question this assertion. The issue is not that the current code path that leads to strerror use may be executed concurrently but that any other strerror use could be concurrent with this one. I would consider this a "must fix" if not for the fact we already use strerror in the code and so this doesn't really change the exposure to the problem. test/hotspot/jtreg/runtime/StackGuardPages/exeinvoke.c line 282: > 280: > 281: pthread_attr_init(&thread_attr); > 282: pthread_attr_setstacksize(&thread_attr, stack_size); Just a comment in response to the explanation as to why this change is needed. 
If the default thread stacksize under musl is insufficient to successfully attach such a thread to the VM then this will cause problems for applications that embed the VM directly (or which otherwise directly attach existing threads). test/hotspot/jtreg/runtime/TLS/exestack-tls.c line 60: > 58: } > 59: > 60: #if defined(__GLIBC) Why do we use this form here but at line 30 we have: #ifdef __GLIBC__ ? ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/49 From richard.reingruber at sap.com Wed Sep 9 07:14:15 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 9 Sep 2020 07:14:15 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <9c1f2053-2055-42a0-1fd6-94793c0ff2e2@oracle.com> References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com> <9c1f2053-2055-42a0-1fd6-94793c0ff2e2@oracle.com> Message-ID: > Hi Richard, > I suspect this one fell off the radar due to the extended review period. > The actual review started last December (there was prior discussion > IIRC) and only seemed to get partial reviews. I only looked at some > parts. Robbin may have given things a deeper look, but seemed focused on > the handshake aspects. Vladimir said he would do a full review but I > can't find it. Eventually Martin and Goetz took over reviewing and > everyone else dropped off. :( That's how it went I reckon. I repeatedly asked for feedback and reviews, and also tried to keep Vladimir, Robbin, and you in the loop addressing you directly (e.g. [1]) > As this covers a number of areas it really does need "approval" from > each area (and yes the hotspot wiki should reflect this). I agree. The wiki should define that in a clear manner. And the community should be involved in that definition. 
 > I will try to take another look while we await Serguei's return (and I > never did follow up on the problem I had with the nested lock > elimination handling. :( ). Thanks for doing it. > Meanwhile this will need to be converted to a PR in any case. I hope to get the PR out later but we've got a team outing today... we haven't seen each other since months... :) Cheers, Richard. [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/030911.html
From sgehwolf at openjdk.java.net Wed Sep 9 07:42:15 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 9 Sep 2020 07:42:15 GMT Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota Message-ID: The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for microseconds, which is read verbatim. Similarly for cgroups v2 all units in 'cpu.max' are in microseconds.
------------- Commit messages: - 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota Changes: https://git.openjdk.java.net/jdk/pull/91/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=91&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252957 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/91.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/91/head:pull/91 PR: https://git.openjdk.java.net/jdk/pull/91 From sgehwolf at openjdk.java.net Wed Sep 9 07:42:16 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 9 Sep 2020 07:42:16 GMT Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: On Wed, 9 Sep 2020 07:34:50 GMT, Severin Gehwolf wrote: > The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for > microseconds, which is read verbatim. Similarly for cgroups v2 > all units in 'cpu.max' are in microseconds. @bobvandette Please have a look it's a pretty trivial change. ------------- PR: https://git.openjdk.java.net/jdk/pull/91 From shade at openjdk.java.net Wed Sep 9 07:46:55 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 9 Sep 2020 07:46:55 GMT Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: On Wed, 9 Sep 2020 07:34:50 GMT, Severin Gehwolf wrote: > The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for > microseconds, which is read verbatim. Similarly for cgroups v2 > all units in 'cpu.max' are in microseconds. Looks fine, as we discussed in #89. ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/91 From sgehwolf at openjdk.java.net Wed Sep 9 07:49:15 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 9 Sep 2020 07:49:15 GMT Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: On Wed, 9 Sep 2020 07:44:23 GMT, Aleksey Shipilev wrote: >> The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for >> microseconds, which is read verbatim. Similarly for cgroups v2 >> all units in 'cpu.max' are in microseconds. > > Looks fine, as we discussed in #89. The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for microseconds, which is read verbatim. Similarly for cgroups v2 all units in 'cpu.max' are in microseconds. ------------- PR: https://git.openjdk.java.net/jdk/pull/91 From hseigel at openjdk.java.net Wed Sep 9 12:51:01 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 9 Sep 2020 12:51:01 GMT Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: On Wed, 9 Sep 2020 07:34:50 GMT, Severin Gehwolf wrote: > The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for > microseconds, which is read verbatim. Similarly for cgroups v2 > all units in 'cpu.max' are in microseconds. Looks good! Thanks. ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/91 From bob.vandette at oracle.com Wed Sep 9 13:36:13 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 9 Sep 2020 09:36:13 -0400 Subject: RFR: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: <595ADDB2-AE05-45C4-964F-452BACB8CC45@oracle.com> Yes, this is fine. I looked through the java.base implementation as well and verified that there are no other instances and that we were not assuming milliseconds in any calculations. Bob. 
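To make the unit question concrete, here is a minimal Java sketch of how a quota could be turned into a CPU count when both values are read verbatim as microseconds. It is not the HotSpot or java.base code under review. The class and method names (CpuQuotaSketch, cpusFromV1, cpusFromV2) are hypothetical, and the file semantics are assumed from the Linux cgroup interface: cpu.cfs_quota_us and cpu.cfs_period_us for v1 (a quota of -1 meaning no limit), and "<quota> <period>" or "max <period>" in cpu.max for v2.

```java
import java.util.OptionalDouble;

// Hypothetical illustration only; names are not taken from the JDK sources.
final class CpuQuotaSketch {
    private CpuQuotaSketch() {}

    // cgroup v1: both arguments are the raw file contents, in microseconds.
    // A negative quota (-1) means "no limit".
    static OptionalDouble cpusFromV1(String cfsQuotaUs, String cfsPeriodUs) {
        long quotaUs = Long.parseLong(cfsQuotaUs.trim());
        long periodUs = Long.parseLong(cfsPeriodUs.trim());
        if (quotaUs < 0 || periodUs <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((double) quotaUs / periodUs);
    }

    // cgroup v2: cpu.max holds "<quota> <period>" or "max <period>",
    // again in microseconds.
    static OptionalDouble cpusFromV2(String cpuMax) {
        String[] parts = cpuMax.trim().split("\\s+");
        if (parts.length != 2 || parts[0].equals("max")) {
            return OptionalDouble.empty();
        }
        long quotaUs = Long.parseLong(parts[0]);
        long periodUs = Long.parseLong(parts[1]);
        if (periodUs <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((double) quotaUs / periodUs);
    }

    public static void main(String[] args) {
        System.out.println(cpusFromV1("50000\n", "100000\n")); // half a CPU
        System.out.println(cpusFromV2("200000 100000"));       // two CPUs
        System.out.println(cpusFromV2("max 100000"));          // no limit
    }
}
```

Because quota and period share the same unit, the ratio is unit-free; a wrong milliseconds assumption would only change a result if one of the two values were converted before dividing.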
> On Sep 9, 2020, at 3:42 AM, Severin Gehwolf wrote: > > On Wed, 9 Sep 2020 07:34:50 GMT, Severin Gehwolf wrote: > >> The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for >> microseconds, which is read verbatim. Similarly for cgroups v2 >> all units in 'cpu.max' are in microseconds. > > @bobvandette Please have a look it's a pretty trivial change. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/91 From sgehwolf at openjdk.java.net Wed Sep 9 13:55:41 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 9 Sep 2020 13:55:41 GMT Subject: Integrated: 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota In-Reply-To: References: Message-ID: On Wed, 9 Sep 2020 07:34:50 GMT, Severin Gehwolf wrote: > The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for > microseconds, which is read verbatim. Similarly for cgroups v2 > all units in 'cpu.max' are in microseconds. This pull request has now been integrated. Changeset: 51660946 Author: Severin Gehwolf URL: https://git.openjdk.java.net/jdk/commit/51660946 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8252957: Wrong comment in CgroupV1Subsystem::cpu_quota The comment is wrong. The 'us' in 'cpu.cfs_quota_us' stands for microseconds, which is read verbatim. Similarly for cgroups v2 all units in 'cpu.max' are in microseconds. 
Reviewed-by: shade, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/91 From coleenp at openjdk.java.net Wed Sep 9 14:22:05 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 9 Sep 2020 14:22:05 GMT Subject: RFR: 8244778: Archive full module graph in CDS In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Tue, 8 Sep 2020 15:59:33 GMT, Ioi Lam wrote: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. Marked as reviewed by coleenp (Reviewer). src/hotspot/share/classfile/classLoaderDataShared.cpp line 132: > 130: assert(loader_data != NULL, "must be"); > 131: return loader_data; > 132: } This and other private functions should probably be a static function inside classLoaderDataShared.cpp. src/hotspot/share/classfile/classLoaderDataShared.hpp line 28: > 26: #define SHARE_CLASSFILE_CLASSLOADERDATASHARED_HPP > 27: > 28: #include "utilities/exceptions.hpp" There's a memory/allStatic.hpp file now that this should include. src/hotspot/share/classfile/modules.cpp line 495: > 493: } > 494: } > 495: #endif Nit: can you add // INCLUDE_CDS_JAVA_HEAP src/hotspot/share/classfile/classLoaderDataShared.cpp line 171: > 169: } > 170: > 171: void ClassLoaderDataShared::serialize(class SerializeClosure* f) { Why is there a 'class' keyword here? 
------------- PR: https://git.openjdk.java.net/jdk/pull/80 From github.com+66382410+lfoltan at openjdk.java.net Wed Sep 9 18:49:12 2020 From: github.com+66382410+lfoltan at openjdk.java.net (Lois Foltan) Date: Wed, 9 Sep 2020 18:49:12 GMT Subject: RFR: 8244778: Archive full module graph in CDS In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Tue, 8 Sep 2020 15:59:33 GMT, Ioi Lam wrote: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. Thanks Ioi for addressing my review comments. Overall, looks great! src/hotspot/share/classfile/moduleEntry.cpp line 419: > 417: } > 418: > 419: GrowableArray* ModuleEntry::restore_growable_array(Array* archived_array) { Thanks for renaming these methods src/hotspot/share/oops/instanceKlass.cpp line 2550: > 2548: // clear _nest_host to ensure re-load at runtime > 2549: _nest_host = NULL; > 2550: _package_entry = NULL; // TODO -- point it to the archived PackageEntry (JDK-8249262) Would you consider removing this comment? I tend not to like TODO comments since sometimes the open enhancement remains unaddressed. ------------- Marked as reviewed by lfoltan at github.com (no known OpenJDK username). 
PR: https://git.openjdk.java.net/jdk/pull/80 From dcubed at openjdk.java.net Wed Sep 9 21:14:43 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 9 Sep 2020 21:14:43 GMT Subject: RFR: 8252980: comment only changes extracted from JDK-8247281 Message-ID: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> This sub-task is tracking comment only changes extracted from Erik's work on JDK-8247281. This extraction is done to ease the code review for the JDK-8247281 changes. ------------- Commit messages: - 8252980: comment only changes extracted from JDK-8247281 Changes: https://git.openjdk.java.net/jdk/pull/98/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=98&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252980 Stats: 10 lines in 2 files changed: 0 ins; 3 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/98.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/98/head:pull/98 PR: https://git.openjdk.java.net/jdk/pull/98 From iklam at openjdk.java.net Wed Sep 9 21:58:24 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Sep 2020 21:58:24 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v2] In-Reply-To: References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: <62HGLspnAFezsLXCpQn7kBBlBiOYnPxLXfF2xImO5m0=.c44b6297-ee33-4ee6-a049-2701f7f80813@github.com> On Wed, 9 Sep 2020 18:41:25 GMT, Lois Foltan wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes >> the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last >> revision: >> - Removed TODO comment referring to JBS issue >> - Merge branch 'master' into 8244778-archive-full-module-graph >> - fixed trailing spaces >> - Renamed ModuleEntry::write_growable_array >> - Update to latest repo (JDK-8251557); added comments >> - 8244778: Archive full module graph in CDS > > src/hotspot/share/oops/instanceKlass.cpp line 2550: > >> 2548: // clear _nest_host to ensure re-load at runtime >> 2549: _nest_host = NULL; >> 2550: _package_entry = NULL; // TODO -- point it to the archived PackageEntry (JDK-8249262) > > Would you consider removing this comment? I tend not to like TODO comments since sometimes the open enhancement > remains unaddressed. I removed the comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Wed Sep 9 21:58:16 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Sep 2020 21:58:16 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v2] In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: - Removed TODO comment referring to JBS issue - Merge branch 'master' into 8244778-archive-full-module-graph - fixed trailing spaces - Renamed ModuleEntry::write_growable_array - Update to latest repo (JDK-8251557); added comments - 8244778: Archive full module graph in CDS ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/80/files - new: https://git.openjdk.java.net/jdk/pull/80/files/89f33274..92b4202b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=00-01 Stats: 439 lines in 53 files changed: 165 ins; 162 del; 112 mod Patch: https://git.openjdk.java.net/jdk/pull/80.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/80/head:pull/80 PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Wed Sep 9 22:08:26 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Sep 2020 22:08:26 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v3] In-Reply-To: References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Tue, 8 Sep 2020 17:32:44 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback from Coleen > > src/hotspot/share/classfile/classLoaderDataShared.cpp line 171: > >> 169: } >> 170: >> 171: void ClassLoaderDataShared::serialize(class SerializeClosure* f) { > > Why is there a 'class' keyword here? I fixed this one and other issues you reported for the same commit (I didn't want to respond to each one individually to avoid the e-mail avalanche). 
New version: [e541890](https://github.com/openjdk/jdk/pull/80/commits/e541890e037ff40ec9134a54e5c7a878ab9259f3) ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Wed Sep 9 22:08:25 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Sep 2020 22:08:25 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v3] In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Feedback from Coleen ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/80/files - new: https://git.openjdk.java.net/jdk/pull/80/files/92b4202b..e541890e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=01-02 Stats: 14 lines in 3 files changed: 3 ins; 5 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/80.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/80/head:pull/80 PR: https://git.openjdk.java.net/jdk/pull/80 From coleenp at openjdk.java.net Wed Sep 9 22:21:19 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 9 Sep 2020 22:21:19 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v3] In-Reply-To: References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Wed, 9 Sep 2020 18:46:33 GMT, Lois Foltan wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback from Coleen > > Thanks Ioi for addressing my review comments. Overall, looks great! Ok thanks! So many emails ... ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From dcubed at openjdk.java.net Wed Sep 9 22:30:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 9 Sep 2020 22:30:00 GMT Subject: RFR: 8252980: comment only changes extracted from JDK-8247281 In-Reply-To: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> References: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> Message-ID: <8G5hRiT39DnPXzov7eDciRw8Aiw_l5Gba_CKqsxrKxg=.16602dea-5a70-4dae-8a58-fa8c7eab4ef2@github.com> On Wed, 9 Sep 2020 21:08:18 GMT, Daniel D. 
Daugherty wrote: > This sub-task is tracking comment only changes extracted > from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing along with JDK-8252981 and JDK-8247281. ------------- PR: https://git.openjdk.java.net/jdk/pull/98 From iklam at openjdk.java.net Wed Sep 9 23:00:49 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Sep 2020 23:00:49 GMT Subject: RFR: 8252980: comment only changes extracted from JDK-8247281 In-Reply-To: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> References: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> Message-ID: On Wed, 9 Sep 2020 21:08:18 GMT, Daniel D. Daugherty wrote: > This sub-task is tracking comment only changes extracted > from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. LGTM. I agree it's a trivial change. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/98 From dholmes at openjdk.java.net Wed Sep 9 23:25:49 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 9 Sep 2020 23:25:49 GMT Subject: RFR: 8252980: comment only changes extracted from JDK-8247281 In-Reply-To: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> References: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> Message-ID: On Wed, 9 Sep 2020 21:08:18 GMT, Daniel D. Daugherty wrote: > This sub-task is tracking comment only changes extracted > from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. LGTM ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/98 From dcubed at openjdk.java.net Wed Sep 9 23:38:09 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 9 Sep 2020 23:38:09 GMT Subject: Integrated: 8252980: comment only changes extracted from JDK-8247281 In-Reply-To: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> References: <3jFYZAAgPok_HCqW21E100Axx0Ot43Dc6BdP0O9GLak=.99ed4a68-33c7-46e1-b261-e46aa922d78a@github.com> Message-ID: On Wed, 9 Sep 2020 21:08:18 GMT, Daniel D. Daugherty wrote: > This sub-task is tracking comment only changes extracted > from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. This pull request has now been integrated. Changeset: f9339616 Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/f9339616 Stats: 10 lines in 2 files changed: 3 ins; 0 del; 7 mod 8252980: comment only changes extracted from JDK-8247281 Reviewed-by: iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/98 From rehn at openjdk.java.net Thu Sep 10 12:21:48 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 10 Sep 2020 12:21:48 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold Message-ID: These flags make little sense. They measure non safepointing VM ops, that's only handshake now. Which have little relation to compiles. Builds runs locally, running T1 now. 
------------- Commit messages: - Removed Changes: https://git.openjdk.java.net/jdk/pull/111/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=111&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253008 Stats: 24 lines in 2 files changed: 0 ins; 23 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/111.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/111/head:pull/111 PR: https://git.openjdk.java.net/jdk/pull/111 From shade at openjdk.java.net Thu Sep 10 12:31:19 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 10 Sep 2020 12:31:19 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 12:15:57 GMT, Robbin Ehn wrote: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. > > Builds runs locally, running T1 now. Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/111 From david.holmes at oracle.com Thu Sep 10 13:31:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 10 Sep 2020 23:31:04 +1000 Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: <98d6952e-a7d6-1304-ce0c-7b39c59138fe@oracle.com> Hi Robbin, On 10/09/2020 10:21 pm, Robbin Ehn wrote: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. As this predates handshakes, what non-safepoint VMoperations would be measured here? I checked the code back to 6u10 and there are zero non-safepoint VMops. So this is basically long dead code. Cheers, David ----- > Builds runs locally, running T1 now. 
> > ------------- > > Commit messages: > - Removed > > Changes: https://git.openjdk.java.net/jdk/pull/111/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=111&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253008 > Stats: 24 lines in 2 files changed: 0 ins; 23 del; 1 mod > Patch: https://git.openjdk.java.net/jdk/pull/111.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/111/head:pull/111 > > PR: https://git.openjdk.java.net/jdk/pull/111 > From dholmes at openjdk.java.net Thu Sep 10 13:33:50 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Sep 2020 13:33:50 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 12:15:57 GMT, Robbin Ehn wrote: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. > > Builds runs locally, running T1 now. Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/111 From harold.seigel at oracle.com Thu Sep 10 14:52:18 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Thu, 10 Sep 2020 10:52:18 -0400 Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities In-Reply-To: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com> References: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com> Message-ID: <1dc4eec9-32eb-671a-ba3e-4bdea8f1d741@oracle.com> Hi Bob, I came up with these ways to handle the test failures when swap limiting is disabled (JDK-8250984). Please let me know if any of them sound viable. One way is to add logging to CgroupSubsystemController when it fails to open a file such as .../memsw.limit_in_bytes. The tests would enable logging and then look for these messages to determine if swap limiting was disabled.
This is yet another string for the tests to parse, but the JDK controls the contents of the strings, so there is less concern about them changing. Here's a webrev showing this potential change: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.log/webrev/index.html Another way is for methods such as CgroupSubsystemController.getLongEntry() to return a -2 status, indicating not-implemented, when it cannot access a file. The callers of getLongEntry() could then decide whether or not to propagate that status back to their callers, or return some other default. A partial webrev for that change is here: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.RetVal/webrev/index.html We could also change methods such as CgroupV1Subsystem.getMemoryAndSwapLimit() to explicitly check for the existence of the files they want to read from and return -2 if the check fails. This may have a performance impact? Thanks, Harold On 9/1/2020 12:04 PM, Bob Vandette wrote: > I really dislike encoding all these strings in our tests that could possibly change. > > I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes > assuming that this file is not present when swap limiting is disabled. The problem with this approach > and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2. > > Others are struggling with these types of issues ... > > https://github.com/containers/podman/issues/6365 > > The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit > could fail. Perhaps the test should be checking for not supported and fix the API implementation > to report the correct error (if it doesn't already). > > /** > * Returns the maximum amount of physical memory and swap space, > * in bytes, that can be allocated in the Isolation Group. > * > * @return The maximum amount of memory in bytes or -1 if > * there is no limit set or -2 if this metric is not supported.
> * > */ > public long getMemoryAndSwapLimit(); > > My .02$ > > Bob. > >> On Sep 1, 2020, at 11:31 AM, Harold Seigel wrote: >> >> Hi, >> >> Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250984 >> >> The modified tests were run on Linux kernels with and without swap limit capabilities. >> >> Thanks, Harold >> From bob.vandette at oracle.com Thu Sep 10 15:05:42 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 10 Sep 2020 11:05:42 -0400 Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities In-Reply-To: <1dc4eec9-32eb-671a-ba3e-4bdea8f1d741@oracle.com> References: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com> <1dc4eec9-32eb-671a-ba3e-4bdea8f1d741@oracle.com> Message-ID: <8781C429-52EB-4FF5-8BD4-67180BA55AB6@oracle.com> Harold, I prefer the second approach since it's consistent with the original specification of the Metrics APIs. You should be able to check for the -2 case in OperatingSystemImpl.java after the limits are tested for >=0 in order to avoid adding any extra overhead. 56 public long getTotalSwapSpaceSize() { 57 if (containerMetrics != null) { 58 long limit = containerMetrics.getMemoryAndSwapLimit(); 59 if (limit == CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED) { // not supported 60 return CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED; 61 } 62 // The memory limit metrics is not available if JVM runs on Linux host (not in a docker container) 63 // or if a docker container was started without specifying a memory limit (without '--memory=' 64 // Docker option). In latter case there is no limit on how much memory the container can use and 65 // it can use as much memory as the host's OS allows.
66 long memLimit = containerMetrics.getMemoryLimit(); 67 if (limit >= 0 && memLimit >= 0) { 68 return limit - memLimit; // might potentially be 0 for limit == memLimit 69 } [HERE] 70 } 71 return getTotalSwapSpaceSize0(); 72 } If you go down this path, please check through the other container & docker tests to see if there are other cases where the specific message text is parsed. I believe there was at least one other case of this. Bob. > On Sep 10, 2020, at 10:52 AM, Harold Seigel wrote: > > Hi Bob, > > I came up with these ways to handle the test failures when swap limiting is disabled (JDK-8250984). Please let me know if any of them sound viable. > > One way is to add logging to CgroupSubsystemController when it fails to open a file such as .../memsw.limit_in_bytes. The tests would enable logging and then look for these messages to determine if swap limiting was disabled. This is yet another string for the tests to parse, but the JDK controls the contents of the strings, so there is less concern about them changing. Here's a webrev showing this potential change: > > http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.log/webrev/index.html > > Another way is for methods such as CgroupSubsystemController.getLongEntry() to return a -2 status, indicating not-implemented, when it cannot access a file. The callers of getLongEntry() could then decide whether or not to propagate that status back to their callers, or return some other default. A partial webrev for that change is here: > > http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.RetVal/webrev/index.html > > We could also change methods such as CgroupV1Subsystem.getMemoryAndSwapLimit() to explicitly check for the existence of the files they want to read from and return -2 if the check fails. This may have a performance impact? > > Thanks, Harold > > On 9/1/2020 12:04 PM, Bob Vandette wrote: >> I really dislike encoding all these strings in our tests that could possibly change.
>> >> I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes >> assuming that this file is not present when swap limiting is disabled. The problem with this approach >> and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2. >> >> Others are struggling with these types of issues ... >> >> >> https://github.com/containers/podman/issues/6365 >> >> >> The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit >> could fail. Perhaps the test should be checking for not supported and fix the API implementation >> to report the correct error (if it doesn't already). >> >> /** >> * Returns the maximum amount of physical memory and swap space, >> * in bytes, that can be allocated in the Isolation Group. >> * >> * @return The maximum amount of memory in bytes or -1 if >> * there is no limit set or -2 if this metric is not supported. >> * >> */ >> public long getMemoryAndSwapLimit(); >> >> My .02$ >> >> Bob. >> >> >>> On Sep 1, 2020, at 11:31 AM, Harold Seigel >>> wrote: >>> >>> Hi, >>> >>> Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities. >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html >>> >>> >>> JBS Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8250984 >>> >>> >>> The modified tests were run on Linux kernels with and without swap limit capabilities.
>>> >>> Thanks, Harold >>> >>> From sgehwolf at redhat.com Thu Sep 10 15:39:18 2020 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 10 Sep 2020 17:39:18 +0200 Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities In-Reply-To: <8781C429-52EB-4FF5-8BD4-67180BA55AB6@oracle.com> References: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com> <1dc4eec9-32eb-671a-ba3e-4bdea8f1d741@oracle.com> <8781C429-52EB-4FF5-8BD4-67180BA55AB6@oracle.com> Message-ID: <45eb62e60fbee8edce86df7f35fbce7afd098dd0.camel@redhat.com> On Thu, 2020-09-10 at 11:05 -0400, Bob Vandette wrote: > Harold, > > I prefer the second approach since it's consistent with the original specification of the Metrics APIs. > You should be able to check for the -2 case in OperatingSystemImpl.java after the limits are tested for >=0 in order to avoid adding any extra overhead. > > 56 public long getTotalSwapSpaceSize() { > 57 if (containerMetrics != null) { > 58 long limit = containerMetrics.getMemoryAndSwapLimit(); > > 59 if (limit == CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED) { // not supported > 60 return CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED; > 61 } > > 62 // The memory limit metrics is not available if JVM runs on Linux host (not in a docker container) > 63 // or if a docker container was started without specifying a memory limit (without '--memory=' > 64 // Docker option). In latter case there is no limit on how much memory the container can use and > 65 // it can use as much memory as the host's OS allows. > 66 long memLimit = containerMetrics.getMemoryLimit(); > 67 if (limit >= 0 && memLimit >= 0) { > 68 return limit - memLimit; // might potentially be 0 for limit == memLimit > 69 } > [HERE] > 70 } > 71 return getTotalSwapSpaceSize0(); > 72 } > I agree. Option 2 seems the preferred one for me too. One additional consideration would be whether or not there are other cases where cgroup files are missing.
IIRC, the cases for their existence are different between cgroup v1 and cgroup v2. Thanks, Severin > If you go down this path, please check through the other container & docker tests to see if there are other cases > where the specific message text is parsed. I believe there was at least one other case of this. > > Bob. > > > > On Sep 10, 2020, at 10:52 AM, Harold Seigel wrote: > > > > Hi Bob, > > > > I came up with these ways to handle the test failures when swap limiting is disabled (JDK-8250984). Please let me know if any of them sound viable. > > > > One way is to add logging to CgroupSubsystemController when it fails to open a file such as .../memsw.limit_in_bytes. The tests would enable logging and then look for these messages to determine if swap limiting was disabled. This is yet another string for the tests to parse, but the JDK controls the contents of the strings, so there is less concern about them changing. Here's a webrev showing this potential change: > > > > http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.log/webrev/index.html > > > > Another way is for methods such as CgroupSubsystemController.getLongEntry() to return a -2 status, indicating not-implemented, when it cannot access a file. The callers of getLongEntry() could then decide whether or not to propagate that status back to their callers, or return some other default. A partial webrev for that change is here: > > > > http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.RetVal/webrev/index.html > > > > We could also change methods such as CgroupV1Subsystem.getMemoryAndSwapLimit() to explicitly check for the existence of the files they want to read from and return -2 if the check fails. This may have a performance impact? > > > > Thanks, Harold > > > > On 9/1/2020 12:04 PM, Bob Vandette wrote: > > > I really dislike encoding all these strings in our tests that could possibly change.
> > > > > > I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes > > > assuming that this file is not present when swap limiting is disabled. The problem with this approach > > > and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2. > > > > > > Others are struggling with these types of issues: > > > > > > > > > https://github.com/containers/podman/issues/6365 > > > > > > > > > The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit > > > could fail. Perhaps the test should be checking for not supported, and we should fix the API implementation > > > to report the correct error (if it doesn't already). > > > > > > /** > > > * Returns the maximum amount of physical memory and swap space, > > > * in bytes, that can be allocated in the Isolation Group. > > > * > > > * @return The maximum amount of memory in bytes or -1 if > > > * there is no limit set or -2 if this metric is not supported. > > > * > > > */ > > > public long getMemoryAndSwapLimit(); > > > > > > My .02$ > > > > > > Bob. > > > > > > > > > > On Sep 1, 2020, at 11:31 AM, Harold Seigel > > > > wrote: > > > > > > > > Hi, > > > > > > > > Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities. > > > > > > > > Open Webrev: > > > > http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html > > > > > > > > > > > > JBS Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8250984 > > > > > > > > > > > > The modified tests were run on Linux kernels with and without swap limit capabilities.
> > > > > > > > Thanks, Harold > > > > > > > > From rehn at openjdk.java.net Thu Sep 10 16:33:19 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 10 Sep 2020 16:33:19 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 13:31:22 GMT, David Holmes wrote: >> These flags make little sense. >> They measure non safepointing VM ops, that's only handshake now. >> Which have little relation to compiles. >> >> Builds runs locally, running T1 now. > > Marked as reviewed by dholmes (Reviewer). Thanks @shipilev and @dholmes-ora. And thanks for the digging! ------------- PR: https://git.openjdk.java.net/jdk/pull/111 From rehn at openjdk.java.net Thu Sep 10 16:34:53 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 10 Sep 2020 16:34:53 GMT Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread Message-ID: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> GC Alot can only be performed by a Java Thread: void InterfaceSupport::gc_alot() { Thread *thread = Thread::current(); if (!thread->is_Java_thread()) return; // Avoid concurrent calls Lets remove the dummy call from the VM thread. 
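The guard quoted above can be illustrated with a minimal standalone sketch. The types below are simplified stand-ins invented for illustration (they are not HotSpot's real Thread hierarchy), and the counter is hypothetical instrumentation; the only point is that a call made from a non-Java thread returns before doing any work, which is why such a call is a no-op.

```cpp
#include <cassert>

// Hypothetical stand-ins for the thread types; only the
// is_Java_thread() query matters for this illustration.
struct Thread {
  virtual ~Thread() = default;
  virtual bool is_Java_thread() const { return false; }
};
struct JavaThread : Thread {
  bool is_Java_thread() const override { return true; }
};
struct VMThread : Thread {};

// Hypothetical counter recording how often the real work would have run.
static int gc_alot_work_done = 0;

// Same shape as the quoted guard: bail out unless the caller is a Java thread.
void gc_alot(const Thread* thread) {
  if (!thread->is_Java_thread()) return;
  ++gc_alot_work_done;
}
```

Calling gc_alot() with a VMThread leaves the counter untouched, while a JavaThread increments it, so removing the VM-thread call site would not change behaviour under these assumptions.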
------------- Commit messages: - Removed code Changes: https://git.openjdk.java.net/jdk/pull/113/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=113&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253026 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/113.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/113/head:pull/113 PR: https://git.openjdk.java.net/jdk/pull/113 From coleenp at openjdk.java.net Thu Sep 10 17:19:35 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 10 Sep 2020 17:19:35 GMT Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> Message-ID: On Thu, 10 Sep 2020 16:27:54 GMT, Robbin Ehn wrote: > GC Alot can only be performed by a Java Thread: > > void InterfaceSupport::gc_alot() { > Thread *thread = Thread::current(); > if (!thread->is_Java_thread()) return; // Avoid concurrent calls > > Lets remove the dummy call from the VM thread. Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/113 From coleenp at openjdk.java.net Thu Sep 10 17:19:36 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 10 Sep 2020 17:19:36 GMT Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> Message-ID: On Thu, 10 Sep 2020 17:17:10 GMT, Coleen Phillimore wrote: >> GC Alot can only be performed by a Java Thread: >> >> void InterfaceSupport::gc_alot() { >> Thread *thread = Thread::current(); >> if (!thread->is_Java_thread()) return; // Avoid concurrent calls >> >> Lets remove the dummy call from the VM thread. > > Marked as reviewed by coleenp (Reviewer). 
Looks good, also trivial. ------------- PR: https://git.openjdk.java.net/jdk/pull/113 From dcubed at openjdk.java.net Thu Sep 10 18:22:14 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 18:22:14 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 Message-ID: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing along with JDK-8252980 and JDK-8247281. Since Erik and I are both contributors, we will need one other reviewer. This sub-task is tracking ObjectMonitor::object() cleanup changes extracted from Erik's work on JDK-8247281. This extraction is done to ease the code review for the JDK-8247281 changes. Here's the core cleanup: diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 @@ -328,9 +328,9 @@ public: - void* object() const; - void* object_addr(); - void set_object(void* obj); + oop object() const; + oop* object_addr(); + void set_object(oop obj); void release_set_allocation_state(AllocationState s); void set_allocation_state(AllocationState s); AllocationState allocation_state() const; and those type changes ripple into the other files. Note: The type for the ObjectMonitor::_object field is intentionally not being changed from "void*" in this changeset. That will be done in JDK-8247281. 
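A minimal sketch of the accessor shape described above, assuming a release-style build in which an oop is a plain pointer. The names oopDesc, MockObjectMonitor and read_payload are simplified or hypothetical stand-ins, not the real HotSpot declarations; the sketch only shows that the backing field can stay "void*" while callers see oop.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's oop types.
struct oopDesc { int payload; };
typedef oopDesc* oop;

// Mock of the accessor shape after the cleanup: the field is still void*,
// but the conversion happens once, inside the accessors.
class MockObjectMonitor {
  void* _object = nullptr;
 public:
  oop  object() const      { return (oop)_object; }
  void set_object(oop obj) { _object = (void*)obj; }
};

// A caller can use the result directly, with no cast at the use site
// (hypothetical helper for illustration).
int read_payload(const MockObjectMonitor& m) {
  return m.object()->payload;
}
```

With void* accessors every caller would have to cast before dereferencing; with typed accessors a mismatched argument to set_object() is rejected at compile time.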
------------- Commit messages: - 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 Changes: https://git.openjdk.java.net/jdk/pull/114/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=114&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252981 Stats: 45 lines in 8 files changed: 1 ins; 0 del; 44 mod Patch: https://git.openjdk.java.net/jdk/pull/114.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/114/head:pull/114 PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 18:22:14 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 18:22:14 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 In-Reply-To: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: <5aEbCI68lO5K2nQaLYoEFXPeR33QFMC8lhuYjt1eJYs=.aa78690a-7c80-43a4-9432-a03357c5ae5b@github.com> On Thu, 10 Sep 2020 16:56:17 GMT, Daniel D. Daugherty wrote: > This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 > testing along with JDK-8252980 and JDK-8247281. > > Since Erik and I are both contributors, we will need one other reviewer. > > This sub-task is tracking ObjectMonitor::object() cleanup changes > extracted from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. 
> > Here's the core cleanup: > > diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp > --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 > +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 > @@ -328,9 +328,9 @@ > > public: > > - void* object() const; > - void* object_addr(); > - void set_object(void* obj); > + oop object() const; > + oop* object_addr(); > + void set_object(oop obj); > void release_set_allocation_state(AllocationState s); > void set_allocation_state(AllocationState s); > AllocationState allocation_state() const; > > and those type changes ripple into the other files. > > Note: The type for the ObjectMonitor::_object field is intentionally not > being changed from "void*" in this changeset. That will be done in JDK-8247281. @fisk - Please chime in on this review when you get the chance. ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 18:27:21 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 18:27:21 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 In-Reply-To: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: <0dJXTPzPhMU6u81_ETBTqmaHfL7G2I2xyOQI8_TTeFM=.74325068-a709-4693-920c-36b43ecc1e82@github.com> On Thu, 10 Sep 2020 16:56:17 GMT, Daniel D. Daugherty wrote: > This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 > testing along with JDK-8252980 and JDK-8247281. > > Since Erik and I are both contributors, we will need one other reviewer. > > This sub-task is tracking ObjectMonitor::object() cleanup changes > extracted from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. 
> > Here's the core cleanup: > > diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp > --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 > +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 > @@ -328,9 +328,9 @@ > > public: > > - void* object() const; > - void* object_addr(); > - void set_object(void* obj); > + oop object() const; > + oop* object_addr(); > + void set_object(oop obj); > void release_set_allocation_state(AllocationState s); > void set_allocation_state(AllocationState s); > AllocationState allocation_state() const; > > and those type changes ripple into the other files. > > Note: The type for the ObjectMonitor::_object field is intentionally not > being changed from "void*" in this changeset. That will be done in JDK-8247281. src/hotspot/share/runtime/synchronizer.cpp line 2534: > 2532: if (is_global) { > 2533: ls->print_cr("async-deflating global idle monitors, %3.7f secs, %d monitors", timer.seconds(), > deflated_count); 2534: GVars.stw_random = os::random(); This restoration of the setting of GVars.stw_random is a left over from an earlier version of Erik's work on JDK-8247281. That fix was pushed separately using JDK-8252126. ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 18:37:05 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 18:37:05 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: > This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 > testing along with JDK-8252980 and JDK-8247281. > > Since Erik and I are both contributors, we will need one other reviewer. 
> > This sub-task is tracking ObjectMonitor::object() cleanup changes > extracted from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. > > Here's the core cleanup: > > diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp > --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 > +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 > @@ -328,9 +328,9 @@ > > public: > > - void* object() const; > - void* object_addr(); > - void set_object(void* obj); > + oop object() const; > + oop* object_addr(); > + void set_object(oop obj); > void release_set_allocation_state(AllocationState s); > void set_allocation_state(AllocationState s); > AllocationState allocation_state() const; > > and those type changes ripple into the other files. > > Note: The type for the ObjectMonitor::_object field is intentionally not > being changed from "void*" in this changeset. That will be done in JDK-8247281. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/114/files - new: https://git.openjdk.java.net/jdk/pull/114/files/4fe77e0a..36649369 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=114&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=114&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/114.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/114/head:pull/114 PR: https://git.openjdk.java.net/jdk/pull/114 From shade at openjdk.java.net Thu Sep 10 18:37:08 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 10 Sep 2020 18:37:08 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: On Thu, 10 Sep 2020 18:34:12 GMT, Daniel D. Daugherty wrote: >> This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 >> testing along with JDK-8252980 and JDK-8247281. >> >> Since Erik and I are both contributors, we will need one other reviewer. >> >> This sub-task is tracking ObjectMonitor::object() cleanup changes >> extracted from Erik's work on JDK-8247281. This extraction is done >> to ease the code review for the JDK-8247281 changes. 
>> >> Here's the core cleanup: >> >> diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp >> --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 >> +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 >> @@ -328,9 +328,9 @@ >> >> public: >> >> - void* object() const; >> - void* object_addr(); >> - void set_object(void* obj); >> + oop object() const; >> + oop* object_addr(); >> + void set_object(oop obj); >> void release_set_allocation_state(AllocationState s); >> void set_allocation_state(AllocationState s); >> AllocationState allocation_state() const; >> >> and those type changes ripple into the other files. >> >> Note: The type for the ObjectMonitor::_object field is intentionally not >> being changed from "void*" in this changeset. That will be done in JDK-8247281. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. Looks okay to me (but it's late and I don't want to formally review it yet). Building with all three {release,fastdebug,slowdebug} is usually useful to catch oop conversion problems early. src/hotspot/share/runtime/objectMonitor.inline.hpp line 102: > 100: > 101: inline oop ObjectMonitor::object() const { > 102: return (oop)_object; `cast_to_oop(_object)` here? src/hotspot/share/runtime/objectMonitor.inline.hpp line 110: > 108: > 109: inline void ObjectMonitor::set_object(oop obj) { > 110: _object = (void*)obj; `cast_from_oop(obj)` here? 
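The two suggestions above can be sketched as follows. The helper templates here are simplified, hypothetical versions written only for illustration; the real cast_to_oop and cast_from_oop helpers in HotSpot may do additional work in debug builds, and the explicit <void*> template argument is an assumption about how the second suggestion would be spelled.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's oop types.
struct oopDesc {};
typedef oopDesc* oop;

// Simplified illustrative helpers; assumed shape, not the real definitions.
template <typename T> inline oop cast_to_oop(T value) { return (oop)value; }
template <typename T> inline T   cast_from_oop(oop o) { return (T)o; }

// Mock monitor whose accessors use the helpers instead of C-style casts.
struct MockMonitor {
  void* _object = nullptr;
  oop  object() const      { return cast_to_oop(_object); }
  void set_object(oop obj) { _object = cast_from_oop<void*>(obj); }
};
```

Routing the conversions through named helpers keeps them easy to find and gives a single place to add checking, at the cost of slightly more verbose accessors.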
------------- PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 18:42:51 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 18:42:51 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: On Thu, 10 Sep 2020 18:30:10 GMT, Aleksey Shipilev wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. > > src/hotspot/share/runtime/objectMonitor.inline.hpp line 102: > >> 100: >> 101: inline oop ObjectMonitor::object() const { >> 102: return (oop)_object; > > `cast_to_oop(_object)` here? This function will change again with the next changeset (JDK-8247281) so I'd rather stick with the C-style cast for this changeset. > src/hotspot/share/runtime/objectMonitor.inline.hpp line 110: > >> 108: >> 109: inline void ObjectMonitor::set_object(oop obj) { >> 110: _object = (void*)obj; > > `cast_from_oop(obj)` here? This function will change again with the next changeset (JDK-8247281) so I'd rather stick with the C-style cast for this changeset. ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From rehn at openjdk.java.net Thu Sep 10 19:57:51 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 10 Sep 2020 19:57:51 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: On Thu, 10 Sep 2020 18:37:05 GMT, Daniel D. Daugherty wrote: >> This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 >> testing along with JDK-8252980 and JDK-8247281. 
>> >> Since Erik and I are both contributors, we will need one other reviewer. >> >> This sub-task is tracking ObjectMonitor::object() cleanup changes >> extracted from Erik's work on JDK-8247281. This extraction is done >> to ease the code review for the JDK-8247281 changes. >> >> Here's the core cleanup: >> >> diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp >> --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 >> +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 >> @@ -328,9 +328,9 @@ >> >> public: >> >> - void* object() const; >> - void* object_addr(); >> - void set_object(void* obj); >> + oop object() const; >> + oop* object_addr(); >> + void set_object(oop obj); >> void release_set_allocation_state(AllocationState s); >> void set_allocation_state(AllocationState s); >> AllocationState allocation_state() const; >> >> and those type changes ripple into the other files. >> >> Note: The type for the ObjectMonitor::_object field is intentionally not >> being changed from "void*" in this changeset. That will be done in JDK-8247281. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. Marked as reviewed by rehn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From coleenp at openjdk.java.net Thu Sep 10 20:08:53 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 10 Sep 2020 20:08:53 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: <2EkkefcBDTV9Jleg6yd429eW4_VtkFdPsigK2Yxgmtc=.a1dd9db2-5c00-47ef-a745-d6ac6d091892@github.com> On Thu, 10 Sep 2020 18:37:05 GMT, Daniel D. Daugherty wrote: >> This is a trivial review request. 
This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 >> testing along with JDK-8252980 and JDK-8247281. >> >> Since Erik and I are both contributors, we will need one other reviewer. >> >> This sub-task is tracking ObjectMonitor::object() cleanup changes >> extracted from Erik's work on JDK-8247281. This extraction is done >> to ease the code review for the JDK-8247281 changes. >> >> Here's the core cleanup: >> >> diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp >> --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 >> +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 >> @@ -328,9 +328,9 @@ >> >> public: >> >> - void* object() const; >> - void* object_addr(); >> - void set_object(void* obj); >> + oop object() const; >> + oop* object_addr(); >> + void set_object(oop obj); >> void release_set_allocation_state(AllocationState s); >> void set_allocation_state(AllocationState s); >> AllocationState allocation_state() const; >> >> and those type changes ripple into the other files. >> >> Note: The type for the ObjectMonitor::_object field is intentionally not >> being changed from "void*" in this changeset. That will be done in JDK-8247281. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. Marked as reviewed by coleenp (Reviewer). src/hotspot/share/runtime/objectMonitor.inline.hpp line 95: > 93: assert(_waiters == 0, "must be 0: waiters=%d", _waiters); > 94: assert(_recursions == 0, "must be 0: recursions=" INTX_FORMAT, _recursions); > 95: assert(object() != NULL, "must be non-NULL"); In your next changeset, you'll want object to peek not resolve when it's an OopHandle here. 
------------- PR: https://git.openjdk.java.net/jdk/pull/114 From eosterlund at openjdk.java.net Thu Sep 10 20:08:54 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 10 Sep 2020 20:08:54 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: On Thu, 10 Sep 2020 18:37:05 GMT, Daniel D. Daugherty wrote: >> This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 >> testing along with JDK-8252980 and JDK-8247281. >> >> Since Erik and I are both contributors, we will need one other reviewer. >> >> This sub-task is tracking ObjectMonitor::object() cleanup changes >> extracted from Erik's work on JDK-8247281. This extraction is done >> to ease the code review for the JDK-8247281 changes. >> >> Here's the core cleanup: >> >> diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp >> --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 >> +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 >> @@ -328,9 +328,9 @@ >> >> public: >> >> - void* object() const; >> - void* object_addr(); >> - void set_object(void* obj); >> + oop object() const; >> + oop* object_addr(); >> + void set_object(oop obj); >> void release_set_allocation_state(AllocationState s); >> void set_allocation_state(AllocationState s); >> AllocationState allocation_state() const; >> >> and those type changes ripple into the other files. >> >> Note: The type for the ObjectMonitor::_object field is intentionally not >> being changed from "void*" in this changeset. That will be done in JDK-8247281. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. 
Marked as reviewed by eosterlund (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From eosterlund at openjdk.java.net Thu Sep 10 20:08:54 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 10 Sep 2020 20:08:54 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: <2EkkefcBDTV9Jleg6yd429eW4_VtkFdPsigK2Yxgmtc=.a1dd9db2-5c00-47ef-a745-d6ac6d091892@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> <2EkkefcBDTV9Jleg6yd429eW4_VtkFdPsigK2Yxgmtc=.a1dd9db2-5c00-47ef-a745-d6ac6d091892@github.com> Message-ID: On Thu, 10 Sep 2020 20:02:01 GMT, Coleen Phillimore wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. > > Marked as reviewed by coleenp (Reviewer). Looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 20:08:55 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 20:08:55 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: <2EkkefcBDTV9Jleg6yd429eW4_VtkFdPsigK2Yxgmtc=.a1dd9db2-5c00-47ef-a745-d6ac6d091892@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> <2EkkefcBDTV9Jleg6yd429eW4_VtkFdPsigK2Yxgmtc=.a1dd9db2-5c00-47ef-a745-d6ac6d091892@github.com> Message-ID: On Thu, 10 Sep 2020 20:01:54 GMT, Coleen Phillimore wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. 
> > src/hotspot/share/runtime/objectMonitor.inline.hpp line 95: > >> 93: assert(_waiters == 0, "must be 0: waiters=%d", _waiters); >> 94: assert(_recursions == 0, "must be 0: recursions=" INTX_FORMAT, _recursions); >> 95: assert(object() != NULL, "must be non-NULL"); > > In your next changeset, you'll want object to peek not resolve when it's an OopHandle here. Actually that assert is gone in JDK-8247281. Since the _object field becomes a weak handle, we cannot assert that object() is not NULL there. ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From dcubed at openjdk.java.net Thu Sep 10 20:54:04 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 10 Sep 2020 20:54:04 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v2] In-Reply-To: References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: <6Qrw4aItbLiOY7EiEs2qX2RG7knfgy_soggF5UvfmtA=.934360b7-96d8-46ce-a983-fc794c1a0e94@github.com> On Thu, 10 Sep 2020 18:40:15 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/runtime/objectMonitor.inline.hpp line 102: >> >>> 100: >>> 101: inline oop ObjectMonitor::object() const { >>> 102: return (oop)_object; >> >> `cast_to_oop(_object)` here? > > This function will change again with the next changeset (JDK-8247281) > so I'd rather stick with the C-style cast for this changeset. I kicked off a Mach5 Tier[1-3] and all the usual builds have passed. >> src/hotspot/share/runtime/objectMonitor.inline.hpp line 110: >> >>> 108: >>> 109: inline void ObjectMonitor::set_object(oop obj) { >>> 110: _object = (void*)obj; >> >> `cast_from_oop(obj)` here? > > This function will change again with the next changeset (JDK-8247281) > so I'd rather stick with the C-style cast for this changeset. I kicked off a Mach5 Tier[1-3] and all the usual builds have passed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/114 From lmesnik at openjdk.java.net Thu Sep 10 23:44:36 2020 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 10 Sep 2020 23:44:36 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= Message-ID: The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing with -XX:+CheckUnhandledOops. ------------- Commit messages: - 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...) Changes: https://git.openjdk.java.net/jdk/pull/123/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=123&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253033 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/123/head:pull/123 PR: https://git.openjdk.java.net/jdk/pull/123 From adityam at openjdk.java.net Thu Sep 10 23:54:30 2020 From: adityam at openjdk.java.net (Aditya Mandaleeka) Date: Thu, 10 Sep 2020 23:54:30 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 12:15:57 GMT, Robbin Ehn wrote: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. > > Builds runs locally, running T1 now. Marked as reviewed by adityam (Author). 
------------- PR: https://git.openjdk.java.net/jdk/pull/111 From coleenp at openjdk.java.net Fri Sep 11 00:22:44 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 00:22:44 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 23:38:45 GMT, Leonid Mesnik wrote: > The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing > with -XX:+CheckUnhandledOops. Changes requested by coleenp (Reviewer). src/hotspot/share/services/threadService.cpp line 888: > 886: _thread_status == java_lang_Thread::IN_OBJECT_WAIT_TIMED) { > 887: > 888: Handle obj = ThreadService::get_current_contended_monitor(thread); There must be a safepoint here then. I think this would be better and safer if blocker_object and blocker_object_owner are Handles. Can you change them to Handles? ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From ysuenaga at openjdk.java.net Fri Sep 11 00:26:07 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 11 Sep 2020 00:26:07 GMT Subject: RFR: 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed In-Reply-To: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> References: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> Message-ID: On Sat, 5 Sep 2020 14:26:17 GMT, Yasumasa Suenaga wrote: > If `Agent_OnAttach()` in JVMTI agent which is attempted to load via JVMTI.agent_load dcmd is failed, it would not be > unloaded. We've [discussed it on > serviceability-dev](https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-September/032839.html). This PR is > a continuation of that. This PR also includes to call `Agent_OnUnload()` when `Agent_OnAttach()` failed. > > How to reproduce: > > 1. 
Build JVMTI agent for test > $ git clone https://github.com/YaSuenag/jvmti-examples.git > $ cd jvmti-examples/helloworld/out/build > $ cmake ../.. > > 2. Run JShell > > 3. Load JVMTI agent via `jcmd JVMTI.agent_load` with "error" ("error" means `Agent_OnAttach()` returns JNI_ERR) > $ jcmd > 89456 jdk.jshell.execution.RemoteExecutionControl 45651 > 89547 sun.tools.jcmd.JCmd > 89436 jdk.jshell/jdk.internal.jshell.tool.JShellToolProvider > $ jcmd 89436 JVMTI.agent_load `pwd`/libhelloworld.so error > 89436: > return code: -1 > > 4. Check loaded libraries via `jcmd VM.dynlibs` > $ jcmd 89436 VM.dynlibs | grep libhelloworld > 7f2f8b06b000-7f2f8b06c000 r--p 00000000 fd:00 11818202 > /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06c000-7f2f8b06d000 r-xp 00001000 > fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06d000-7f2f8b06e000 > r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so > 7f2f8b06e000-7f2f8b06f000 r--p 00002000 fd:00 11818202 > /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06f000-7f2f8b070000 rw-p 00003000 > fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so @edvbld Can you approve me to run tier1 tests with /test PR command again? ------------- PR: https://git.openjdk.java.net/jdk/pull/19 From ysuenaga at openjdk.java.net Fri Sep 11 00:26:07 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 11 Sep 2020 00:26:07 GMT Subject: RFR: 8252657: JVMTI agent is not unloaded when Agent_OnAttach is failed Message-ID: <1H1wUQdxCLU2qddqEIYSx2iOhIKL3b5etUmjsS6NBlU=.0bf1fe0c-8dcf-4ca0-bd57-b8794d5f2810@github.com> If `Agent_OnAttach()` in JVMTI agent which is attempted to load via JVMTI.agent_load dcmd is failed, it would not be unloaded. 
We've [discussed it on serviceability-dev](https://mail.openjdk.java.net/pipermail/serviceability-dev/2020-September/032839.html). This PR is a continuation of that. This PR also includes to call `Agent_OnUnload()` when `Agent_OnAttach()` failed. How to reproduce: 1. Build JVMTI agent for test $ git clone https://github.com/YaSuenag/jvmti-examples.git $ cd jvmti-examples/helloworld/out/build $ cmake ../.. 2. Run JShell 3. Load JVMTI agent via `jcmd JVMTI.agent_load` with "error" ("error" means `Agent_OnAttach()` returns JNI_ERR) $ jcmd 89456 jdk.jshell.execution.RemoteExecutionControl 45651 89547 sun.tools.jcmd.JCmd 89436 jdk.jshell/jdk.internal.jshell.tool.JShellToolProvider $ jcmd 89436 JVMTI.agent_load `pwd`/libhelloworld.so error 89436: return code: -1 4. Check loaded libraries via `jcmd VM.dynlibs` $ jcmd 89436 VM.dynlibs | grep libhelloworld 7f2f8b06b000-7f2f8b06c000 r--p 00000000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06c000-7f2f8b06d000 r-xp 00001000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06d000-7f2f8b06e000 r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06e000-7f2f8b06f000 r--p 00002000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so 7f2f8b06f000-7f2f8b070000 rw-p 00003000 fd:00 11818202 /home/ysuenaga/github/jvmti-examples/helloworld/out/build/libhelloworld.so ------------- Commit messages: - JVMTI agent is not unloaded when Agent_OnAttach is failed Changes: https://git.openjdk.java.net/jdk/pull/19/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=19&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252657 Stats: 44 lines in 4 files changed: 32 ins; 6 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/19.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/19/head:pull/19 PR: 
https://git.openjdk.java.net/jdk/pull/19 From david.holmes at oracle.com Fri Sep 11 02:29:35 2020 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Sep 2020 12:29:35 +1000 Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> Message-ID: Hi Robbin, On 11/09/2020 2:34 am, Robbin Ehn wrote: > GC Alot can only be performed by a Java Thread: > > void InterfaceSupport::gc_alot() { > Thread *thread = Thread::current(); > if (!thread->is_Java_thread()) return; // Avoid concurrent calls > > Lets remove the dummy call from the VM thread. Very odd. That code has never worked. Fix looks good. Thanks, David ----- > ------------- > > Commit messages: > - Removed code > > Changes: https://git.openjdk.java.net/jdk/pull/113/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=113&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253026 > Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod > Patch: https://git.openjdk.java.net/jdk/pull/113.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/113/head:pull/113 > > PR: https://git.openjdk.java.net/jdk/pull/113 > From dholmes at openjdk.java.net Fri Sep 11 02:32:33 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Sep 2020 02:32:33 GMT Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> Message-ID: <71LWvPK_OnOsL_8P71dsv0N7aPvN4VPB45UyZAjjWl4=.f8b17016-fb0f-451a-8dc9-5c307bef60f3@github.com> On Thu, 10 Sep 2020 16:27:54 GMT, Robbin Ehn wrote: > GC Alot can only be performed by a Java Thread: > > void 
InterfaceSupport::gc_alot() { > Thread *thread = Thread::current(); > if (!thread->is_Java_thread()) return; // Avoid concurrent calls > > Lets remove the dummy call from the VM thread. Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/113 From dcubed at openjdk.java.net Fri Sep 11 02:38:48 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 02:38:48 GMT Subject: RFR: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 [v3] In-Reply-To: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: > This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 > testing along with JDK-8252980 and JDK-8247281. > > Since Erik and I are both contributors, we will need one other reviewer. > > This sub-task is tracking ObjectMonitor::object() cleanup changes > extracted from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. > > Here's the core cleanup: > > diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp > --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 > +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 > @@ -328,9 +328,9 @@ > > public: > > - void* object() const; > - void* object_addr(); > - void set_object(void* obj); > + oop object() const; > + oop* object_addr(); > + void set_object(oop obj); > void release_set_allocation_state(AllocationState s); > void set_allocation_state(AllocationState s); > AllocationState allocation_state() const; > > and those type changes ripple into the other files. > > Note: The type for the ObjectMonitor::_object field is intentionally not > being changed from "void*" in this changeset. That will be done in JDK-8247281. 
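As a rough illustration of the accessor change in the diff above, here is a minimal C++ sketch. It is not the HotSpot code: `oopDesc`, `ObjectMonitorSketch` and `read_payload` are hypothetical stand-ins, and the real `oop` type is considerably richer. It only shows the shape of the cleanup, with typed accessors over a field that stays `void*`, as the note in the description says.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's object types (assumption, not the real definitions).
struct oopDesc { int payload; };
typedef oopDesc* oop;

class ObjectMonitorSketch {
  void* _object = nullptr;  // the field itself intentionally stays void*
 public:
  // After the cleanup the accessors speak in terms of oop, not void*.
  oop object() const { return static_cast<oop>(_object); }
  void set_object(oop obj) { _object = obj; }
};

// A caller no longer needs a cast at the use site; this is the kind of
// change that ripples into the other files.
int read_payload(const ObjectMonitorSketch& monitor) {
  oop obj = monitor.object();
  assert(obj != nullptr);
  return obj->payload;
}
```

With `void*` accessors every such caller would carry its own cast; moving the cast into the accessor is what keeps the follow-up review smaller.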
Daniel D. Daugherty has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Merge branch 'master' into JDK-8252981 - Remove left over setting of GVars.stw_random; fixed separately via JDK-8252126. - 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 ------------- Changes: https://git.openjdk.java.net/jdk/pull/114/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=114&range=02 Stats: 44 lines in 8 files changed: 0 ins; 0 del; 44 mod Patch: https://git.openjdk.java.net/jdk/pull/114.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/114/head:pull/114 PR: https://git.openjdk.java.net/jdk/pull/114 From dholmes at openjdk.java.net Fri Sep 11 02:56:26 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Sep 2020 02:56:26 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: <7ri_JxWmuRZPFJA7c89UL2zfkNYd9sIrXz8SKPWu--4=.7e7d98c3-7a9f-4db9-9c87-09d00be03f83@github.com> On Fri, 11 Sep 2020 02:53:51 GMT, Leonid Mesnik wrote: >> The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing >> with -XX:+CheckUnhandledOops. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...) Changes requested by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/123 From lmesnik at openjdk.java.net Fri Sep 11 02:56:26 2020 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Fri, 11 Sep 2020 02:56:26 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: > The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing > with -XX:+CheckUnhandledOops. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/123/files - new: https://git.openjdk.java.net/jdk/pull/123/files/0db863a3..c89edef2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=123&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=123&range=00-01 Stats: 6 lines in 1 file changed: 2 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/123/head:pull/123 PR: https://git.openjdk.java.net/jdk/pull/123 From lmesnik at openjdk.java.net Fri Sep 11 02:56:27 2020 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Fri, 11 Sep 2020 02:56:27 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 02:49:04 GMT, David Holmes wrote: >> src/hotspot/share/services/threadService.cpp line 888: >> >>> 886: _thread_status == java_lang_Thread::IN_OBJECT_WAIT_TIMED) { >>> 887: >>> 888: Handle obj = ThreadService::get_current_contended_monitor(thread); >> >> There must be a safepoint here then. >> I think this would be better and safer if blocker_object and blocker_object_owner are Handles. Can you change them to >> Handles? 
> > I can't see anywhere a safepoint check would occur in that code. This issue was flagged as being in Loom so perhaps the > loom code is different and is what introduces the safepoint check? But I agree with Coleen that the best solution is to > just use Handles. It is not Loom-specific; it reproduces in jdk/jdk with -XX:+CheckUnhandledOops. ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From dholmes at openjdk.java.net Fri Sep 11 02:56:27 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Sep 2020 02:56:27 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 00:19:47 GMT, Coleen Phillimore wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...) > > src/hotspot/share/services/threadService.cpp line 888: > >> 886: _thread_status == java_lang_Thread::IN_OBJECT_WAIT_TIMED) { >> 887: >> 888: Handle obj = ThreadService::get_current_contended_monitor(thread); > > There must be a safepoint here then. > I think this would be better and safer if blocker_object and blocker_object_owner are Handles. Can you change them to > Handles? I can't see anywhere a safepoint check would occur in that code. This issue was flagged as being in Loom so perhaps the loom code is different and is what introduces the safepoint check? But I agree with Coleen that the best solution is to just use Handles.
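For readers less familiar with why reviewers prefer Handles across a possible safepoint, the following toy C++ model may help. It is only a sketch under stated assumptions: `ToyHeap`, `make_handle`, `resolve`, `address_of` and `collect` are hypothetical names, and this is not how HotSpot implements `Handle` or CheckUnhandledOops. It shows the general idea that a moving collector updates the slots it knows about, so an indirection through such a slot stays valid while a raw address captured earlier does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

struct Obj { int value; };

// Toy heap: every slot is a root the "collector" knows about and updates.
class ToyHeap {
  std::vector<std::unique_ptr<Obj>> _slots;
 public:
  // Allocate an object and return a handle (an index into the slot table).
  std::size_t make_handle(int value) {
    _slots.push_back(std::make_unique<Obj>(Obj{value}));
    return _slots.size() - 1;
  }
  // Resolve a handle to the object's current location.
  Obj* resolve(std::size_t handle) const {
    assert(handle < _slots.size());
    return _slots[handle].get();
  }
  // Current address as an integer, so callers can observe a move
  // without keeping a stale pointer around.
  std::uintptr_t address_of(std::size_t handle) const {
    return reinterpret_cast<std::uintptr_t>(resolve(handle));
  }
  // Simulated moving collection: copy each object to a new location
  // (allocated while the old one is still live) and update the slot.
  void collect() {
    for (auto& slot : _slots) {
      slot = std::make_unique<Obj>(Obj{slot->value});
    }
  }
};
```

A raw `Obj*` taken before `collect()` would point at freed memory afterwards, whereas `resolve(handle)` still reaches the object. That is the property the review comments rely on when asking for `blocker_object` and `blocker_object_owner` to become Handles.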
------------- PR: https://git.openjdk.java.net/jdk/pull/123 From lmesnik at openjdk.java.net Fri Sep 11 02:56:27 2020 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Fri, 11 Sep 2020 02:56:27 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 02:52:15 GMT, Leonid Mesnik wrote: >> I can't see anywhere a safepoint check would occur in that code. This issue was flagged as being in Loom so perhaps the >> loom code is different and is what introduces the safepoint check? But I agree with Coleen that the best solution is to >> just use Handles. > > It is not Loom-specific; it reproduces in jdk/jdk with -XX:+CheckUnhandledOops. What do you think about moving Handle obj = ThreadService::get_current_contended_monitor(thread); out of the scope in which the blocker_object oop is visible? That is what my second patch does. ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From dholmes at openjdk.java.net Fri Sep 11 03:34:34 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Sep 2020 03:34:34 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: <6Tyl05gVn7uc-u4kzrEeU124xRXEcs4d4jb7SBkmAFU=.44c9828c-6a04-494f-83d9-ffa667af07cc@github.com> On Fri, 11 Sep 2020 02:53:51 GMT, Leonid Mesnik wrote: >> It is not Loom-specific; it reproduces in jdk/jdk with -XX:+CheckUnhandledOops. > > What do you think about moving > Handle obj = ThreadService::get_current_contended_monitor(thread); > out of the scope in which the blocker_object oop is visible? > That is what my second patch does. I'm missing something. How can a NULL oop get corrupted even if there is a GC?
------------- PR: https://git.openjdk.java.net/jdk/pull/123 From lmesnik at openjdk.java.net Fri Sep 11 03:40:42 2020 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Fri, 11 Sep 2020 03:40:42 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: <6Tyl05gVn7uc-u4kzrEeU124xRXEcs4d4jb7SBkmAFU=.44c9828c-6a04-494f-83d9-ffa667af07cc@github.com> References: <6Tyl05gVn7uc-u4kzrEeU124xRXEcs4d4jb7SBkmAFU=.44c9828c-6a04-494f-83d9-ffa667af07cc@github.com> Message-ID: On Fri, 11 Sep 2020 03:31:54 GMT, David Holmes wrote: >> What do you think about moving >> Handle obj = ThreadService::get_current_contended_monitor(thread); >> out of the scope in which the blocker_object oop is visible? >> That is what my second patch does. > > I'm missing something. How can a NULL oop get corrupted even if there is a GC? This is specific to CheckUnhandledOops. As I wrote in the bug comment: "Another possible fix would be to disable corruption of NULL unhandled oops. They couldn't be changed really." I discussed it with Coleen, and moving NULL oops out of the range of a possible safepoint, or wrapping them in Handles, seems an easier option than changing unhandledOops.cpp so that it does not corrupt NULL. The relevant code is here: https://github.com/openjdk/jdk/blob/77bdc3065057b07a676b010562c89bb0f21512b7/src/hotspot/share/runtime/unhandledOops.cpp#L113 ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From avoitylov at openjdk.java.net Fri Sep 11 07:03:37 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 11 Sep 2020 07:03:37 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: > continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html > >> The downside of using JNI in these tests is that it complicates the >> setup a bit for those that run jtreg directly and/or just build the JDK >> and not the test libraries.
You could reduce this burden a bit by >> limiting the load library/isMusl check to Linux only, meaning isMusl >> would not be called on other platforms. >> >> The alternative you suggest above might indeed be better. I assume you >> don't mean splitting the tests but rather just adding a second @test >> description so that the vm.musl case runs the test with a system >> property that allows the test know the expected load library path behavior. > > I have updated the PR to split the two tests in multiple @test s. > >> The updated comment in java_md.c in this looks good. A minor comment on >> Platform.isBusybox is Files.isSymbolicLink returning true implies that >> the link exists so no need to check for exists too. Also the >> if-then-else style for the new class in ProcessBuilder/Basic.java is >> inconsistent with the rest of the test so it stands out. > > Thank you, these changes are done in the updated PR. > >> Given the repo transition this weekend then I assume you'll create a PR >> for the final review at least. Also I see JEP 386 hasn't been targeted >> yet but I assume Boris, as owner, will propose-to-target and wait for it >> to be targeted before it is integrated. > > Yes. How can this be best accomplished with the new git workflow? > - we can continue the review process till the end and I will request the integration to happen only after the JEP is > targeted. I guess this step is now done by typing "slash integrate" in a comment. > - we can pause the review process now until the JEP is targeted. > > In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. 
Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: JDK-8247589: Implementation of Alpine Linux/x64 Port ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/49/files - new: https://git.openjdk.java.net/jdk/pull/49/files/f61f546a..d5994cb5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=49&range=00-01 Stats: 19 lines in 4 files changed: 7 ins; 4 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/49.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/49/head:pull/49 PR: https://git.openjdk.java.net/jdk/pull/49 From rehn at openjdk.java.net Fri Sep 11 07:10:33 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 11 Sep 2020 07:10:33 GMT Subject: RFR: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: <71LWvPK_OnOsL_8P71dsv0N7aPvN4VPB45UyZAjjWl4=.f8b17016-fb0f-451a-8dc9-5c307bef60f3@github.com> References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> <71LWvPK_OnOsL_8P71dsv0N7aPvN4VPB45UyZAjjWl4=.f8b17016-fb0f-451a-8dc9-5c307bef60f3@github.com> Message-ID: On Fri, 11 Sep 2020 02:30:06 GMT, David Holmes wrote: >> GC Alot can only be performed by a Java Thread: >> >> void InterfaceSupport::gc_alot() { >> Thread *thread = Thread::current(); >> if (!thread->is_Java_thread()) return; // Avoid concurrent calls >> >> Lets remove the dummy call from the VM thread. > > Marked as reviewed by dholmes (Reviewer). Thanks @coleenp and @dholmes-ora. Integrating as it is pretty trivial and have two reviews. 
------------- PR: https://git.openjdk.java.net/jdk/pull/113 From rehn at openjdk.java.net Fri Sep 11 07:10:34 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 11 Sep 2020 07:10:34 GMT Subject: Integrated: 8253026: Remove dummy call to gc alot from VM Thread In-Reply-To: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> References: <-2Kgb6wnJvEUf-iJeRD1wu6skhCODqKu7BzkAVs7al8=.92338a28-898d-4b31-8ec3-3ae7d149af18@github.com> Message-ID: On Thu, 10 Sep 2020 16:27:54 GMT, Robbin Ehn wrote: > GC Alot can only be performed by a Java Thread: > > void InterfaceSupport::gc_alot() { > Thread *thread = Thread::current(); > if (!thread->is_Java_thread()) return; // Avoid concurrent calls > > Lets remove the dummy call from the VM thread. This pull request has now been integrated. Changeset: c7062dc2 Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/c7062dc2 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod 8253026: Remove dummy call to gc alot from VM Thread Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/113 From rehn at openjdk.java.net Fri Sep 11 07:14:18 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 11 Sep 2020 07:14:18 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold [v2] In-Reply-To: References: Message-ID: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. > > Builds runs locally, running T1 now. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains two commits: - Merge branch 'master' into 8253008-long-compile-flags - Removed ------------- Changes: https://git.openjdk.java.net/jdk/pull/111/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=111&range=01 Stats: 23 lines in 2 files changed: 0 ins; 22 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/111.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/111/head:pull/111 PR: https://git.openjdk.java.net/jdk/pull/111 From rehn at openjdk.java.net Fri Sep 11 07:23:46 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 11 Sep 2020 07:23:46 GMT Subject: RFR: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold [v2] In-Reply-To: References: Message-ID: On Thu, 10 Sep 2020 23:52:03 GMT, Aditya Mandaleeka wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The pull request now >> contains two commits: >> - Merge branch 'master' into 8253008-long-compile-flags >> - Removed > > Marked as reviewed by adityam (Author). Thanks @adityamandaleeka! Conflict was trivial, build locally after merge, integrating now. ------------- PR: https://git.openjdk.java.net/jdk/pull/111 From rehn at openjdk.java.net Fri Sep 11 07:26:52 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 11 Sep 2020 07:26:52 GMT Subject: Integrated: 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold In-Reply-To: References: Message-ID: <3riYsjR1mWK7MUc5EAlZ3nOTEASURR0DBGmMq7-Gx3Q=.f814c5fc-4bd2-4a42-aa64-d44276e5f531@github.com> On Thu, 10 Sep 2020 12:15:57 GMT, Robbin Ehn wrote: > These flags make little sense. > They measure non safepointing VM ops, that's only handshake now. > Which have little relation to compiles. > > Builds runs locally, running T1 now. This pull request has now been integrated. 
Changeset: 8777ded1 Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/8777ded1 Stats: 23 lines in 2 files changed: 0 ins; 22 del; 1 mod 8253008: Remove develop flags TraceLongCompiles/LongCompileThreshold Reviewed-by: shade, dholmes, adityam ------------- PR: https://git.openjdk.java.net/jdk/pull/111 From avoitylov at openjdk.java.net Fri Sep 11 07:39:47 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 11 Sep 2020 07:39:47 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: On Tue, 8 Sep 2020 23:44:58 GMT, David Holmes wrote: >> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8247589: Implementation of Alpine Linux/x64 Port > > make/autoconf/platform.m4 line 536: > >> 534: AC_SUBST(HOTSPOT_$1_CPU_DEFINE) >> 535: >> 536: if test "x$OPENJDK_$1_LIBC" = "xmusl"; then > > I'm not clear why we only check for musl when setting the HOTSPOT_$1_LIBC variable This check is removed in the updated version. As a consequence, the LIBC variable is added to the release file for all platforms, which is probably a good thing. > src/hotspot/os/linux/os_linux.cpp line 624: > >> 622: // confstr() from musl libc returns EINVAL for >> 623: // _CS_GNU_LIBC_VERSION and _CS_GNU_LIBPTHREAD_VERSION >> 624: os::Linux::set_libc_version("unknown"); > > This should be "musl - unknown" as we don't know an exact version but we do know that it is musl. Right, this should be more consistent with glibc which here returns name and version. Updated as suggested. > src/hotspot/os/linux/os_linux.cpp line 625: > >> 623: // _CS_GNU_LIBC_VERSION and _CS_GNU_LIBPTHREAD_VERSION >> 624: os::Linux::set_libc_version("unknown"); >> 625: os::Linux::set_libpthread_version("unknown"); > > This should be "musl - unknown" as we don't know an exact version but we do know that it is musl. The pthread version is also updated to "musl - unknown".
Reason being, pthread functionality for musl is built into the library. > src/hotspot/share/runtime/abstract_vm_version.cpp line 263: > >> 261: #define LIBC_STR "-" XSTR(LIBC) >> 262: #else >> 263: #define LIBC_STR "" > > Again I'm not clear why we do nothing in the non-musl case? Shouldn't we be reporting glibc or musl? Unlike the case above, I think it's best to keep it as is. I'd expect there to be a bunch of scripts in the wild which parse it and may get broken when facing a triplet for existing platforms. > src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c line 284: > >> 282: // To improve portability across platforms and avoid conflicts >> 283: // between GNU and XSI versions of strerror_r, plain strerror is used. >> 284: // It's safe because this code is not used in any multithreaded environment. > > I still question this assertion. The issue is not that the current code path that leads to strerror use may be executed > concurrently but that any other strerror use could be concurrent with this one. I would consider this a "must fix" if > not for the fact we already use strerror in the code and so this doesn't really change the exposure to the problem. You are right! The updated version #ifdefs the XSI or GNU versions of strerror_r in this place. Note to self: file a bug to address the usage of strerror in other places, at least for HotSpot. > test/hotspot/jtreg/runtime/StackGuardPages/exeinvoke.c line 282: > >> 280: >> 281: pthread_attr_init(&thread_attr); >> 282: pthread_attr_setstacksize(&thread_attr, stack_size); > > Just a comment in response to the explanation as to why this change is needed. If the default thread stacksize under > musl is insufficient to successfully attach such a thread to the VM then this will cause problems for applications that > embed the VM directly (or which otherwise directly attach existing threads). 
This fix https://git.musl-libc.org/cgit/musl/commit/src/aio/aio.c?id=1a6d6f131bd60ec2a858b34100049f0c042089f2 addresses the problem for recent versions of musl. The test passes on a recent Alpine Linux 3.11.6 (musl 1.1.24) and fails on Alpine Linux 3.8.2 (musl 1.1.19) without this test fix. There are still older versions of the library in the wild, hence the test fix. The mitigation for such users would be a distro upgrade. > test/hotspot/jtreg/runtime/TLS/exestack-tls.c line 60: > >> 58: } >> 59: >> 60: #if defined(__GLIBC) > > Why do we use this form here but at line 30 we have: > #ifdef __GLIBC__ > ? Fixed to be consistent. ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From dcubed at openjdk.java.net Fri Sep 11 13:39:02 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 13:39:02 GMT Subject: Integrated: 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 In-Reply-To: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> References: <93bdwZv6icG8g5M9x7RnUqrh1TYktKJtckdXSh2fLDE=.c03b5ab9-9f77-4436-97ff-a90c1667e9f0@github.com> Message-ID: On Thu, 10 Sep 2020 16:56:17 GMT, Daniel D. Daugherty wrote: > This is a trivial review request. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 > testing along with JDK-8252980 and JDK-8247281. > > Since Erik and I are both contributors, we will need one other reviewer. > > This sub-task is tracking ObjectMonitor::object() cleanup changes > extracted from Erik's work on JDK-8247281. This extraction is done > to ease the code review for the JDK-8247281 changes. 
> > Here's the core cleanup: > > diff -r fd7f6a424cd1 src/hotspot/share/runtime/objectMonitor.hpp > --- a/src/hotspot/share/runtime/objectMonitor.hpp Fri Aug 28 16:43:09 2020 -0400 > +++ b/src/hotspot/share/runtime/objectMonitor.hpp Wed Sep 02 17:22:56 2020 -0400 > @@ -328,9 +328,9 @@ > > public: > > - void* object() const; > - void* object_addr(); > - void set_object(void* obj); > + oop object() const; > + oop* object_addr(); > + void set_object(oop obj); > void release_set_allocation_state(AllocationState s); > void set_allocation_state(AllocationState s); > AllocationState allocation_state() const; > > and those type changes ripple into the other files. > > Note: The type for the ObjectMonitor::_object field is intentionally not > being changed from "void*" in this changeset. That will be done in JDK-8247281. This pull request has now been integrated. Changeset: e7a1b9bf Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/e7a1b9bf Stats: 44 lines in 8 files changed: 0 ins; 0 del; 44 mod 8252981: ObjectMonitor::object() cleanup changes extracted from JDK-8247281 Co-authored-by: Erik Österlund Co-authored-by: Daniel Daugherty Reviewed-by: rehn, coleenp, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/114 From harold.seigel at oracle.com Fri Sep 11 17:16:59 2020 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 11 Sep 2020 13:16:59 -0400 Subject: RFR 8250984: Memory Docker tests fail on some Linux kernels w/o swap limit capabilities In-Reply-To: <45eb62e60fbee8edce86df7f35fbce7afd098dd0.camel@redhat.com> References: <8E9ADAB4-1877-40EF-9BB0-D35A30572AE8@oracle.com> <1dc4eec9-32eb-671a-ba3e-4bdea8f1d741@oracle.com> <8781C429-52EB-4FF5-8BD4-67180BA55AB6@oracle.com> <45eb62e60fbee8edce86df7f35fbce7afd098dd0.camel@redhat.com> Message-ID: Thanks Bob, Severin! I'm glad we are in agreement.
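The option agreed on in this thread (report -2 for "not supported" and check for it only after the usual non-negative test) can be sketched roughly as follows. The code under discussion is Java (`getTotalSwapSpaceSize()` in `OperatingSystemImpl.java`); this is a C++ transliteration kept deliberately small, and `kNotSupported`, `total_swap_space` and the `host_swap` fallback parameter are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// -2 is the "metric not supported" value from the Metrics javadoc quoted in this thread.
const long long kNotSupported = -2;

// mem_and_swap_limit / mem_limit: container limits in bytes, negative if unavailable.
// host_swap: value to fall back to when no container limit applies (assumption).
long long total_swap_space(long long mem_and_swap_limit,
                           long long mem_limit,
                           long long host_swap) {
  assert(host_swap >= 0);
  // Common case first: both limits are known, so swap is the difference.
  if (mem_and_swap_limit >= 0 && mem_limit >= 0) {
    return mem_and_swap_limit - mem_limit;  // may be 0 when the limits are equal
  }
  // Only reached when a limit is negative, so the extra check is off the fast path.
  if (mem_and_swap_limit == kNotSupported) {
    return kNotSupported;
  }
  return host_swap;
}
```

Placing the -2 test after the non-negative test matches the suggestion to avoid extra overhead: callers that see valid limits never execute it.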
Harold On 9/10/2020 11:39 AM, Severin Gehwolf wrote: > On Thu, 2020-09-10 at 11:05 -0400, Bob Vandette wrote: >> Harold, >> >> I prefer the second approach since it's consistent with the original specification of the Metrics APIs. >> You should be able to check for the -2 case in OperatingSystemImpl.java after the limits are tested for >> >= 0 in order to avoid adding any extra overhead. >> public long getTotalSwapSpaceSize() { >> if (containerMetrics != null) { >> long limit = containerMetrics.getMemoryAndSwapLimit(); >> >> if (limit == CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED) { // not supported >> return CgroupSubsystem.LONG_RETVAL_NOT_SUPPORTED; >> } >> >> // The memory limit metrics is not available if JVM runs on Linux host (not in a docker container) >> // or if a docker container was started without specifying a memory limit (without '--memory=' >> // Docker option). In latter case there is no limit on how much memory the container can use and >> // it can use as much memory as the host's OS allows. >> long memLimit = containerMetrics.getMemoryLimit(); >> if (limit >= 0 && memLimit >= 0) { >> return limit - memLimit; // might potentially be 0 for limit == memLimit >> } >> [HERE] >> } >> return getTotalSwapSpaceSize0(); >> } >> > I agree. Option 2 seems the preferred one for me too. One additional > consideration would be whether or not there are other cases where > cgroup files are missing. IIRC cases for their existence are different > between cgroup v1 and cgroup v2. > > Thanks, > Severin > >> If you go down this path, please check through the other container & docker tests to see if there are other cases >> where the specific message text is parsed. I believe there was at least one other case of this. >> >> Bob. >> >> >>> On Sep 10, 2020, at 10:52 AM, Harold Seigel wrote: >>> >>> Hi Bob, >>> >>> I came up with these ways to handle the test failures when swap limiting is disabled (JDK-8250984).
Please let me know if any of them sound viable.
>>>
>>> One way is to add logging to CgroupSubsystemController when it fails to open a file such as .../memsw.limit_in_bytes. The tests would enable logging and then look for these messages to determine if swap limiting was disabled. This is yet another string for the tests to parse, but the JDK controls the contents of the strings, so there is less concern about them changing. Here's a webrev showing this potential change:
>>>
>>> http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.log/webrev/index.html
>>>
>>> Another way is for methods such as CgroupSubsystemController.getLongEntry() to return a -2 status, indicating not-implemented, when it cannot access a file. The callers of getLongEntry() could then decide whether or not to propagate that status back to their callers, or return some other default. A partial webrev for that change is here:
>>>
>>> http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr.RetVal/webrev/index.html
>>>
>>> We could also change methods such as CgroupV1Subsystem.getMemoryAndSwapLimit() to explicitly check for the existence of the files they want to read from and return -2 if the check fails. This may have a performance impact?
>>>
>>> Thanks, Harold
>>>
>>> On 9/1/2020 12:04 PM, Bob Vandette wrote:
>>>> I really dislike encoding all these strings in our tests that could possibly change.
>>>>
>>>> I wish we did something like check for the existence of /sys/fs/cgroup/memory/memsw.limit_in_bytes
>>>> assuming that this file is not present when swap limiting is disabled. The problem with this approach
>>>> and yours is that we need to make sure that these fixes can run on docker, podman, cgroupv1 and cgroupv2.
>>>>
>>>> Others are struggling with these types of issues ?
>>>> >>>> >>>> https://urldefense.com/v3/__https://github.com/containers/podman/issues/6365__;!!GqivPVa7Brio!Ld_W04WaQr2HqSF6HkoXu4VlvBPFdipdocJSMW4hNVz21MPdxrMQmMNqPPWW50xx1Q$ >>>> >>>> >>>> The Metrics API I added provides for the possibility that the call to getMemoryAndSwapLimit >>>> could fail. Perhaps the test should be checking for not supported and fix the API implementation >>>> to report the correct error (if it doesn?t already). >>>> >>>> /** >>>> * Returns the maximum amount of physical memory and swap space, >>>> * in bytes, that can be allocated in the Isolation Group. >>>> * >>>> * @return The maximum amount of memory in bytes or -1 if >>>> * there is no limit set or -2 if this metric is not supported. >>>> * >>>> */ >>>> public long getMemoryAndSwapLimit(); >>>> >>>> My .02$ >>>> >>>> Bob. >>>> >>>> >>>>> On Sep 1, 2020, at 11:31 AM, Harold Seigel >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> Please review this fix to enable docker tests TestMemoryAwareness.java and TestDockerMemoryMetrics.java to run on Linux kernels configured without swap limit capabilities. >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8250984.dkr/webrev/index.html >>>>> >>>>> >>>>> JBS Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8250984 >>>>> >>>>> >>>>> The modified tests were run on Linux kernels with and without swap limit capabilities. >>>>> >>>>> Thanks, Harold >>>>> >>>>> From coleenp at openjdk.java.net Fri Sep 11 18:04:57 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 18:04:57 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 02:56:26 GMT, Leonid Mesnik wrote: >> The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing >> with -XX:+CheckUnhandledOops. 
> > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...) Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From coleenp at openjdk.java.net Fri Sep 11 18:28:39 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 18:28:39 GMT Subject: RFR: 8253033: CheckUnhandledOops check fails in =?UTF-8?B?VGhyZWFkU25hcHNob3Q6OmluaXRpYWxpemXigKY=?= [v2] In-Reply-To: References: <6Tyl05gVn7uc-u4kzrEeU124xRXEcs4d4jb7SBkmAFU=.44c9828c-6a04-494f-83d9-ffa667af07cc@github.com> Message-ID: <-7BChl78qFAGBif9KFg5DC2cWfRrYIk455G78NXrIGc=.663b7c66-638d-47ab-a394-82e899f79fb0@github.com> On Fri, 11 Sep 2020 03:37:55 GMT, Leonid Mesnik wrote: >> I'm missing something. How can a NULL oop get corrupted even if there is a GC? > > This is a specific of "CheckUnhandledOops" > I've written in bug comment "Another possible fix would be to disable corruption of NULL unhandled oops. They couldn't > be changed really." > We discussed it with Coleen and seems that moving NULL oops out of possible safepoint or handling them seems easier > option than changing UnhandledOops.cpp to don't corrupt NULL. It is here: > https://github.com/openjdk/jdk/blob/77bdc3065057b07a676b010562c89bb0f21512b7/src/hotspot/share/runtime/unhandledOops.cpp#L113 ThreadService::get_current_contended_monitor calls Thread::check_for_dangling_thread_pointer calls ThreadsSMRSupport::is_a_protected_JavaThread_with_lock((JavaThread *) thread), The potential safepoint is here, where CheckUnhandledOops puts junk in any oop on the stack. inline bool ThreadsSMRSupport::is_a_protected_JavaThread_with_lock(JavaThread *thread) { MutexLocker ml(Threads_lock->owned_by_self() ? 
NULL : Threads_lock); return is_a_protected_JavaThread(thread); } ------------- PR: https://git.openjdk.java.net/jdk/pull/123 From minqi at openjdk.java.net Fri Sep 11 18:37:15 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 11 Sep 2020 18:37:15 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used Message-ID: Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. Manually create a dummy Manifest object will cause those classes loaded in dump and archived. ------------- Commit messages: - 8252689: Classes are loaded from jrt:/java.base even when CDS is used Changes: https://git.openjdk.java.net/jdk/pull/134/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=134&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252689 Stats: 34 lines in 3 files changed: 19 ins; 13 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/134.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/134/head:pull/134 PR: https://git.openjdk.java.net/jdk/pull/134 From dcubed at openjdk.java.net Fri Sep 11 18:55:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 18:55:50 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage Message-ID: This RFE is to migrate the following field to OopStorage: class ObjectMonitor { void* volatile _object; // backward object pointer - strong root Unlike the previous patches in this series, there are a lot of collateral changes so this is not a trivial review. Sorry for the tedious parts of the review. Since Erik and I are both contributors to this patch, we would like at least 1 GC team reviewer and 1 Runtime team reviewer. This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing along with JDK-8252980 and JDK-8252981. I also ran it through my inflation stress kit for 48 hours on my Linux-X64 machine. 
------------- Commit messages: - 8247281: migrate ObjectMonitor::_object to OopStorage Changes: https://git.openjdk.java.net/jdk/pull/135/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8247281 Stats: 445 lines in 36 files changed: 108 ins; 234 del; 103 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 18:55:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 18:55:50 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:45:28 GMT, Daniel D. Daugherty wrote: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. @fisk - Please chime in on this review when you get the chance. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 19:05:43 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 19:05:43 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:45:28 GMT, Daniel D. 
Daugherty wrote: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. The vast majority of these changes came from @fisk so I'm actually a reviewer and stress tester of these changes. Thumbs up! ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Fri Sep 11 19:20:56 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 11 Sep 2020 19:20:56 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: <06UhnxjoBW9AUQP_YyeBhiFIIOGQCpyT1SE7NWhSqi0=.c266606c-f72f-4496-95e2-424e3de24a6b@github.com> On Fri, 11 Sep 2020 18:45:28 GMT, Daniel D. Daugherty wrote: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. This looks great Dan. 
I like your addition of is_chainmarker() - it makes this step a lot better, without having any special oops that are not oops. Thanks for sorting this out! ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 19:29:21 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 19:29:21 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: <06UhnxjoBW9AUQP_YyeBhiFIIOGQCpyT1SE7NWhSqi0=.c266606c-f72f-4496-95e2-424e3de24a6b@github.com> References: <06UhnxjoBW9AUQP_YyeBhiFIIOGQCpyT1SE7NWhSqi0=.c266606c-f72f-4496-95e2-424e3de24a6b@github.com> Message-ID: On Fri, 11 Sep 2020 19:18:26 GMT, Erik ?sterlund wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > This looks great Dan. I like your addition of is_chainmarker() - it makes this step a lot better, without having any > special oops that are not oops. Thanks for sorting this out! @fisk - Thanks for the blinding fast review! (Pretty easy when you wrote almost all of the code). Re: is_chainmarker() I figured you would appreciate getting rid of one more "special" oop value! And its use just fits in with the whole AllocationState model. It also gets removed quite easily with the part3 patch... Oh yeah, I gotta file a new RFE for that one. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From redestad at openjdk.java.net Fri Sep 11 19:41:38 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 11 Sep 2020 19:41:38 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:31:02 GMT, Yumin Qi wrote: > Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. > Manually create a dummy Manifest object will cause those classes loaded in dump and archived. I think you could achieve the same if you extend the [HelloClasslist](https://github.com/openjdk/jdk/blob/master/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java) tool built and used to generate the default classlist file to include a manifest in its jar file. Add a dummy manifest.mf somewhere and edit the CLASSLIST_JAR target in [GenerateLinkOptData](https://github.com/openjdk/jdk/blob/master/make/GenerateLinkOptData.gmk) to include it in the jar. This would avoid hard-coding this in hotspot code and keep the responsibility of curating the default classlist in one place. ------------- PR: https://git.openjdk.java.net/jdk/pull/134 From iklam at openjdk.java.net Fri Sep 11 20:13:15 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Sep 2020 20:13:15 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 19:38:58 GMT, Claes Redestad wrote: > I think you could achieve the same if you extend the > [HelloClasslist](https://github.com/openjdk/jdk/blob/master/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java) > tool built and used to generate the default classlist file to include a manifest in its jar file. 
Add a dummy
> manifest.mf somewhere and edit the CLASSLIST_JAR target in
> [GenerateLinkOptData](https://github.com/openjdk/jdk/blob/master/make/GenerateLinkOptData.gmk) to include it in the
> jar. This would avoid hard-coding this in hotspot code and keep the responsibility of curating the default classlist
> in one place.

During the JDK build, HelloClasslist is already packaged with a manifest:

$ jar tf ./support/classlist.jar
META-INF/
META-INF/MANIFEST.MF
build/tools/classlist/HelloClasslist.class

And the default CDS archive contains the 3 classes mentioned in the bug report:

$ ./images/jdk/bin/java -cp ./support/classlist.jar -verbose build.tools.classlist.HelloClasslist | \
    egrep '((java.util.jar.Attributes)|(java.util.LinkedHashMap)|(java.util.jar.Manifest.FastInputStream)) source'
[0.044s][info][class,load] java.util.jar.Attributes source: shared objects file
[0.044s][info][class,load] java.util.LinkedHashMap source: shared objects file
[0.044s][info][class,load] java.util.jar.Manifest$FastInputStream source: shared objects file

However, for some reason, if you run a simple HelloWorld.jar to collect the class list, these 3 classes are not loaded. As a result, these classes may not be stored in custom archives created from your own class lists.

$ jar tf ~/tmp/HelloWorld.jar
META-INF/
META-INF/MANIFEST.MF
HelloWorld.class

$ ./images/jdk/bin/java -Xshare:off -cp ~/tmp/HelloWorld.jar -verbose \
    -XX:DumpLoadedClassList=foo.list HelloWorld | egrep \
    '((java.util.jar.Attributes)|(java.util.LinkedHashMap)|(java.util.jar.Manifest.FastInputStream)) source' | wc
0 0 0

I think the reason is the built-in class loader's Java code takes a different code path than the CDS shared class loading code ([systemDictionaryShared.cpp](https://github.com/openjdk/jdk/blob/5c0d985abf7ef89f9a035692fe9db3ef4001bc2c/src/hotspot/share/classfile/systemDictionaryShared.cpp#L687)). So we should exercise the CDS code during dump time to ensure that the classes used by CDS are always loaded.
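
[Editor's sketch] The idea of "exercising the manifest processing code" can be illustrated from the Java side. The snippet below is only a minimal sketch, not the code in the patch (which lives in HotSpot's C++ dump path): it parses a tiny in-memory manifest, which is one plausible way to force java.util.jar.Manifest and java.util.jar.Attributes to be loaded. The class name ManifestWarmup and the method parseDummyManifest are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the JDK or of the patch under review.
public class ManifestWarmup {

    // Parses a minimal in-memory manifest and returns its Manifest-Version.
    // Constructing the Manifest touches the manifest-related classes.
    static String parseDummyManifest() {
        byte[] bytes = "Manifest-Version: 1.0\r\n\r\n".getBytes(StandardCharsets.UTF_8);
        try {
            Manifest mf = new Manifest(new ByteArrayInputStream(bytes));
            return mf.getMainAttributes().getValue(Attributes.Name.MANIFEST_VERSION);
        } catch (IOException e) {
            // Reading from a byte array should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseDummyManifest());
    }
}
```

Running such a program with -verbose (as in the egrep commands above) is one way to check which of the three classes are loaded and from where; the exact set may differ between JDK builds.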
------------- PR: https://git.openjdk.java.net/jdk/pull/134 From iklam at openjdk.java.net Fri Sep 11 20:17:13 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Sep 2020 20:17:13 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used In-Reply-To: References: Message-ID: <4BvFMzlNfu8HecOqJ6QI7Wrc-08vZVqT4EiWoVTK_ZU=.a45f9d8b-10f7-4293-a7a6-fae33b0336e7@github.com> On Fri, 11 Sep 2020 18:31:02 GMT, Yumin Qi wrote: > Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. > Manually create a dummy Manifest object will cause those classes loaded in dump and archived. Marked as reviewed by iklam (Reviewer). src/hotspot/share/memory/metaspaceShared.cpp line 1359: > 1357: HeapShared::init_for_dumping(THREAD); > 1358: > 1359: // create a dummy manifest to cause more classes loaded How about: `// exercise the manifest processing code to ensue classes used by CDS are always archived`? src/hotspot/share/classfile/systemDictionaryShared.cpp line 690: > 688: if (shared_jar_manifest(shared_path_index) == NULL) { > 689: SharedClassPathEntry* ent = FileMapInfo::shared_path(shared_path_index); > 690: size_t size = (size_t)ent->manifest_size(); size_t is unsigned, so I think you should change the following test to `if (size == 0)` ------------- PR: https://git.openjdk.java.net/jdk/pull/134 From dcubed at openjdk.java.net Fri Sep 11 20:21:40 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 20:21:40 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used In-Reply-To: <4BvFMzlNfu8HecOqJ6QI7Wrc-08vZVqT4EiWoVTK_ZU=.a45f9d8b-10f7-4293-a7a6-fae33b0336e7@github.com> References: <4BvFMzlNfu8HecOqJ6QI7Wrc-08vZVqT4EiWoVTK_ZU=.a45f9d8b-10f7-4293-a7a6-fae33b0336e7@github.com> Message-ID: <_lVOTGN3NBLhAYABBb8fxoFpKkR-YZ3-7UMYciu7Pks=.cc1e3fd0-872d-481c-b1d0-ef9f09411dba@github.com> On Fri, 11 Sep 2020 20:14:20 GMT, Ioi Lam wrote: 
>> Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. >> Manually create a dummy Manifest object will cause those classes loaded in dump and archived. > > src/hotspot/share/memory/metaspaceShared.cpp line 1359: > >> 1357: HeapShared::init_for_dumping(THREAD); >> 1358: >> 1359: // create a dummy manifest to cause more classes loaded > > How about: `// exercise the manifest processing code to ensue classes used by CDS are always archived`? s/ensue classes/ensure classes/ ------------- PR: https://git.openjdk.java.net/jdk/pull/134 From minqi at openjdk.java.net Fri Sep 11 21:02:12 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 11 Sep 2020 21:02:12 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used [v2] In-Reply-To: References: Message-ID: > Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. > Manually create a dummy Manifest object will cause those classes loaded in dump and archived. 
Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: 8252689: Classes are loaded from jrt:/java.base even when CDS is used ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/134/files - new: https://git.openjdk.java.net/jdk/pull/134/files/7c1df12f..ee270886 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=134&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=134&range=00-01 Stats: 75 lines in 3 files changed: 73 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/134.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/134/head:pull/134 PR: https://git.openjdk.java.net/jdk/pull/134 From coleenp at openjdk.java.net Fri Sep 11 21:03:35 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 21:03:35 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:45:28 GMT, Daniel D. Daugherty wrote: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Changes requested by coleenp (Reviewer). src/hotspot/share/prims/jvmtiTagMap.cpp line 3021: > 3019: > 3020: // Inflated monitors > 3021: blk.set_kind(JVMTI_HEAP_REFERENCE_MONITOR); So we don't have to provide the equivalent of JVMTI_HEAP_REFERENCE_MONITOR? 
src/hotspot/share/runtime/objectMonitor.cpp line 246: > 244: // Check that object() and set_object() are called from the right context: > 245: static void check_object_context() { > 246: Thread *self = Thread::current(); Nit * is in the wrong place. src/hotspot/share/runtime/objectMonitor.cpp line 251: > 249: guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); > 250: if (self->is_Java_thread()) { > 251: JavaThread* jt = (JavaThread*)self; With David's new change this should use as_Java_thread(). src/hotspot/share/runtime/objectMonitor.cpp line 268: > 266: return NULL; > 267: } > 268: return _object.resolve(); Why would _object be NULL? It should be non-null after creation. It might point to null but then _object.resolve() would return NULL. This NULL check doesn't make sense to me, same with the peek function below. src/hotspot/share/services/heapDumper.cpp line 1395: > 1393: void do_oop(oop* obj_p) { > 1394: u4 size = 1 + sizeof(address); > 1395: writer()->start_sub_record(HPROF_GC_ROOT_MONITOR_USED, size); I had a similar question to the jvmtiTagMap question above. Are there tools that are going to miss seeing this tag in the heap dump? I hope these tags are implementation defined and we can just remove them. Otherwise, should there be a loop through the OM list to print out these tags for live object monitors? src/hotspot/share/runtime/synchronizer.cpp line 1548: > 1546: bool from_per_thread_alloc) { > 1547: guarantee(m->header().value() == 0, "invariant"); > 1548: guarantee(m->object_peek() == NULL, "invariant"); Because of type stable memory, you don't release the WeakHandle when the OM is released. The oop inside the WeakHandle is replaced in when an OM is reused? src/hotspot/share/runtime/synchronizer.cpp line 163: > 161: > 162: #define CHAINMARKER (cast_to_oop(-1)) > 163: Great. One less non-oop oop! 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Fri Sep 11 21:20:21 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 11 Sep 2020 21:20:21 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 20:31:49 GMT, Coleen Phillimore wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > src/hotspot/share/prims/jvmtiTagMap.cpp line 3021: > >> 3019: >> 3020: // Inflated monitors >> 3021: blk.set_kind(JVMTI_HEAP_REFERENCE_MONITOR); > > So we don't have to provide the equivalent of JVMTI_HEAP_REFERENCE_MONITOR? The JVMTI roots are strong roots. This is no longer a strong root, so reporting them would be a bug after this change. > src/hotspot/share/runtime/objectMonitor.cpp line 268: > >> 266: return NULL; >> 267: } >> 268: return _object.resolve(); > > Why would _object be NULL? It should be non-null after creation. It might point to null but then _object.resolve() > would return NULL. This NULL check doesn't make sense to me, same with the peek function below. Because before the TSM removal patch, the monitors are allocated in blocks. Then, the objects are not known when the monitors are allocated. The handles are instead lazily created... for now. 
With the next patch, that will probably change, if we agree to allocate ObjectMonitors one by one on inflation time. But that is one patch away. > src/hotspot/share/services/heapDumper.cpp line 1395: > >> 1393: void do_oop(oop* obj_p) { >> 1394: u4 size = 1 + sizeof(address); >> 1395: writer()->start_sub_record(HPROF_GC_ROOT_MONITOR_USED, size); > > I had a similar question to the jvmtiTagMap question above. Are there tools that are going to miss seeing this tag in > the heap dump? I hope these tags are implementation defined and we can just remove them. Otherwise, should there be a > loop through the OM list to print out these tags for live object monitors? Same answer as above. No longer a strong root, so can't report it as such. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:20:22 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:20:22 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> On Fri, 11 Sep 2020 21:08:59 GMT, Erik ?sterlund wrote: >> src/hotspot/share/prims/jvmtiTagMap.cpp line 3021: >> >>> 3019: >>> 3020: // Inflated monitors >>> 3021: blk.set_kind(JVMTI_HEAP_REFERENCE_MONITOR); >> >> So we don't have to provide the equivalent of JVMTI_HEAP_REFERENCE_MONITOR? > > The JVMTI roots are strong roots. This is no longer a strong root, so reporting them would be a bug after this change. The function is VM_HeapWalkOperation::collect_simple_roots() and we no longer have a root in the ObjectMonitor so my take is no we don't. I believe @fisk concurs with that reasoning. >> src/hotspot/share/runtime/objectMonitor.cpp line 268: >> >>> 266: return NULL; >>> 267: } >>> 268: return _object.resolve(); >> >> Why would _object be NULL? It should be non-null after creation. 
It might point to null but then _object.resolve() >> would return NULL. This NULL check doesn't make sense to me, same with the peek function below. > > Because before the TSM removal patch, the monitors are allocated in blocks. Then, the objects are not known when the > monitors are allocated. The handles are instead lazily created... for now. With the next patch, that will probably > change, if we agree to allocate ObjectMonitors one by one on inflation time. But that is one patch away. Erik's part3 patch ensures that when the ObjectMonitor is allocated, the weak handle is created and initialized to the oop at that point. Since this is still the TSM version, the ObjectMonitor may not yet have a weak handle allocated at the time that object() is called. >> src/hotspot/share/services/heapDumper.cpp line 1395: >> >>> 1393: void do_oop(oop* obj_p) { >>> 1394: u4 size = 1 + sizeof(address); >>> 1395: writer()->start_sub_record(HPROF_GC_ROOT_MONITOR_USED, size); >> >> I had a similar question to the jvmtiTagMap question above. Are there tools that are going to miss seeing this tag in >> the heap dump? I hope these tags are implementation defined and we can just remove them. Otherwise, should there be a >> loop through the OM list to print out these tags for live object monitors? > > Same answer as above. No longer a strong root, so can't report it as such. We no longer have a root in the ObjectMonitor so no we don't have to dump these is my take. I believe @fisk concurs with that reasoning. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:20:24 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:20:24 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 20:33:11 GMT, Coleen Phillimore wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > src/hotspot/share/runtime/objectMonitor.cpp line 246: > >> 244: // Check that object() and set_object() are called from the right context: >> 245: static void check_object_context() { >> 246: Thread *self = Thread::current(); > > Nit * is in the wrong place. I'll fix that. > src/hotspot/share/runtime/objectMonitor.cpp line 251: > >> 249: guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); >> 250: if (self->is_Java_thread()) { >> 251: JavaThread* jt = (JavaThread*)self; > > With David's new change this should use as_Java_thread(). Yup. Since this is a new function, it didn't pop up as a conflict. I'll fix that. > src/hotspot/share/runtime/synchronizer.cpp line 1548: > >> 1546: bool from_per_thread_alloc) { >> 1547: guarantee(m->header().value() == 0, "invariant"); >> 1548: guarantee(m->object_peek() == NULL, "invariant"); > > Because of type stable memory, you don't release the WeakHandle when the OM is released. 
The oop inside the WeakHandle
> is replaced when an OM is reused?

Correct. See:

+void ObjectMonitor::set_object(oop obj) {
+  check_object_context();
+  if (_object.is_null()) {
+    _object = WeakHandle(_oop_storage, obj);
+  } else {
+    _object.replace(obj);
+  }
+}

So when 'obj' == NULL, we replace the weak handle's value with NULL
and we don't release/delete/whatever the weak handle.

> src/hotspot/share/runtime/synchronizer.cpp line 163:
>
>> 161:
>> 162: #define CHAINMARKER (cast_to_oop(-1))
>> 163:
>
> Great. One less non-oop oop!

Exactly!

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From eosterlund at openjdk.java.net Fri Sep 11 21:20:24 2020
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Fri, 11 Sep 2020 21:20:24 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage
In-Reply-To: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com>
References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com>
Message-ID:

On Fri, 11 Sep 2020 21:09:43 GMT, Daniel D. Daugherty wrote:

>> The JVMTI roots are strong roots. This is no longer a strong root, so reporting them would be a bug after this change.
>
> The function is VM_HeapWalkOperation::collect_simple_roots()
> and we no longer have a root in the ObjectMonitor so my take is
> no we don't. I believe @fisk concurs with that reasoning.

Yes, exactly.

>> Same answer as above. No longer a strong root, so can't report it as such.
>
> We no longer have a root in the ObjectMonitor so no we don't have
> to dump these is my take. I believe @fisk concurs with that reasoning.

Agreed.
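
[Editor's sketch] The set_object() logic quoted earlier in this exchange (create the weak handle on first use, then replace its referent on reuse, never releasing the handle) can be modeled in a few lines of Java. This is only an analogy under stated assumptions: MonitorModel and Slot are hypothetical names, and java.lang.ref.WeakReference merely stands in for HotSpot's WeakHandle backed by OopStorage. It also shows why a null check is needed in the accessor while monitors are still allocated in blocks: the handle may not exist yet.

```java
import java.lang.ref.WeakReference;

// Hypothetical model of the lazily created handle; not HotSpot code.
public class MonitorModel {

    // Stand-in for WeakHandle: a slot whose referent can be swapped in place.
    static final class Slot {
        private WeakReference<Object> ref;
        Slot(Object obj) { ref = new WeakReference<>(obj); }
        void replace(Object obj) { ref = new WeakReference<>(obj); }
        Object peek() { return ref.get(); }
    }

    private Slot object;      // stays null until the first setObject() call
    int slotsCreated = 0;     // counts how often a slot was allocated

    void setObject(Object obj) {
        if (object == null) {
            object = new Slot(obj);   // first use: create the slot
            slotsCreated++;
        } else {
            object.replace(obj);      // reuse: keep the slot, swap the referent
        }
    }

    Object objectPeek() {
        // The slot may not exist yet, so check before dereferencing it.
        return object == null ? null : object.peek();
    }
}
```

In this model, calling setObject(null) when a monitor is recycled clears the referent but keeps the slot, which mirrors the "we don't release/delete/whatever the weak handle" remark above.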
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From coleenp at openjdk.java.net Fri Sep 11 21:45:10 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 21:45:10 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 21:42:08 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > coleenp CR - changes to resolve Coleen's comments. I approve of this without seeing any new changes. src/hotspot/share/runtime/thread.cpp line 4593: > 4591: // are used in om_flush(). > 4592: BarrierSet::barrier_set()->on_thread_detach(p); > 4593: One last question, that doesn't require a comment why it's here, but why was this moved? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:45:08 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:45:08 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: coleenp CR - changes to resolve Coleen's comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/135/files - new: https://git.openjdk.java.net/jdk/pull/135/files/150a398b..750fe771 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From coleenp at openjdk.java.net Fri Sep 11 21:45:10 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 21:45:10 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 21:33:56 GMT, Coleen Phillimore wrote: >> Daniel D. 
Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> coleenp CR - changes to resolve Coleen's comments. > > I approve of this without seeing any new changes. I have to also comment that reviewing code in github is really nice! ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:45:10 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:45:10 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 21:35:40 GMT, Coleen Phillimore wrote: >> I approve of this without seeing any new changes. > > I have to also comment that reviewing code in github is really nice! Agreed. When I did my self-review for this fix, I didn't use the webrev link at all. I did generate a local webrev via "git webrev" and sanity checked that webrev to verify that "git webrev" works (and it did). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From coleenp at openjdk.java.net Fri Sep 11 21:45:11 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 21:45:11 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Fri, 11 Sep 2020 21:13:13 GMT, Daniel D. Daugherty wrote: >> Because before the TSM removal patch, the monitors are allocated in blocks. Then, the objects are not known when the >> monitors are allocated. The handles are instead lazily created... for now. With the next patch, that will probably >> change, if we agree to allocate ObjectMonitors one by one on inflation time. But that is one patch away. 
>
> Erik's part3 patch ensures that when the ObjectMonitor is allocated,
> the weak handle is created and initialized to the oop at that point. Since
> this is still the TSM version, the ObjectMonitor may not yet have a weak
> handle allocated at the time that object() is called.

Ok, this is cool then and good that this logic is here instead of inside the WeakHandle::resolve/peek functions. The
OopHandle similar functions have a null check, so if this was permanent, I'd suggest putting it there. Since this is
going to change, having it here and keeping the non-null invariants for WeakHandle::resolve/peek functions is better.

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From coleenp at openjdk.java.net Fri Sep 11 21:45:11 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 11 Sep 2020 21:45:11 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2]
In-Reply-To:
References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com>
Message-ID:

On Fri, 11 Sep 2020 21:15:56 GMT, Erik Österlund wrote:

>> We no longer have a root in the ObjectMonitor so no we don't have
>> to dump these is my take. I believe @fisk concurs with that reasoning.
>
> Agreed.

Ok, that's good then!

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From dcubed at openjdk.java.net Fri Sep 11 21:45:12 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 11 Sep 2020 21:45:12 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 11 Sep 2020 21:33:33 GMT, Coleen Phillimore wrote:

>> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision:
>>
>> coleenp CR - changes to resolve Coleen's comments.
>
> src/hotspot/share/runtime/thread.cpp line 4593:
>
>> 4591: // are used in om_flush().
>> 4592: BarrierSet::barrier_set()->on_thread_detach(p); >> 4593: > > One last question, that doesn't require a comment why it's here, but why was this moved? We had to move that code to make om_flush() happy. om_flush() accesses the (weak) oops in the monitor list after the thread is off the threads list so the barrier change had to move. It will move back in part3. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:45:14 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:45:14 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 21:10:27 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/runtime/objectMonitor.cpp line 251: >> >>> 249: guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); >>> 250: if (self->is_Java_thread()) { >>> 251: JavaThread* jt = (JavaThread*)self; >> >> With David's new change this should use as_Java_thread(). > > Yup. Since this is a new function, it didn't pop up as a conflict. > I'll fix that. Fixed in https://github.com/openjdk/jdk/pull/135/commits/750fe771943178a02f1b71a713f7417512e3628e. >> src/hotspot/share/runtime/objectMonitor.cpp line 246: >> >>> 244: // Check that object() and set_object() are called from the right context: >>> 245: static void check_object_context() { >>> 246: Thread *self = Thread::current(); >> >> Nit * is in the wrong place. > > I'll fix that. Fixed in https://github.com/openjdk/jdk/pull/135/commits/750fe771943178a02f1b71a713f7417512e3628e >> src/hotspot/share/runtime/synchronizer.cpp line 1548: >> >>> 1546: bool from_per_thread_alloc) { >>> 1547: guarantee(m->header().value() == 0, "invariant"); >>> 1548: guarantee(m->object_peek() == NULL, "invariant"); >> >> Because of type stable memory, you don't release the WeakHandle when the OM is released. The oop inside the WeakHandle >> is replaced in when an OM is reused? 
>
> Correct. See:
>
> +void ObjectMonitor::set_object(oop obj) {
> +  check_object_context();
> +  if (_object.is_null()) {
> +    _object = WeakHandle(_oop_storage, obj);
> +  } else {
> +    _object.replace(obj);
> +  }
> +}
>
> So when 'obj' == NULL, we replace the weak handle's value with NULL
> and we don't release/delete/whatever the weak handle.

@fisk - I can't figure out what "magic" to use to get the above code quote to show up as such. Suggestion?

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From dcubed at openjdk.java.net Fri Sep 11 21:45:12 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 11 Sep 2020 21:45:12 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2]
In-Reply-To:
References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com>
Message-ID:

On Fri, 11 Sep 2020 21:15:06 GMT, Erik Österlund wrote:

>> The function is VM_HeapWalkOperation::collect_simple_roots()
>> and we no longer have a root in the ObjectMonitor so my take is
>> no we don't. I believe @fisk concurs with that reasoning.
>
> Yes, exactly.

Thanks for confirmation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From dcubed at openjdk.java.net Fri Sep 11 21:45:14 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 11 Sep 2020 21:45:14 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2]
In-Reply-To:
References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com>
Message-ID:

On Fri, 11 Sep 2020 21:31:05 GMT, Coleen Phillimore wrote:

>> Erik's part3 patch ensures that when the ObjectMonitor is allocated,
>> the weak handle is created and initialized to the oop at that point. Since
>> this is still the TSM version, the ObjectMonitor may not yet have a weak
>> handle allocated at the time that object() is called.
> > Ok, this is cool then and good that this logic is here instead of inside the WeakHandle::resolve/peek functions. The > OopHandle similar functions have a null check, so if this was permanent, I'd suggest putting it there. Since this is > going to change, having it here and keeping the non-null invariants for WeakHandle::resolve/peek functions is better. Thanks for confirmation. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Fri Sep 11 21:45:14 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 11 Sep 2020 21:45:14 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Fri, 11 Sep 2020 21:31:30 GMT, Coleen Phillimore wrote: >> Agreed. > > Ok, that's good then! Thanks for confirmation. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From iklam at openjdk.java.net Fri Sep 11 21:56:22 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Sep 2020 21:56:22 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 21:02:12 GMT, Yumin Qi wrote: >> Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. >> Manually create a dummy Manifest object will cause those classes loaded in dump and archived. > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > 8252689: Classes are loaded from jrt:/java.base even when CDS is used Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/134 From iklam at openjdk.java.net Fri Sep 11 22:17:50 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Sep 2020 22:17:50 GMT Subject: RFR: 8248186: Move CDS C++ vtable code to cppVtables.cpp Message-ID: I moved the code that supports C++ vtables in the CDS archive from metaspaceShared.cpp to a new file, cppVtables.cpp. To keep the refactoring straightforward, the code is moved verbatim, except for: - Methods are renamed from `MetaspaceShared::xx` to `CppVtables::xx` - Access to `_mc_region` is changed to the inline function `mc_region()` ------------- Commit messages: - 8248186: Move CDS C++ vtable code to cppVtables.cpp Changes: https://git.openjdk.java.net/jdk/pull/136/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=136&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8248186 Stats: 696 lines in 7 files changed: 384 ins; 305 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/136.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/136/head:pull/136 PR: https://git.openjdk.java.net/jdk/pull/136 From ccheung at openjdk.java.net Fri Sep 11 22:23:44 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Fri, 11 Sep 2020 22:23:44 GMT Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used [v2] In-Reply-To: References: Message-ID: <27wueSCjzCK53x6aM6bBpA84EpXmyw37MGi3sHykJJY=.fd02370b-149b-4a55-935a-b3932127c921@github.com> On Fri, 11 Sep 2020 21:02:12 GMT, Yumin Qi wrote: >> Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process. >> Manually create a dummy Manifest object will cause those classes loaded in dump and archived. > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > 8252689: Classes are loaded from jrt:/java.base even when CDS is used Looks good. ------------- Marked as reviewed by ccheung (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/134 From coleenp at openjdk.java.net Fri Sep 11 22:32:04 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 11 Sep 2020 22:32:04 GMT Subject: RFR: 8248186: Move CDS C++ vtable code to cppVtables.cpp In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 22:03:27 GMT, Ioi Lam wrote: > I moved the code that supports C++ vtables in the CDS archive from metaspaceShared.cpp to a new file, cppVtables.cpp. > To keep the refactoring straightforward, the code is moved verbatim, except for: > - Methods are renamed from `MetaspaceShared::xx` to `CppVtables::xx` > - Access to `_mc_region` is changed to the inline function `mc_region()` Marked as reviewed by coleenp (Reviewer). src/hotspot/share/memory/cppVtables.cpp line 33: > 31: #include "oops/instanceMirrorKlass.hpp" > 32: #include "oops/instanceRefKlass.hpp" > 33: #include "oops/methodData.hpp" Did you check that you needed all of these #include files? ------------- PR: https://git.openjdk.java.net/jdk/pull/136 From iklam at openjdk.java.net Fri Sep 11 22:39:06 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Sep 2020 22:39:06 GMT Subject: RFR: 8248186: Move CDS C++ vtable code to cppVtables.cpp In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 22:28:45 GMT, Coleen Phillimore wrote: >> I moved the code that supports C++ vtables in the CDS archive from metaspaceShared.cpp to a new file, cppVtables.cpp. >> To keep the refactoring straightforward, the code is moved verbatim, except for: >> - Methods are renamed from `MetaspaceShared::xx` to `CppVtables::xx` >> - Access to `_mc_region` is changed to the inline function `mc_region()` > > src/hotspot/share/memory/cppVtables.cpp line 33: > >> 31: #include "oops/instanceMirrorKlass.hpp" >> 32: #include "oops/instanceRefKlass.hpp" >> 33: #include "oops/methodData.hpp" > > Did you check that you needed all of these #include files? 
Yes, I started with no headers and kept adding until gcc stopped complaining. I was using a non-PCH build.

-------------

PR: https://git.openjdk.java.net/jdk/pull/136

From dholmes at openjdk.java.net Fri Sep 11 23:03:12 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Sep 2020 23:03:12 GMT
Subject: RFR: 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize… [v2]
In-Reply-To: <-7BChl78qFAGBif9KFg5DC2cWfRrYIk455G78NXrIGc=.663b7c66-638d-47ab-a394-82e899f79fb0@github.com>
References: <6Tyl05gVn7uc-u4kzrEeU124xRXEcs4d4jb7SBkmAFU=.44c9828c-6a04-494f-83d9-ffa667af07cc@github.com> <-7BChl78qFAGBif9KFg5DC2cWfRrYIk455G78NXrIGc=.663b7c66-638d-47ab-a394-82e899f79fb0@github.com>
Message-ID:

On Fri, 11 Sep 2020 18:26:03 GMT, Coleen Phillimore wrote:

>> This is a specific of "CheckUnhandledOops"
>> I've written in bug comment "Another possible fix would be to disable corruption of NULL unhandled oops. They couldn't
>> be changed really."
>> We discussed it with Coleen and seems that moving NULL oops out of possible safepoint or handling them seems easier
>> option than changing UnhandledOops.cpp to don't corrupt NULL. It is here:
>> https://github.com/openjdk/jdk/blob/77bdc3065057b07a676b010562c89bb0f21512b7/src/hotspot/share/runtime/unhandledOops.cpp#L113
>
> ThreadService::get_current_contended_monitor calls Thread::check_for_dangling_thread_pointer calls
> ThreadsSMRSupport::is_a_protected_JavaThread_with_lock((JavaThread *) thread),
> The potential safepoint is here, where CheckUnhandledOops puts junk in any oop on the stack.
>
> inline bool ThreadsSMRSupport::is_a_protected_JavaThread_with_lock(JavaThread *thread) {
>   MutexLocker ml(Threads_lock->owned_by_self() ? NULL : Threads_lock);
>   return is_a_protected_JavaThread(thread);
> }

Thanks Coleen. I'm still not sure that CheckUnhandledOops should be touching NULL oops but ...

Leonid the workaround seems okay.
-------------

PR: https://git.openjdk.java.net/jdk/pull/123

From dholmes at openjdk.java.net Fri Sep 11 23:03:11 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Sep 2020 23:03:11 GMT
Subject: RFR: 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize… [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 11 Sep 2020 02:56:26 GMT, Leonid Mesnik wrote:

>> The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing
>> with -XX:+CheckUnhandledOops.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>
> 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize(...)

Marked as reviewed by dholmes (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/123

From lmesnik at openjdk.java.net Fri Sep 11 23:07:20 2020
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Fri, 11 Sep 2020 23:07:20 GMT
Subject: Integrated: 8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize…
In-Reply-To:
References:
Message-ID: <3Uaym2YfriVNSzjK6Y90cmu89nPks-8dJct8PGjVOeM=.ef9afcb1-995a-4216-8268-295696a42926@github.com>

On Thu, 10 Sep 2020 23:38:45 GMT, Leonid Mesnik wrote:

> The NULL oops are corrupted by CheckUnhandledOops and should be re-written with NULL to pass testing
> with -XX:+CheckUnhandledOops.

This pull request has now been integrated.

Changeset: 306b1663
Author: Leonid Mesnik
URL: https://git.openjdk.java.net/jdk/commit/306b1663
Stats: 3 lines in 1 file changed: 1 ins; 2 del; 0 mod

8253033: CheckUnhandledOops check fails in ThreadSnapshot::initialize…
Reviewed-by: coleenp, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/123

From kim.barrett at oracle.com Sat Sep 12 00:19:29 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 11 Sep 2020 20:19:29 -0400
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage
In-Reply-To:
References:
Message-ID: <5A831568-AD7A-4483-848C-F47D4A0115CD@oracle.com>

> On Sep 11, 2020, at 2:55 PM, Daniel D.Daugherty wrote:
>
> This RFE is to migrate the following field to OopStorage:
>
> class ObjectMonitor {
>
> void* volatile _object; // backward object pointer - strong root
>
> Unlike the previous patches in this series, there are a lot of collateral
> changes so this is not a trivial review. Sorry for the tedious parts of
> the review. Since Erik and I are both contributors to this patch, we
> would like at least 1 GC team reviewer and 1 Runtime team reviewer.
>
> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing
> along with JDK-8252980 and JDK-8252981. I also ran it through my
> inflation stress kit for 48 hours on my Linux-X64 machine.
>
> -------------
>
> Commit messages:
> - 8247281: migrate ObjectMonitor::_object to OopStorage
>
> Changes: https://git.openjdk.java.net/jdk/pull/135/files
> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=00
> Issue: https://bugs.openjdk.java.net/browse/JDK-8247281
> Stats: 445 lines in 36 files changed: 108 ins; 234 del; 103 mod
> Patch: https://git.openjdk.java.net/jdk/pull/135.diff
> Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135
>
> PR: https://git.openjdk.java.net/jdk/pull/135

------------------------------------------------------------------------------
src/hotspot/share/oops/weakHandle.cpp
  36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) :
  37   _obj(storage->allocate()) {
  38   assert(obj != NULL, "no need to create weak null oop");

Please format this differently so the ctor-init-list is more easily
distinguished from the body.
I don't care that much which of the several
alternatives is used.

------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp
 244 // Check that object() and set_object() are called from the right context:
 245 static void check_object_context() {

This seems like checking we would normally only do in a debug build. Is
this really supposed to be done in product builds too? (It's written to
support that, just wondering if that's really what we want.) Maybe these
aren't called very often so it doesn't matter? I also see that guarantee
(rather than assert) is used a fair amount in this and related code.

------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp
 251     JavaThread* jt = (JavaThread*)self;

Use self->as_Java_thread().

Later: Coleen already commented on this and it's been fixed.

------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp
 249   guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be");
 250   if (self->is_Java_thread()) {

Maybe instead

  if (self->is_Java_thread()) {
    ...
  } else {
    guarantee(self->is_VM_thread(), "must be");
  }

------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp

Both ObjectMonitor::object() and ObjectMonitor::object_peek() have

 265   if (_object.is_null()) {
 266     return NULL;

Should we really be calling those functions when in such a state? That
seems like it might be a bug in the callers?

OK, I think I see some places where object_peek() might need such
protection because of races, in src/hotspot/share/runtime/synchronizer.cpp.
And because we don't seem to ever release() the underlying WeakHandles.

 514     if (m->object_peek() == NULL) {
 703       if (cur_om->object_peek() == NULL) {

But it still seems like it might be a bug to call object() in such a state.

Related, see next comment.

Later: Looks like Coleen questioned this too. I'm not sure I understand
Erik's response though. When do we look at monitors that might not be
constructed / initialized? That seems like a bad thing to do. Oh, but the
whole creation path for OMs is kind of evil right now. I see... And that's
planned to be fixed in later work. OK.

------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.cpp
1548   guarantee(m->object_peek() == NULL, "invariant");

Later: But see previous comment. Some of this might be relevant later
though.

object_peek() seemed like the wrong operation here. I thought this was
attempting to verify that the underlying WeakHandle has been released. But
peeking doesn't ensure that.

Oh, but we don't actually release() the WeakHandle when we "free" an OM.
We're just pushing the OM on the free list. Which means the GC will
continue to examine the associated OopStorage entry (and discover it's
NULL). So there's some cost to not releasing the WeakHandle when putting an
OM on the free list. Of course, it means there won't be any allocations
when taking off the free list; I'm not sure which way is better.

But this makes me go back and wonder about whether object_peek() should
have the _object.is_null() check. After creation it seems like that should
never be true.

------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.cpp

The old code in chk_in_use_entry seems wrong. It checked for a null
object() and recorded that as an error. But then it went on and attempted
to use it as if it was not null. That's been fixed by the change. However,
the change no longer treats a null as an error. Probably this is because
it's weak, and so could have become null. But is that really possible for
an "in use" monitor?
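
The trade-off raised in the synchronizer.cpp line 1548 comment above (keep the weak handle while a monitor sits on the free list, or release it and allocate again on reuse) can be sketched with a toy model. ToySlotPool and all of its member functions are hypothetical names invented for this illustration; they only assume that a storage walk has to look at every slot that is still held, and they say nothing about how OopStorage is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a weak-oop storage: a pool of slots.
class ToySlotPool {
  struct Slot {
    const void* value;
    bool in_use;
  };
  std::vector<Slot> _slots;
  std::size_t _allocations = 0;

public:
  // Hands out a slot index, reusing a released slot when one exists.
  std::size_t allocate(const void* value) {
    ++_allocations;
    for (std::size_t i = 0; i < _slots.size(); ++i) {
      if (!_slots[i].in_use) {
        _slots[i] = Slot{value, true};
        return i;
      }
    }
    _slots.push_back(Slot{value, true});
    return _slots.size() - 1;
  }

  // Gives the slot back to the pool.
  void release(std::size_t i) { _slots[i] = Slot{nullptr, false}; }

  // Overwrites the value but keeps the slot held.
  void replace(std::size_t i, const void* value) { _slots[i].value = value; }

  // Slots a storage walk must examine: every held slot, even if it holds null.
  std::size_t slots_to_scan() const {
    std::size_t n = 0;
    for (const Slot& s : _slots) {
      if (s.in_use) ++n;
    }
    return n;
  }

  std::size_t allocations() const { return _allocations; }
};
```

Under these assumptions, a monitor that clears its slot with replace(h, nullptr) still contributes to slots_to_scan() while idle but needs no new allocate() on reuse, whereas a monitor that calls release() drops out of the scan and pays for one more allocate() when it is taken off the free list.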

------------------------------------------------------------------------------
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/ObjectMonitor.java
  94   public OopHandle object() {
  95     Address objAddr = addr.getAddressAt(objectFieldOffset);
  96     if (objAddr == null) {
  97       return null;
  98     }
  99     return objAddr.getOopHandleAt(0);

How about something a little bit less abstraction smashing?

------------------------------------------------------------------------------

From minqi at openjdk.java.net Sat Sep 12 02:12:15 2020
From: minqi at openjdk.java.net (Yumin Qi)
Date: Sat, 12 Sep 2020 02:12:15 GMT
Subject: RFR: 8252689: Classes are loaded from jrt:/java.base even when CDS is used [v3]
In-Reply-To:
References:
Message-ID: <8-rnxZXz9ostLR_YLbSVQm34E2scQx6VDEQfWz7RS3Y=.508e23f0-4b55-4881-a0ad-269ef87acc6e@github.com>

> Java.util.jar.Manifest related classes not archived since they are neither in classlist or loaded in dump process.
> Manually create a dummy Manifest object will cause those classes loaded in dump and archived.
Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: 8252689: Classes are loaded from jrt:/java.base even when CDS is used ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/134/files - new: https://git.openjdk.java.net/jdk/pull/134/files/ee270886..395b4905 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=134&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=134&range=01-02 Stats: 73 lines in 1 file changed: 0 ins; 73 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/134.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/134/head:pull/134 PR: https://git.openjdk.java.net/jdk/pull/134 From iklam at openjdk.java.net Sat Sep 12 06:35:31 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 12 Sep 2020 06:35:31 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v4] In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision:

 - Merge branch 'master' into 8244778-archive-full-module-graph
 - Feedback from Coleen
 - Removed TODO comment referring to JBS issue
 - Merge branch 'master' into 8244778-archive-full-module-graph
 - fixed trailing spaces
 - Renamed ModuleEntry::write_growable_array
 - Update to latest repo (JDK-8251557); added comments
 - 8244778: Archive full module graph in CDS

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/80/files
  - new: https://git.openjdk.java.net/jdk/pull/80/files/e541890e..a1abed8f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=02-03

  Stats: 8945 lines in 269 files changed: 3645 ins; 3698 del; 1602 mod
  Patch: https://git.openjdk.java.net/jdk/pull/80.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/80/head:pull/80

PR: https://git.openjdk.java.net/jdk/pull/80

From thomas.stuefe at gmail.com Sat Sep 12 10:23:51 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 12 Sep 2020 12:23:51 +0200
Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two)
In-Reply-To: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com>
References: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com>
Message-ID:

Hi Reviewers,

about the file renamings with the ms prefix, could I ask for more opinions?
I am fine with both new and old variant, but at the moment it's a tie,
since Coleen's and Leo's requests contradict each other.

Thanks a lot!

.:Thomas

> 1) Aesthetics
>
> Some of you requested style changes so there are a lot:
>
> - Most files in memory/metaspace/ now follow a common theme with a common
> prefix ("ms"). The only exception is metaspacesSizesSnapshot.(cpp|hpp)
> which I plan to remove in a follow up (see JDK-8251342).
>
>
>
> I really don't like the ms prefix on all the files at all.
I didn't think
> there were any conflicting names in the metaspace directory but if there
> were, they could be renamed. Now the class names don't match the file
> names! The other thing about sort of generic names in the metaspace
> directory, even if they don't match other names in the system, like
> binList.hpp or blockTree.hpp, is that they can provide a hint to people
> before they write similar code. These could be a basis for generalization
> into the utilities directory if possible.
>
>
> http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html
>
> For example in this case, the class name is MetaspaceContext which is a
> much better name than MSContext.
>
> Sorry I didn't object to this sooner on the thread.
>
>

From redestad at openjdk.java.net Sat Sep 12 12:46:43 2020
From: redestad at openjdk.java.net (Claes Redestad)
Date: Sat, 12 Sep 2020 12:46:43 GMT
Subject: RFR: 8244778: Archive full module graph in CDS [v4]
In-Reply-To:
References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com>
Message-ID:

On Sat, 12 Sep 2020 06:35:31 GMT, Ioi Lam wrote:

>> This is the same patch as
>> [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/)
>> published in
>> [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html).
>> The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail.
>
> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes
> the unrelated changes brought in by the merge/rebase.
The pull request contains eight additional commits since the last > revision: > - Merge branch 'master' into 8244778-archive-full-module-graph > - Feedback from Coleen > - Removed TODO comment referring to JBS issue > - Merge branch 'master' into 8244778-archive-full-module-graph > - fixed trailing spaces > - Renamed ModuleEntry::write_growable_array > - Update to latest repo (JDK-8251557); added comments > - 8244778: Archive full module graph in CDS Excellent work! Only a few minor comments inline, which you can choose to ignore. src/hotspot/share/classfile/classLoaderDataShared.cpp line 82: > 80: assert_valid(loader_data); > 81: if (loader_data != NULL) { > 82: // We can't create hashtables at dump time because the hashcode dependes on the dependes -> depends src/hotspot/share/classfile/classLoaderDataShared.cpp line 84: > 82: // We can't create hashtables at dump time because the hashcode dependes on the > 83: // address of the Symbols, which may be relocated at run time due to ASLR. > 84: // So we store the packages/modules in a Arrays. At run time, we create run time -> runtime a Arrays -> Arrays src/hotspot/share/classfile/javaClasses.cpp line 4830: > 4828: if (klass == SystemDictionary::ClassLoader_klass() || // ClassLoader::loader_data is malloc'ed. > 4829: // The next 3 classes are used to implement java.lang.invoke, and are not used directly in > 4830: // regular Java code. The implementation of java.lang.invoke uses generated anonymoys classes pre-existing: anonymoys src/hotspot/share/classfile/modules.cpp line 462: > 460: > 461: // We don't want the classes used by the archived full module graph to be redefined by JVMTI. > 462: // Luckily, such classes are loaded in the JVMTI "early" phase, and CDS is disable if a JVMTI disabled src/hotspot/share/memory/heapShared.cpp line 76: > 74: // assigned at runtime. 
> 75: static ArchivableStaticFieldInfo closed_archive_subgraph_entry_fields[] = { > 76: {"java/lang/Integer$IntegerCache", 0, "archivedCache"}, Could the changes here be simplified or clarified? I think the new field should be a bool, or we could instead introduce a new array for the fields archived only when archiving the full module graph (the field is ignored on iteration over closed_archive_subgraph_entry_fields anyhow) ------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Sat Sep 12 18:37:31 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 12 Sep 2020 18:37:31 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v5] In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Claes feedback (typos, refactored archived fields init); also added missing TRAPS param ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/80/files - new: https://git.openjdk.java.net/jdk/pull/80/files/a1abed8f..e9871102 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=03-04 Stats: 91 lines in 6 files changed: 38 ins; 16 del; 37 mod Patch: https://git.openjdk.java.net/jdk/pull/80.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/80/head:pull/80 PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Sat Sep 12 18:37:35 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 12 Sep 2020 18:37:35 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v4] In-Reply-To: References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Sat, 12 Sep 2020 12:34:06 GMT, Claes Redestad wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes >> the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last >> revision: >> - Merge branch 'master' into 8244778-archive-full-module-graph >> - Feedback from Coleen >> - Removed TODO comment referring to JBS issue >> - Merge branch 'master' into 8244778-archive-full-module-graph >> - fixed trailing spaces >> - Renamed ModuleEntry::write_growable_array >> - Update to latest repo (JDK-8251557); added comments >> - 8244778: Archive full module graph in CDS > > src/hotspot/share/memory/heapShared.cpp line 76: > >> 74: // assigned at runtime. 
>> 75: static ArchivableStaticFieldInfo closed_archive_subgraph_entry_fields[] = { >> 76: {"java/lang/Integer$IntegerCache", 0, "archivedCache"}, > > Could the changes here be simplified or clarified? I think the new field should be a bool, or we could instead > introduce a new array for the fields archived only when archiving the full module graph (the field is ignored on > iteration over closed_archive_subgraph_entry_fields anyhow) I split out the new fields into a separate array as you suggested. Also fixed the typos you found. See commit [e987110](https://github.com/openjdk/jdk/pull/80/commits/e98711029879ffa21757ca871120ad7b17344d01). ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Sat Sep 12 22:41:36 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 12 Sep 2020 22:41:36 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v6] In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: <0xjoJLaycnQVcrxo61AvhCGz42xU6J3k3dGamDZLASQ=.4441ed23-e92c-4a4f-ad70-cd555dc9be17@github.com> > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: fixed failure in runtime/cds/DeterministicDump.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/80/files - new: https://git.openjdk.java.net/jdk/pull/80/files/e9871102..97706f58 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=80&range=04-05 Stats: 27 lines in 2 files changed: 10 ins; 3 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/80.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/80/head:pull/80 PR: https://git.openjdk.java.net/jdk/pull/80 From richard.reingruber at sap.com Sat Sep 12 22:54:57 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Sat, 12 Sep 2020 22:54:57 +0000 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com> Message-ID: Hi Thomas, I'd think the 'ms' prefix is not really needed. There are quite a few gc implementations and there it may be helpful to add a prefix to the source files. But for Metaspace the distinction by directory should be sufficient. Just my .02 :) Richard. From: Thomas Stüfe Sent: Saturday, 12 September 2020 12:24 To: Coleen Phillimore ; Leo Korinth ; Reingruber, Richard ; Doerr, Martin ; Albert Yang Cc: Hotspot dev runtime ; Hotspot-Gc-Dev Subject: Re: RFR: Implementation of JEP 387: Elastic Metaspace (round two) Hi Reviewers, about the file renamings with the ms prefix, could I ask for more opinions? I am fine with both new and old variant, but at the moment it's a tie, since Coleen's and Leo's requests contradict each other. Thanks a lot! .:Thomas 1) Aesthetics Some of you requested style changes so there are a lot: - Most files in memory/metaspace/ now follow a common theme with a common prefix ("ms").
The only exception is metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow up (see JDK-8251342). I really don't like the ms prefix on all the files at all. I didn't think there were any conflicting names in the metaspace directory but if there were, they could be renamed. Now the class names don't match the file names! The other thing about sort of generic names in the metaspace directory, even if they don't match other names in the system, like binList.hpp or blockTree.hpp, is that they can provide a hint to people before they write similar code. These could be a basis for generalization into the utilities directory if possible. http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html For example in this case, the class name is MetaspaceContext which is a much better name than MSContext. Sorry I didn't object to this sooner on the thread. From redestad at openjdk.java.net Sun Sep 13 13:32:11 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Sun, 13 Sep 2020 13:32:11 GMT Subject: RFR: 8244778: Archive full module graph in CDS [v6] In-Reply-To: <0xjoJLaycnQVcrxo61AvhCGz42xU6J3k3dGamDZLASQ=.4441ed23-e92c-4a4f-ad70-cd555dc9be17@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> <0xjoJLaycnQVcrxo61AvhCGz42xU6J3k3dGamDZLASQ=.4441ed23-e92c-4a4f-ad70-cd555dc9be17@github.com> Message-ID: On Sat, 12 Sep 2020 22:41:36 GMT, Ioi Lam wrote: >> This is the same patch as >> [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) >> published in >> [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). >> The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail.
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed failure in runtime/cds/DeterministicDump.java Marked as reviewed by redestad (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Sun Sep 13 14:47:57 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 13 Sep 2020 14:47:57 GMT Subject: Integrated: 8244778: Archive full module graph in CDS In-Reply-To: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> References: <_zK0u_HNDIcmtKd9K8fTBGf2fuC9rqrWfkCz7IR0G5o=.d71f9618-f177-490f-8983-5191f5d8860b@github.com> Message-ID: On Tue, 8 Sep 2020 15:59:33 GMT, Ioi Lam wrote: > This is the same patch as > [8244778-archive-full-module-graph.v03](http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/) > published in > [hotspot-runtime-dev at openjdk.java.net](https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041496.html). > The rest of the review will continue on GitHub. I will add new commits to respond to comments to the above e-mail. This pull request has now been integrated. Changeset: 03a4df0a Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/03a4df0a Stats: 2080 lines in 59 files changed: 28 ins; 1916 del; 136 mod 8244778: Archive full module graph in CDS Reviewed-by: erikj, coleenp, lfoltan, redestad, alanb, mchung ------------- PR: https://git.openjdk.java.net/jdk/pull/80 From iklam at openjdk.java.net Sun Sep 13 19:18:11 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 13 Sep 2020 19:18:11 GMT Subject: RFR: 8248186: Move CDS C++ vtable code to cppVtables.cpp [v2] In-Reply-To: References: Message-ID: > I moved the code that supports C++ vtables in the CDS archive from metaspaceShared.cpp to a new file, cppVtables.cpp. 
> To keep the refactoring straightforward, the code is moved verbatim, except for: > - Methods are renamed from `MetaspaceShared::xx` to `CppVtables::xx` > - Access to `_mc_region` is changed to the inline function `mc_region()` Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - Merge branch 'master' into 8248186-refactor-cds-cpp-vtables - 8248186: Move CDS C++ vtable code to cppVtables.cpp ------------- Changes: https://git.openjdk.java.net/jdk/pull/136/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=136&range=01 Stats: 696 lines in 7 files changed: 384 ins; 305 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/136.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/136/head:pull/136 PR: https://git.openjdk.java.net/jdk/pull/136 From iklam at openjdk.java.net Sun Sep 13 19:23:34 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 13 Sep 2020 19:23:34 GMT Subject: Integrated: 8248186: Move CDS C++ vtable code to cppVtables.cpp In-Reply-To: References: Message-ID: <0aSqvUF0RSr-3n6Zk8H0sMpUwgo3zH3csPdW-epufv8=.8f1bf327-d0d2-42b3-9d13-897787f78d50@github.com> On Fri, 11 Sep 2020 22:03:27 GMT, Ioi Lam wrote: > I moved the code that supports C++ vtables in the CDS archive from metaspaceShared.cpp to a new file, cppVtables.cpp. > To keep the refactoring straightforward, the code is moved verbatim, except for: > - Methods are renamed from `MetaspaceShared::xx` to `CppVtables::xx` > - Access to `_mc_region` is changed to the inline function `mc_region()` This pull request has now been integrated. 
Changeset: c5e63b63 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/c5e63b63 Stats: 696 lines in 7 files changed: 305 ins; 384 del; 7 mod 8248186: Move CDS C++ vtable code to cppVtables.cpp Reviewed-by: coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/136 From dholmes at openjdk.java.net Mon Sep 14 02:04:24 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 02:04:24 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: <0gKQJq1wgjTtGesVdOc7DGfjexnEkm_AcbNUSbYWSTk=.2b5edf85-1cb4-4818-92fe-c4dbf4535b6e@github.com> On Fri, 11 Sep 2020 21:45:08 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > coleenp CR - changes to resolve Coleen's comments. Overall cleanup looks good. More complicated than I had expected, but then this changes the GC races with the deflation code. Not sure about the JVM TI changes! 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Mon Sep 14 02:04:25 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 02:04:25 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: <4S1AuIQpJN081C7m_QDWkifv1yb7WHEzB_qbZ1IwWSA=.69fe7145-edad-46d9-bc48-e528489d3278@github.com> On Fri, 11 Sep 2020 21:35:41 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/runtime/thread.cpp line 4593: >> >>> 4591: // are used in om_flush(). >>> 4592: BarrierSet::barrier_set()->on_thread_detach(p); >>> 4593: >> >> One last question, that doesn't require a comment why it's here, but why was this moved? > > We had to move that code to make om_flush() happy. > om_flush() accesses the (weak) oops in the monitor list > after the thread is off the threads list so the barrier change > had to move. It will move back in part3. Just to be clear the thread is still on the threads_list at this point. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Mon Sep 14 02:04:25 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 02:04:25 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Fri, 11 Sep 2020 21:39:25 GMT, Daniel D. Daugherty wrote: >> Ok, that's good then! > > Thanks for confirmation. I don't see anything in the HPROF format description that claims this is a strong root. At a minimum this seems to be a behavioural change that would warrant a CSR request. This also seems to be something that the serviceability folk should be made aware of and have a chance to comment on. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Mon Sep 14 02:04:26 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 02:04:26 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Fri, 11 Sep 2020 21:37:57 GMT, Daniel D. Daugherty wrote: >> Yes, exactly. > > Thanks for confirmation. From the spec I'm not clear on exactly what JVMTI_HEAP_REFERENCE_MONITOR is intended to be. Serviceability folk should be giving some input here though. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Mon Sep 14 02:17:23 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 02:17:23 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <0gKQJq1wgjTtGesVdOc7DGfjexnEkm_AcbNUSbYWSTk=.2b5edf85-1cb4-4818-92fe-c4dbf4535b6e@github.com> References: <0gKQJq1wgjTtGesVdOc7DGfjexnEkm_AcbNUSbYWSTk=.2b5edf85-1cb4-4818-92fe-c4dbf4535b6e@github.com> Message-ID: On Mon, 14 Sep 2020 02:01:58 GMT, David Holmes wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> coleenp CR - changes to resolve Coleen's comments. > > Overall cleanup looks good. More complicated than I had expected, but then this changes the GC races with the deflation > code. > Not sure about the JVM TI changes! Please note that I made comments on the "resolved" discussion on heapDumper.cpp and jvmtiTagMap.cpp but they were not mentioned in the review email and are hidden in the UI unless you explicitly click on "Show resolved".
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From minqi at openjdk.java.net Mon Sep 14 03:41:46 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 14 Sep 2020 03:41:46 GMT Subject: Integrated: 8252689: Classes are loaded from jrt:/java.base even when CDS is used In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:31:02 GMT, Yumin Qi wrote: > The java.util.jar.Manifest related classes are not archived since they are neither in the classlist nor loaded in the dump process. > Manually creating a dummy Manifest object will cause those classes to be loaded during the dump and archived. This pull request has now been integrated. Changeset: f978f6fe Author: Yumin Qi URL: https://git.openjdk.java.net/jdk/commit/f978f6fe Stats: 35 lines in 3 files changed: 13 ins; 19 del; 3 mod 8252689: Classes are loaded from jrt:/java.base even when CDS is used Reviewed-by: iklam, ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/134 From david.holmes at oracle.com Mon Sep 14 04:10:11 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 14 Sep 2020 14:10:11 +1000 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com> Message-ID: <7ae974f3-d2c9-76bb-2451-a743e30b81a8@oracle.com> +1 No need to prefix all new metaspace/* files with ms. Thanks, David On 13/09/2020 8:54 am, Reingruber, Richard wrote: > Hi Thomas, > > I'd think the 'ms' prefix is not really needed. There are quite a few gc implementations and there it may be helpful to add a prefix to the source files. But for Metaspace the distinction by directory should be sufficient. > > Just my .02 :) > Richard. > > From: Thomas Stüfe > Sent: Saturday, 12
September 2020 12:24 > To: Coleen Phillimore ; Leo Korinth ; Reingruber, Richard ; Doerr, Martin ; Albert Yang > Cc: Hotspot dev runtime ; Hotspot-Gc-Dev > Subject: Re: RFR: Implementation of JEP 387: Elastic Metaspace (round two) > > Hi Reviewers, > > about the file renamings with the ms prefix, could I ask for more opinions? I am fine with both new and old variant, but at the moment it's a tie, since Coleen's and Leo's requests contradict each other. > > Thanks a lot! > > .:Thomas > 1) Aesthetics > > Some of you requested style changes so there are a lot: > > - Most files in memory/metaspace/ now follow a common theme with a common prefix ("ms"). The only exception is metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow up (see JDK-8251342). > > > I really don't like the ms prefix on all the files at all. I didn't think there were any conflicting names in the metaspace directory but if there were, they could be renamed. Now the class names don't match the file names! The other thing about sort of generic names in the metaspace directory, even if they don't match other names in the system, like binList.hpp or blockTree.hpp, is that they can provide a hint to people before they write similar code. These could be a basis for generalization into the utilities directory if possible. > > http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html > > For example in this case, the class name is MetaspaceContext which is a much better name than MSContext. > > Sorry I didn't object to this sooner on the thread.
> From dholmes at openjdk.java.net Mon Sep 14 04:21:26 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 04:21:26 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 07:03:37 GMT, Aleksei Voitylov wrote: >> continuing the review thread from here https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068546.html >> >>> The downside of using JNI in these tests is that it complicates the >>> setup a bit for those that run jtreg directly and/or just build the JDK >>> and not the test libraries. You could reduce this burden a bit by >>> limiting the load library/isMusl check to Linux only, meaning isMusl >>> would not be called on other platforms. >>> >>> The alternative you suggest above might indeed be better. I assume you >>> don't mean splitting the tests but rather just adding a second @test >>> description so that the vm.musl case runs the test with a system >>> property that allows the test to know the expected load library path behavior. >> >> I have updated the PR to split the two tests into multiple @test s. >> >>> The updated comment in java_md.c in this looks good. A minor comment on >>> Platform.isBusybox is that Files.isSymbolicLink returning true implies that >>> the link exists so no need to check for exists too. Also the >>> if-then-else style for the new class in ProcessBuilder/Basic.java is >>> inconsistent with the rest of the test so it stands out. >> >> Thank you, these changes are done in the updated PR. >> >>> Given the repo transition this weekend then I assume you'll create a PR >>> for the final review at least. Also I see JEP 386 hasn't been targeted >>> yet but I assume Boris, as owner, will propose-to-target and wait for it >>> to be targeted before it is integrated. >> >> Yes. How can this be best accomplished with the new git workflow?
>> - we can continue the review process till the end and I will request the integration to happen only after the JEP is >> targeted. I guess this step is now done by typing "slash integrate" in a comment. >> - we can pause the review process now until the JEP is targeted. >> >> In the first case I'm kindly asking the Reviewers who already chimed in on that to re-confirm the review here. > > Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8247589: Implementation of Alpine Linux/x64 Port Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From dholmes at openjdk.java.net Mon Sep 14 04:21:27 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 04:21:27 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: <6y6Cyk85aOiF6smCGSK5VH-rBadloV2gW8iVIW1e9BE=.47f56d70-2a7a-4951-a058-10add4118b31@github.com> On Wed, 9 Sep 2020 00:08:35 GMT, David Holmes wrote: >> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8247589: Implementation of Alpine Linux/x64 Port > > Attempting to use the GitHub UI for further review. If this doesn't work out well I will revert to direct email. Updates look good. Nothing further from me. 
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From dholmes at openjdk.java.net Mon Sep 14 04:21:27 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 04:21:27 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 07:36:57 GMT, Aleksei Voitylov wrote: >> test/hotspot/jtreg/runtime/StackGuardPages/exeinvoke.c line 282: >> >>> 280: >>> 281: pthread_attr_init(&thread_attr); >>> 282: pthread_attr_setstacksize(&thread_attr, stack_size); >> >> Just a comment in response to the explanation as to why this change is needed. If the default thread stacksize under >> musl is insufficient to successfully attach such a thread to the VM then this will cause problems for applications that >> embed the VM directly (or which otherwise directly attach existing threads). > > This fix https://git.musl-libc.org/cgit/musl/commit/src/aio/aio.c?id=1a6d6f131bd60ec2a858b34100049f0c042089f2 > addresses the problem for recent versions of musl. The test passes on a recent Alpine Linux 3.11.6 (musl 1.1.24) and > fails on Alpine Linux 3.8.2 (musl 1.1.19) without this test fix. There are still older versions of the library in the > wild, hence the test fix. The mitigation for such users would be a distro upgrade. Thanks for the additional info on this. 
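For readers following the exeinvoke.c exchange above, the pattern under discussion can be sketched as follows. This is a minimal illustration, assuming a POSIX threads environment; `run_with_stack_size` and `mark_done` are hypothetical names and this is not the code from the test. It requests an explicit stack size through `pthread_attr_setstacksize` rather than relying on the libc default, which is the mitigation the test fix applies for older musl releases.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not taken from exeinvoke.c): runs fn(arg) on a new
// thread whose stack size is requested explicitly instead of relying on the
// libc default. Returns 0 on success, or the pthread error code of the first
// call that failed.
int run_with_stack_size(void* (*fn)(void*), void* arg, size_t stack_size) {
  assert(fn != nullptr);

  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0) return rc;

  // Fails with EINVAL if stack_size is below the platform minimum.
  rc = pthread_attr_setstacksize(&attr, stack_size);
  if (rc == 0) {
    pthread_t tid;
    rc = pthread_create(&tid, &attr, fn, arg);
    if (rc == 0) rc = pthread_join(tid, nullptr);
  }
  pthread_attr_destroy(&attr);
  return rc;
}

// Hypothetical thread body: records that it ran.
void* mark_done(void* arg) {
  *static_cast<int*>(arg) = 1;
  return nullptr;
}
```

A caller would pass the thread function, its argument, and a size such as `1024 * 1024`. As noted in the thread, an application that attaches its own threads to the VM would have to size their stacks itself in a similar way on affected musl versions.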
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From avoitylov at openjdk.java.net Mon Sep 14 06:33:20 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Mon, 14 Sep 2020 06:33:20 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: References: Message-ID: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> On Mon, 14 Sep 2020 04:18:39 GMT, David Holmes wrote: >> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8247589: Implementation of Alpine Linux/x64 Port > > Marked as reviewed by dholmes (Reviewer). thank you Alan, Erik, and David! When the JEP becomes Targeted, I'll use this PR to integrate the changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/49 From jiefu at openjdk.java.net Mon Sep 14 07:10:46 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Sep 2020 07:10:46 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 Message-ID: Hi all, JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' The fix returns 'Handle()' instead of 'NULL'. 
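The kind of failure described here can be reproduced in isolation. The sketch below is an assumption-laden illustration, not the HotSpot source: `Handle` is a simplified stand-in class and `no_archived_object` is a hypothetical function. It shows why a function returning a class type cannot `return NULL` when that class has no implicit conversion from an integer constant, and why returning a default-constructed object works.

```cpp
#include <cassert>

// Hypothetical stand-in for a handle class: a small value type wrapping a
// pointer slot. It has no constructor that accepts an integer, so an
// integer-typed NULL cannot be converted to it implicitly.
class Handle {
 public:
  Handle() : _slot(nullptr) {}  // the "null" handle
  explicit Handle(void** slot) : _slot(slot) {
    assert(slot != nullptr);    // non-default handles must refer to a slot
  }
  bool is_null() const { return _slot == nullptr; }

 private:
  void** _slot;
};

// Hypothetical function in the style of the one that broke the build.
Handle no_archived_object() {
  // return NULL;   // does not compile: NULL is not convertible to Handle
  return Handle();  // default-constructed, null handle
}
```

With this definition, `no_archived_object().is_null()` evaluates to true, so callers that test the result for null keep working.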
Best regards, Jie ------------- Commit messages: - 8253084: Zero VM is broken after JDK-8252689 Changes: https://git.openjdk.java.net/jdk/pull/147/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=147&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253084 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/147.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/147/head:pull/147 PR: https://git.openjdk.java.net/jdk/pull/147 From jiefu at openjdk.java.net Mon Sep 14 07:10:46 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Sep 2020 07:10:46 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 07:04:41 GMT, Jie Fu wrote: > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 > > The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' > The fix returns 'Handle()' instead of 'NULL'. > > Best regards, > Jie ------------- PR: https://git.openjdk.java.net/jdk/pull/147 From iklam at openjdk.java.net Mon Sep 14 07:19:35 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Sep 2020 07:19:35 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 07:04:41 GMT, Jie Fu wrote: > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 > > The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' > The fix returns 'Handle()' instead of 'NULL'. > > Best regards, > Jie LGTM. This is a trivial change so you don't need to wait for a second reviewer. Thanks! ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/147 From dholmes at openjdk.java.net Mon Sep 14 07:27:06 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Sep 2020 07:27:06 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 07:04:41 GMT, Jie Fu wrote: > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 > > The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' > The fix returns 'Handle()' instead of 'NULL'. > > Best regards, > Jie Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/147 From jiefu at openjdk.java.net Mon Sep 14 07:27:07 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Sep 2020 07:27:07 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 07:16:45 GMT, Ioi Lam wrote: >> Hi all, >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 >> >> The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' >> The fix returns 'Handle()' instead of 'NULL'. >> >> Best regards, >> Jie > > LGTM. This is a trivial change so you don't need to wait for a second reviewer. Thanks! Thanks Ioi for your review. ------------- PR: https://git.openjdk.java.net/jdk/pull/147 From jiefu at openjdk.java.net Mon Sep 14 07:27:07 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Sep 2020 07:27:07 GMT Subject: RFR: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: <0HwC5WaS8Qk5ND3mGvxRB8nLY-M4nWIS14_04iicnyc=.22a00f0c-a936-4485-ad2c-b41b6442aa59@github.com> On Mon, 14 Sep 2020 07:20:49 GMT, David Holmes wrote: >> Hi all, >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 >> >> The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' >> The fix returns 'Handle()' instead of 'NULL'. 
>> >> Best regards, >> Jie > > Marked as reviewed by dholmes (Reviewer). Thanks David for your review. ------------- PR: https://git.openjdk.java.net/jdk/pull/147 From jiefu at openjdk.java.net Mon Sep 14 07:27:07 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 14 Sep 2020 07:27:07 GMT Subject: Integrated: 8253084: Zero VM is broken after JDK-8252689 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 07:04:41 GMT, Jie Fu wrote: > Hi all, > > JBS: https://bugs.openjdk.java.net/browse/JDK-8253084 > > The build fails due to 'NULL' can't be converted from type 'long' to type 'Handle' > The fix returns 'Handle()' instead of 'NULL'. > > Best regards, > Jie This pull request has now been integrated. Changeset: 779d2c34 Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/779d2c34 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8253084: Zero VM is broken after JDK-8252689 Reviewed-by: iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/147 From stefank at openjdk.java.net Mon Sep 14 08:20:30 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 14 Sep 2020 08:20:30 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: Message-ID: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> On Fri, 11 Sep 2020 21:45:08 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. 
I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > coleenp CR - changes to resolve Coleen's comments. GC changes looks good to me. ------------- Marked as reviewed by stefank (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/135 From leo.korinth at oracle.com Mon Sep 14 11:29:52 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Mon, 14 Sep 2020 13:29:52 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: <7ae974f3-d2c9-76bb-2451-a743e30b81a8@oracle.com> References: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com> <7ae974f3-d2c9-76bb-2451-a743e30b81a8@oracle.com> Message-ID: <43985fe4-5f4f-c6ce-6d28-8de6cbdd349d@oracle.com> I am okay with reverting. I am very sorry for creating this mess for you, Thomas. Hopefully it will be solved in the build system in the future. Thanks, Leo On 14/09/2020 06:10, David Holmes wrote: > +1 > > No need to prefix all new metaspace/* files with ms. > > Thanks, > David > > On 13/09/2020 8:54 am, Reingruber, Richard wrote: >> Hi Thomas, >> >> I'd think the 'ms' prefix is not really needed. There are quite a few gc implementations and there it may be helpful to add a prefix to the source files. But for Metaspace the distinction by directory should be sufficient. >> >> Just my .02 :) >> Richard. >> >> From: Thomas Stüfe >> Sent: Saturday, 12 September 2020 12:24 >> To: Coleen Phillimore ; Leo Korinth ; Reingruber, Richard ; Doerr, Martin ; Albert Yang >> Cc: Hotspot dev runtime ; Hotspot-Gc-Dev >> Subject: Re: RFR: Implementation of JEP 387: Elastic Metaspace (round two) >> >> Hi Reviewers, >> >> about the file renamings with the ms prefix, could I ask for more opinions? I am fine with both new and old variant, but at the moment it's a tie, since Coleen's and Leo's requests contradict each other. >> >> Thanks a lot!
>> >> .:Thomas >> 1) Aesthetics >> >> Some of you requested style changes so there are a lot: >> >> - Most files in memory/metaspace/ now follow a common theme with a common prefix ("ms"). The only exception is metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow up (see JDK-8251342). >> >> >> I really don't like the ms prefix on all the files at all. I didn't think there were any conflicting names in the metaspace directory but if there were, they could be renamed. Now the class names don't match the file names! The other thing about sort of generic names in the metaspace directory, even if they don't match other names in the system, like binList.hpp or blockTree.hpp, is that they can provide a hint to people before they write similar code. These could be a basis for generalization into the utilities directory if possible. >> >> http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html >> >> For example in this case, the class name is MetaspaceContext which is a much better name than MSContext. >> >> Sorry I didn't object to this sooner on the thread. >> From rkennke at openjdk.java.net Mon Sep 14 11:35:41 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 11:35:41 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 08:18:02 GMT, Stefan Karlsson wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> coleenp CR - changes to resolve Coleen's comments. > > GC changes looks good to me.
Shenandoah changes are not complete: ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ ObjectSynchronizer::oops_do(&mark_cl); ^~~~~~~ I will have a look into how to resolve this. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From shade at openjdk.java.net Mon Sep 14 13:05:48 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 14 Sep 2020 13:05:48 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 Message-ID: It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at fixing this, I think we need to use `u2` consistently for hash code computations. ------------- Commit messages: - No need to cast to short - Indenting - Minor cleanup - Another try, use u2 consistently - 8253089: Windows (MSVC 2017) build fails after JDK-8243208 Changes: https://git.openjdk.java.net/jdk/pull/150/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=150&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253089 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/150.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/150/head:pull/150 PR: https://git.openjdk.java.net/jdk/pull/150 From coleen.phillimore at oracle.com Mon Sep 14 13:12:53 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 14 Sep 2020 09:12:53 -0400 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: <43985fe4-5f4f-c6ce-6d28-8de6cbdd349d@oracle.com> References: <1e77549c-5ad1-6a6b-63e7-b539e7639ba8@oracle.com> <7ae974f3-d2c9-76bb-2451-a743e30b81a8@oracle.com> <43985fe4-5f4f-c6ce-6d28-8de6cbdd349d@oracle.com> Message-ID: I think Leo's original point that a few of the names are too generic was a good one: http://cr.openjdk.java.net/~stuefe/jep387/review/2020-08-11/webrev-all/webrev/src/hotspot/share/memory/metaspace/counter.hpp.html This should probably be memRangeCounter.hpp.
There's an RFE to make this a useful utility, so this should be good for now. http://cr.openjdk.java.net/~stuefe/jep387/review/2020-08-11/webrev-all/webrev/src/hotspot/share/memory/metaspace/internStat.hpp.html This should be internalStats.hpp http://cr.openjdk.java.net/~stuefe/jep387/review/2020-08-11/webrev-all/webrev/src/hotspot/share/memory/metaspace/settings.hpp.html Maybe Settings should be prefixed with Metaspace like MetaspaceSettings - and metaspaceSettings.hpp even though it's already in namespace metaspace. Like metaspaceContext.hpp/cpp. Thanks, Coleen On 9/14/20 7:29 AM, Leo Korinth wrote: > I am okay with reverting. I am very sorry for creating this mess for > you, Thomas. Hopefully it will be solved in the build system in the > future. > > Thanks, > Leo > > On 14/09/2020 06:10, David Holmes wrote: >> +1 >> >> No need to prefix all new metaspace/* files with ms. >> >> Thanks, >> David >> >> On 13/09/2020 8:54 am, Reingruber, Richard wrote: >>> Hi Thomas, >>> >>> I'd think the 'ms' prefix is not really needed. There are quite a >>> few gc implementations and there it may be helpful to add a prefix >>> to the source files. But for Metaspace the distinction by directory >>> should be sufficient. >>> >>> Just my .02 :) >>> Richard. >>> >>> From: Thomas Stüfe >>> Sent: Saturday, 12 September 2020 12:24 >>> To: Coleen Phillimore ; Leo Korinth >>> ; Reingruber, Richard >>> ; Doerr, Martin ; >>> Albert Yang >>> Cc: Hotspot dev runtime ; >>> Hotspot-Gc-Dev >>> Subject: Re: RFR: Implementation of JEP 387: Elastic Metaspace >>> (round two) >>> >>> Hi Reviewers, >>> >>> about the file renamings with the ms prefix, could I ask for more >>> opinions? I am fine with both new and old variant, but at the moment >>> it's a tie, since Coleen's and Leo's requests contradict each other. >>> >>> Thanks a lot!
>>> >>> .:Thomas >>> 1) Aesthetics >>> >>> Some of you requested style changes so there are a lot: >>> >>> - Most files in memory/metaspace/ now follow a common theme with a >>> common prefix ("ms"). The only exception is >>> metaspacesSizesSnapshot.(cpp|hpp) which I plan to remove in a follow >>> up (see JDK-8251342). >>> >>> >>> I really don't like the ms prefix on all the files at all. I didn't >>> think there were any conflicting names in the metaspace directory >>> but if there were, they could be renamed. Now the class names don't >>> match the file names! The other thing about sort of generic names >>> in the metaspace directory, even if they don't match other names in >>> the system, like binList.hpp or blockTree.hpp, is that they can >>> provide a hint to people before they write similar code. These >>> could be a basis for generalization into the utilities directory if >>> possible. >>> >>> http://cr.openjdk.java.net/~stuefe/jep387/review-2020-09-04/webrev-all/webrev/src/hotspot/share/memory/metaspace/msContext.cpp.html >>> >>> >>> For example in this case, the class name is MetaspaceContext which >>> is a much better name than MSContext. >>> >>> Sorry I didn't object to this sooner on the thread. >>> From leo.korinth at oracle.com Mon Sep 14 13:27:27 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Mon, 14 Sep 2020 15:27:27 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: Message-ID: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> Hi Thomas, I have a few more comments, feel free to ignore them or incorporate them in the future as I understand you want to be able to push sometime and not do changes forever. VirtualSpaceList.hpp 67 // Whether this list can expand by allocating new nodes. 68 const bool _can_expand; 69 70 // Whether this list can be purged.
71 const bool _can_purge; _can_expand and _can_purge are effectively the same variable: both are const and always initialized to the same value in the constructor. I think it would be better to have just one variable, maybe one of type: enum class VirtualSpaceType {CompressedMetaSpace, LinkedMetaSpace} (if you like the idea, please feel free to improve on my name suggestion). 98 virtual ~VirtualSpaceList(); Why is the destructor virtual? I could find no derived classes. VirtualSpaceList.cpp Inconsistent indenting of initializer list at lines 45 and 58. 68 // Create the first node spanning the existing ReservedSpace. This will be the only node created 69 // for this list since we cannot expand. 70 VirtualSpaceNode* vsn = VirtualSpaceNode::create_node(rs, _commit_limiter, 71 &_reserved_words_counter, &_committed_words_counter); 72 assert(vsn != NULL, "node creation failed"); Maybe instead assert(vsn != NULL, "node creation can not fail and reach this point") to ensure the invariant. "node creation failed" kind of implies that create_node can fail (and return) and then one can wonder why we assert on it not happening. 112 if (_first_node == NULL || 113 _first_node->free_words() == 0) { 114 115 // Since all allocations from a VirtualSpaceNode happen in 116 // root-chunk-size units, and the node size must be root-chunk-size aligned, 117 // we should never have left-over space. 118 assert(_first_node == NULL || 119 _first_node->free_words() == 0, "Sanity"); The assert above just looks weird, as it perfectly mirrors the preceding if-statement. 123 if (_can_expand) { 124 create_new_node(); 125 UL2(debug, "added new node (now: %d).", num_nodes()); 126 } else { 127 UL(debug, "list cannot expand."); 128 return NULL; // We cannot expand this list. 129 } 130 } 131 132 Metachunk* c = _first_node->allocate_root_chunk(); 133 134 assert(c != NULL, "This should have worked"); 135 136 return c; And I think the test for _first_node->free_words() == 0 is quite confusing.
I understand that whenever free_words() != 0 that also free_words() >= chunklevel::MAX_CHUNK_WORD_SIZE, but why not test for the latter? Especially as that is what is tested for in VirtualSpaceNode::allocate_root_chunk(). 153 int num = 0, num_purged = 0; 154 while (vsn != NULL) { 155 VirtualSpaceNode* next_vsn = vsn->next(); 156 bool purged = vsn->attempt_purge(freelists); 157 if (purged) { 158 // Note: from now on do not dereference vsn! 159 UL2(debug, "purged node @" PTR_FORMAT ".", p2i(vsn)); 160 if (_first_node == vsn) { 161 _first_node = next_vsn; 162 } 163 DEBUG_ONLY(vsn = (VirtualSpaceNode*)((uintptr_t)(0xdeadbeef));) 164 if (prev_vsn != NULL) { 165 prev_vsn->set_next(next_vsn); 166 } 167 num_purged++; 168 _nodes_counter.decrement(); 169 } else { 170 prev_vsn = vsn; 171 } 172 vsn = next_vsn; 173 num++; 174 } "int num" is unused except for being incremented, remove the variable. Finally, why was it chosen that each node could carry precisely two chunk-roots? It seems somewhat easier to have a one-to-one relation between chunk-root and VirtualSpaceNode (when we are already so close to one). VirtualSpaceNode.hpp 102 // Start pointer of the area. 103 MetaWord* const _base; How does this differ from _rs._base? Really needed? 105 // Size, in words, of the whole node 106 const size_t _word_size; Can we not calculate this from _rs.size()? Thanks, Leo From dcubed at openjdk.java.net Mon Sep 14 13:43:01 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 13:43:01 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <4S1AuIQpJN081C7m_QDWkifv1yb7WHEzB_qbZ1IwWSA=.69fe7145-edad-46d9-bc48-e528489d3278@github.com> References: <4S1AuIQpJN081C7m_QDWkifv1yb7WHEzB_qbZ1IwWSA=.69fe7145-edad-46d9-bc48-e528489d3278@github.com> Message-ID: On Mon, 14 Sep 2020 01:43:01 GMT, David Holmes wrote: >> We had to move that code to make om_flush() happy. 
>> om_flush() accesses the (weak) oops in the monitor list >> after the thread is off the threads list so the barrier change >> had to move. It will move back in part3. > > Just to be clear the thread is still on the threads_list at this point. Sorry, I should have been more clear here. We had to move that code after the om_flush() call because om_flush() accesses the (weak) oops in the monitor list after the original location of the code. The om_flush() call is made in the code that removes the thread from the threads list, but the removal happens after the call to om_flush(). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 13:50:31 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 13:50:31 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 08:18:02 GMT, Stefan Karlsson wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> coleenp CR - changes to resolve Coleen's comments. > > GC changes looks good to me. @stefank - Thanks for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 13:50:31 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 13:50:31 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 13:45:16 GMT, Daniel D. Daugherty wrote: >> GC changes looks good to me. > > @stefank - Thanks for the review! @dholmes-ora - Thanks for the review. 
I think I have marked all the comments you were concerned about as unresolved. I agree that we need a CSR for the change in behavior and I'll look into getting that going. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 13:50:31 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 13:50:31 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 13:46:56 GMT, Daniel D. Daugherty wrote: >> @stefank - Thanks for the review! > > @dholmes-ora - Thanks for the review. I think I have marked all the comments > you were concerned about as unresolved. I agree that we need a CSR for the > change in behavior and I'll look into getting that going. @kimbarrett - Thanks for the review. I believe @fisk is going to address your comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 13:50:31 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 13:50:31 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 13:47:49 GMT, Daniel D. Daugherty wrote: >> @dholmes-ora - Thanks for the review. I think I have marked all the comments >> you were concerned about as unresolved. I agree that we need a CSR for the >> change in behavior and I'll look into getting that going. > > @kimbarrett - Thanks for the review. I believe @fisk is going to > address your comments. @rkennke - Thanks for the review. I believe @fisk is going to address your comments.
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Mon Sep 14 14:18:27 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 14 Sep 2020 14:18:27 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 13:48:25 GMT, Daniel D. Daugherty wrote: >> @kimbarrett - Thanks for the review. I believe @fisk is going to >> address your comments. > > @rkennke - Thanks for the review. I believe @fisk is going to address > your comments. Hi Kim, Here is a partial reply to your review. Thanks for reviewing! Hmm seems like your email was only sent to shenandoah-dev. Not sure if that was intended. I'm not subscribed to that mailing list, so I will send my reply through github and hope for the best. > _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on > [shenandoah-dev](mailto:shenandoah-dev at openjdk.java.net):_ > ------------------------------------------------------------------------------ > src/hotspot/share/oops/weakHandle.cpp > 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : > 37 _obj(storage->allocate()) { > 38 assert(obj != NULL, "no need to create weak null oop"); > > Please format this differently so the ctor-init-list is more easily > distinguished from the body. I don't care that much which of the several > alternatives is used. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 244 // Check that object() and set_object() are called from the right context: > 245 static void check_object_context() { > > This seems like checking we would normally only do in a debug build. Is this > really supposed to be done in product builds too? (It's written to support > that, just wondering if that's really what we want.) 
Maybe these aren't > called very often so it doesn't matter? I also see that guarantee (rather > than assert) is used a fair amount in this and related code. I don't think I have a preference here. As you say, it seems a bit mixed. I would be okay with both. Do you want them to be asserts? > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 251 JavaThread* jt = (JavaThread*)self; > > Use self->as_Java_thread(). > > Later: Coleen already commented on this and it's been fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); > 250 if (self->is_Java_thread()) { > > Maybe instead > > if (self->is_Java_thread()) { > ... > } else { > guarantee(self->is_VM_thread(), "must be"); > } > > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/objectMonitor.cpp > Both ObjectMonitor::object() and ObjectMonitor::object_peek() have > 265 if (_object.is_null()) { > 266 return NULL; > > Should we really be calling those functions when in such a state? That seems > like it might be a bug in the callers? > > OK, I think I see some places where object_peek() might need such protection > because of races, in src/hotspot/share/runtime/synchronizer.cpp. And > because we don't seem to ever release() the underlying WeakHandles. > > 514 if (m->object_peek() == NULL) { > 703 if (cur_om->object_peek() == NULL) { > > But it still seems like it might be a bug to call object() in such a state. The problem is Type Stable Memory (TSM). The idea is that all monitors are allocated in blocks of many (when the object they are associated with is not yet known), and are then re-used, switching object association back and forth. They are also never freed. 
The changes in this RFR were originally tied together with a patch that also removes TSM. But this has been split into separate parts by Dan in order to perform the OopStorage migration first, and then removing TSM. So in this patch, we are still dealing with block-allocation of monitors before they have an associated object, and changing of object association. Therefore, what you are saying is 100% true, but I would like to defer fixing that until the follow-up patch that removes TSM. In that patch, the accessors are changed to do exactly what you expect them to (resolve/peek), as the ObjectMonitor instances are allocated with a single object association in mind, installed in the constructor, and get deleted after deflation, with destructors run. So I hope we can agree to wait until that subsequent patch with addressing such concerns. We thought reviewing the whole thing as a single patch was too much clutter. > Related, see next comment. > > Later: Looks like Coleen questioned this too. I'm not sure I understand > Erik's response though. When do we look at monitors that might not be > constructed / initialized? That seems like a bad thing to do. Oh, but the > whole creation path for OMs is kind of evil right now. I see... And that's > planned to be fixed in later work. OK. Yes, exactly. It is currently quite evil, but will soon be cleaned up to look quite ordinary. > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > 1548 guarantee(m->object_peek() == NULL, "invariant"); > > Later: But see previous comment. Some of this might be relevant later though. > > object_peek() seemed like the wrong operation here. I thought this was > attempting to verify that the underlying WeakHandle has been released. But > peeking doesn't ensure that. Oh, but we don't actually release() the > WeakHandle when we "free" an OM. We're just pushing the OM on the free > list.
Which means the GC will continue to examine the associated OopStorage > entry (and discover it's NULL). So there's some cost to not releasing the > WeakHandle when putting an OM on the free list. Of course, it means there > won't be any allocations when taking off the free list; I'm not sure which > way is better. Yeah, so we never really release the monitors, because the whole monitor can't be freed yet in this patch. > But this makes me go back and wonder about whether object_peek() should have > the _object.is_null() check. After creation it seems like that should never > be true. Before the switch away from TSM (next patch), we must assume that monitors have not yet been properly initialized, due to block allocation of a whole bunch of monitors at a time, before they have been associated with objects. They are also exposed to iterators and such, before any object association has been made. But very very soon, that will be a memory of the past. > ------------------------------------------------------------------------------ > src/hotspot/share/runtime/synchronizer.cpp > > The old code in chk_in_use_entry seems wrong. It checked for a null > object() and recorded that as an error. But then it went on and attempted > to use it as if it was not null. That's been fixed by the change. However, > the change no longer treats a null as an error. Probably this is because > it's weak, and so could have become null. But is that really possible for > an "in use" monitor? Well, the monitors are added to some in-use list at inflation time. The instant after, the monitor could have become unused. But it will still be floating around on the in-use list, until deflation time. And yes, now it is suddenly very much valid for a monitor that is no longer used (but placed on the in-use list when it was used), to have a NULL object, due to GC getting to weak processing clearing the oop, before deflation gets to it.
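
[Editor's sketch] The two situations described above (monitors that are block-allocated before any object is associated, and in-use monitors whose weak referent has since been cleared) can be modelled in a few lines of standalone C++. This is an analogy only and not HotSpot code: `Object`, `Monitor`, `object_is_null()` and `count_null_objects()` are hypothetical names, `std::weak_ptr` merely stands in for an OopStorage-backed WeakHandle, and dropping the last `std::shared_ptr` stands in for the GC clearing the weak oop.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Java object.
struct Object {};

// Hypothetical stand-in for ObjectMonitor. std::weak_ptr models the weak
// reference to the associated object; this is an analogy and not how
// HotSpot's WeakHandle/OopStorage is implemented.
class Monitor {
  std::weak_ptr<Object> _object;  // empty until an object is associated
public:
  void set_object(const std::shared_ptr<Object>& obj) {
    // Mirrors the spirit of "no need to create weak null oop".
    assert(obj != nullptr && "no need to create a weak reference to null");
    _object = obj;
  }

  // True if no object was ever associated, or if the referent is gone.
  bool object_is_null() const { return _object.expired(); }
};

// Counts monitors on a (hypothetical) in-use list whose object is null.
std::size_t count_null_objects(const std::vector<Monitor>& in_use) {
  std::size_t n = 0;
  for (const Monitor& m : in_use) {
    if (m.object_is_null()) {
      ++n;
    }
  }
  return n;
}
```

In this model a freshly created block of monitors reports null for every entry, associating one monitor with a live object makes that entry non-null, and releasing the last strong reference makes it null again while the monitor is still on the list. That is the state a list walker has to tolerate until deflation catches up.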
> ------------------------------------------------------------------------------ > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/ObjectMonitor.java > 94 public OopHandle object() { > 95 Address objAddr = addr.getAddressAt(objectFieldOffset); > 96 if (objAddr == null) { > 97 return null; > 98 } > 99 return objAddr.getOopHandleAt(0); > > How about something a little bit less abstraction smashing? This is already the "norm", and the way that all other handle loads look in the SA today, and therefore the SA does not support any GC with load barriers properly. While ideally, there would be something equivalent to the Access API in the SA to abstract the access details, I think it seems like way beyond the scope of this change to implement something like that for the SA, as part of this change. Again, thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Mon Sep 14 14:23:23 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 14 Sep 2020 14:23:23 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 14:15:41 GMT, Erik Österlund wrote: >> @rkennke - Thanks for the review. I believe @fisk is going to address >> your comments. > > Hi Kim, > > Here is a partial reply to your review. Thanks for reviewing! > Hmm seems like your email was only sent to shenandoah-dev. Not sure if that was intended. I'm not subscribed to that > mailing list, so I will send my reply through github and hope for the best.
>> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on >> [shenandoah-dev](mailto:shenandoah-dev at openjdk.java.net):_ >> ------------------------------------------------------------------------------ >> src/hotspot/share/oops/weakHandle.cpp >> 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : >> 37 _obj(storage->allocate()) { >> 38 assert(obj != NULL, "no need to create weak null oop"); >> >> Please format this differently so the ctor-init-list is more easily >> distinguished from the body. I don't care that much which of the several >> alternatives is used. >> >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/objectMonitor.cpp >> 244 // Check that object() and set_object() are called from the right context: >> 245 static void check_object_context() { >> >> This seems like checking we would normally only do in a debug build. Is this >> really supposed to be done in product builds too? (It's written to support >> that, just wondering if that's really what we want.) Maybe these aren't >> called very often so it doesn't matter? I also see that guarantee (rather >> than assert) is used a fair amount in this and related code. > > I don't think I have a preference here. As you say, it seems a bit mixed. I would be okay with both. Do you want them > to be asserts? >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/objectMonitor.cpp >> 251 JavaThread* jt = (JavaThread*)self; >> >> Use self->as_Java_thread(). >> >> Later: Coleen already commented on this and it's been fixed. >> >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/objectMonitor.cpp >> 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); >> 250 if (self->is_Java_thread()) { >> >> Maybe instead >> >> if (self->is_Java_thread()) { >> ... 
>> } else { >> guarantee(self->is_VM_thread(), "must be"); >> } >> >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/objectMonitor.cpp >> Both ObjectMonitor::object() and ObjectMonitor::object_peek() have >> 265 if (_object.is_null()) { >> 266 return NULL; >> >> Should we really be calling those functions when in such a state? That seems >> like it might be a bug in the callers? >> >> OK, I think I see some places where object_peek() might need such protection >> because of races, in src/hotspot/share/runtime/synchronizer.cpp. And >> because we don't seem to ever release() the underlying WeakHandles. >> >> 514 if (m->object_peek() == NULL) { >> 703 if (cur_om->object_peek() == NULL) { >> >> But it still seems like it might be a bug to call object() in such a state. > > The problem is Type Stable Memory (TSM). The idea is that all monitors are allocated in blocks of many (when the object > they are associated with is not yet known), and are then re-used, switching object association back and forth. They are > also never freed. The changes in this RFR were originally tied together with a patch that also removes TSM. But this > has been split into separate parts by Dan in order to perform the OopStorage migration first, and then removing TSM. So > in this patch, we are still dealing with block-allocation of monitors before they have an associated object, and > changing of object association. Therefore, what you are saying is 100% true, but I would like to defer fixing that > until the follow-up patch that removes TSM. In that patch, the accessors are changed to do exactly what you expect them > to (resolve/peek), as the ObjectMonitor instances are allocated with a single object association in mind, installed in > the constructor, and get deleted after deflation, with destructors run. So I hope we can agree to wait until that > subsequent patch with addressing such concerns.
We thought reviewing the whole thing as a single patch was too much > clutter. >> Related, see next comment. >> >> Later: Looks like Coleen questioned this too. I'm not sure I understand >> Erik's response though. When do we look at monitors that might not be >> constructed / initialized? That seems like a bad thing to do. Oh, but the >> whole creation path for OMs is kind of evil right now. I see... And that's >> planned to be fixed in later work. OK. > > Yes, exactly. It is currently quite evil, but will soon be cleaned up to look quite ordinary. > >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/synchronizer.cpp >> 1548 guarantee(m->object_peek() == NULL, "invariant"); >> >> Later: But see previous comment. Some of this might be relevat later though. >> >> object_peek() seemed like the wrong operation here. I thought this was >> attempting to verify that the underlying WeakHandle has been released. But >> peeking doesn't ensure that. Oh, but we don't actually release() the >> WeakHandle when we "free" an OM. We're just pushing the OM on the free >> list. Which means the GC will continue to examine the associated OopStorage >> entry (and discover it's NULL). So there's some cost to not releasing the >> WeakHandle when putting an OM on the free list. Of course, it means there >> won't be any allocations when taking off the free list; I'm not sure which >> way is better. > > Yeah, so we never really release the monitors, because the whole monitor can't be freed yet in this patch. > >> But this makes me go back and wonder about whether object_peek() should have >> the _object.is_null() check. After creation it seems like that should never >> be true. > > Before the switch away from TSM (next patch), we must assume that monitors have not yet been properly initialized, due > to block allocation of a whole bunch of monitors at a time, before they have been associated with objects. 
They are > also exposed to iterators and such, before any object association has been made. But very very soon, that will be a > memory of the past. >> ------------------------------------------------------------------------------ >> src/hotspot/share/runtime/synchronizer.cpp >> >> The old code in chk_in_use_entry seems wrong. It checked for a null >> object() and recorded that as an error. But then it went on and attempted >> to use it as if it was not null. That's been fixed by the change. However, >> the change no longer treats a null as an error. Probably this is because >> it's weak, and so could have become null. But is that really possible for >> an "in use" monitor? > > Well, the monitors are added to some in-use list at deflation time. The instant after, the monitor could have become > unused. But it will still be floating around on the in-use list, until deflation time. And yes, now it is suddenly very > much valid for a monitor that is no longer used (but placed on the in-use list when it was used), to have a NULL > object, due to GC getting to weak processing clearing the oop, before deflation gets to it. >> ------------------------------------------------------------------------------ >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/ObjectMonitor.java >> 94 public OopHandle object() { >> 95 Address objAddr = addr.getAddressAt(objectFieldOffset); >> 96 if (objAddr == null) { >> 97 return null; >> 98 } >> 99 return objAddr.getOopHandleAt(0); >> >> How about something a little bit less abstraction smashing? > > This is already the "norm", and the way that all other handle loads look in the SA today, and therefore the SA does not > support any GC with load barriers properly. While ideally, there would be something equivalent to the Access API in the > SA to abstract the access details, I think it seems like way beyond the scope of this change to implement something > like that for the SA, as part of this change. Again, thanks for reviewing! 
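The in-use-list case described in this reply (a monitor that is still on the list while its weakly held object has already been cleared) can be illustrated with a loose analogy. This is only a sketch using `std::weak_ptr`; it is not HotSpot code, and `Obj`, `Monitor`, `count_reusable` and `demo` are hypothetical names invented for illustration. `OopStorage` and `WeakHandle` work differently in detail.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Java object; not a HotSpot type.
struct Obj { int id; };

// Hypothetical stand-in for ObjectMonitor: it refers to its object weakly,
// so the monitor alone does not keep the object alive.
struct Monitor {
  std::weak_ptr<Obj> object;

  // Similar in spirit to object_peek(): yields nullptr once the object is gone.
  Obj* object_peek() const { return object.lock().get(); }
};

// Counts monitors on the in-use list whose object has been cleared.
// A deflation pass could treat these as candidates for reuse.
inline int count_reusable(const std::vector<Monitor>& in_use) {
  int n = 0;
  for (const Monitor& m : in_use) {
    if (m.object_peek() == nullptr) {
      n++;
    }
  }
  return n;
}

// Small demonstration: dropping the last strong reference plays the role of
// the GC clearing the weak oop before deflation gets to the monitor.
inline void demo() {
  std::vector<Monitor> in_use(2);
  auto a = std::make_shared<Obj>(Obj{1});
  auto b = std::make_shared<Obj>(Obj{2});
  in_use[0].object = a;
  in_use[1].object = b;
  assert(count_reusable(in_use) == 0);
  b.reset();  // "GC" clears the object behind the second monitor
  assert(count_reusable(in_use) == 1);
}
```

The point of the analogy is only that a cleared referent on an in-use monitor is a legitimate state rather than an error, which matches the change to chk_in_use_entry discussed above.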
> Shenandoah changes are not complete: > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual > void ShenandoahFinalMarkingTask::work(uint)': > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl); > ^~~~~~~ > > I will have a look into how to resolve this. > > In order to fix this, I need one of two things: > > * A way to iterate oops of the global-list. Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors > during final-mark because they may be the only remaining reference to an object on the heap. > _or_ (even better) > > * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This > way we could mark that object concurrently. > > > Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be > scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor. > See: https://bugs.openjdk.java.net/browse/JDK-8251451 > > AFAICT, this may affect ZGC too (not 100% sure about this). So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. In other words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. 
The GC should not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically. Hope this makes sense. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Mon Sep 14 14:26:26 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 14:26:26 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 13:48:25 GMT, Daniel D. Daugherty wrote: > @rkennke - Thanks for the review. I believe @fisk is going to address > your comments. Actually, if I understand it correctly, OopStorage already gives us full barriers, so we should be covered, and we can simply revert JDK-8251451: http://cr.openjdk.java.net/~rkennke/8247281-shenandoah.patch (Is there a better way to propose amendments to a PR?!) I believe this probably also means that we don't need to scan object monitor lists during thread-scans, and let SATB (or whatever concurrent marking) take care of it. @fisk WDYT? ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Mon Sep 14 14:40:18 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 14 Sep 2020 14:40:18 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <0gKQJq1wgjTtGesVdOc7DGfjexnEkm_AcbNUSbYWSTk=.2b5edf85-1cb4-4818-92fe-c4dbf4535b6e@github.com> References: <0gKQJq1wgjTtGesVdOc7DGfjexnEkm_AcbNUSbYWSTk=.2b5edf85-1cb4-4818-92fe-c4dbf4535b6e@github.com> Message-ID: On Mon, 14 Sep 2020 02:01:58 GMT, David Holmes wrote: > Not sure about the JVM TI changes! Here is a generic comment. You mention that the specification doesn't make it clear whether the roots reported are strong or weak. This is true. There is no mention about roots being strong or weak here. 
Notably, all listed roots, are strong though. I believe that with the chosen interpretation of "root", it is implied that it is strong, and hence why strong vs weak roots is not discussed. According to the memory management glossary (https://www.memorymanagement.org/glossary/r.html#glossary-r), this is the definition of a root: "In tracing garbage collection, a root holds a reference or set of references to objects that are a priori reachable. The root set is used as the starting point in determining all reachable data." Note that, this definition of root makes it implicit that it is also strong. And after a long discussion with GC nerds in Stockholm, this is in fact also why we have IN_NATIVE oop accesses, rather than IN_ROOT oop accesses. Because the opinions differed too much about what a "root" is. But to me it really does look like the JVMTI interpretation of a root, is a strong root, and hence that this is precisely what the text is talking about when mentioning roots, and hence also why all the enumerated roots, are indeed strong roots. Same comments for the HPROF_GC_ROOT_* discussion, as for the JVMTI_HEAP_ROOT_* discussion. That is also why I think it is now a bug to report them, as they are no longer "roots" in the JVMTI/HPROF sense. But you are right, filing a CSR for the behavioural change is probably a good idea. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Mon Sep 14 14:40:18 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 14:40:18 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> On Mon, 14 Sep 2020 14:20:29 GMT, Erik ?sterlund wrote: > > Shenandoah changes are not complete: > > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual > > void ShenandoahFinalMarkingTask::work(uint)': > > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is > > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ > > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is > > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl); > > ^~~~~~~ > > I will have a look into how to resolve this. > > In order to fix this, I need one of two things: > > ``` > > * A way to iterate oops of the global-list. Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors > > during final-mark because they may be the only remaining reference to an object on the heap. > > _or_ (even better) > > > > * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This > > way we could mark that object concurrently. > > ``` > > > > > > Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be > > scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor. 
See: > > https://bugs.openjdk.java.net/browse/JDK-8251451 AFAICT, this may affect ZGC too (not 100% sure about this). > > So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last > reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference > processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. In other > words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. The GC should > not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically. > Hope this makes sense. Yes, definitely. I came to the same conclusion. Thank you! With the patch amended: http://cr.openjdk.java.net/~rkennke/8247281-shenandoah.patch I'm good with the change. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Mon Sep 14 14:43:38 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 14 Sep 2020 14:43:38 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> Message-ID: On Mon, 14 Sep 2020 14:24:02 GMT, Roman Kennke wrote: > > @rkennke - Thanks for the review. I believe @fisk is going to address > > your comments. > > Actually, if I understand it correctly, OopStorage already gives us full barriers, so we should be covered, and we can > simply revert JDK-8251451: http://cr.openjdk.java.net/~rkennke/8247281-shenandoah.patch > (Is there a better way to propose amendments to a PR?!) > > I believe this probably also means that we don't need to scan object monitor lists during thread-scans, and let SATB > (or whatever concurrent marking) take care of it. @fisk WDYT? Absolutely. 
GC should no longer have to know anything about ObjectMonitors, only the automatically plugged in OopStorage that it processes. GC should not walk any monitor lists at all. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From goetz at openjdk.java.net Mon Sep 14 14:54:05 2020 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Mon, 14 Sep 2020 14:54:05 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 12:58:35 GMT, Aleksey Shipilev wrote: > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. Hi, this fix works in our CI. Looks good. Best regards, Goetz. ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From mdoerr at openjdk.java.net Mon Sep 14 15:13:23 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 14 Sep 2020 15:13:23 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 12:58:35 GMT, Aleksey Shipilev wrote: > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. Marked as reviewed by mdoerr (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From goetz at openjdk.java.net Mon Sep 14 15:44:41 2020 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Mon, 14 Sep 2020 15:44:41 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 12:58:35 GMT, Aleksey Shipilev wrote: > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. 
After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. LGTM, Goetz. ------------- Marked as reviewed by goetz (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/150 From kim.barrett at oracle.com Mon Sep 14 16:05:15 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 14 Sep 2020 12:05:15 -0400 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: > On Sep 14, 2020, at 9:05 AM, Aleksey Shipilev wrote: > > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. > > ------------- > > Commit messages: > - No need to cast to short > - Indenting > - Minor cleanup > - Another try, use u2 consistently > - 8253089: Windows (MSVC 2017) build fails after JDK-8243208 > > Changes: https://git.openjdk.java.net/jdk/pull/150/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=150&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253089 > Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod > Patch: https://git.openjdk.java.net/jdk/pull/150.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/150/head:pull/150 > > PR: https://git.openjdk.java.net/jdk/pull/150 I suspect the compiler warning is an example of https://developercommunity.visualstudio.com/content/problem/211134/unsigned-integer-overflows-in-constexpr-functionsa.html This would explain why using u2 would dodge the warning. The currently overflowing arithmetic operation would be on the promoted u2 values, so won't overflow. I think changing to using u2 isn't the right solution though. This throws away a lot of bits in the hash-code calculation, potentially making it less effective. 
I think a better workaround is to locally suppress the warning, which I think is from the multiplication here: 61 h = 31*h + (unsigned int) *s; So I suggest wrapping that line with targeted warning suppression, i.e. using PRAGMA_DIAG_PUSH/POP and PRAGMA_DISABLE_MSVC_WARNING(4307) with an appropriate comment. I don't love cluttering code with that kind of thing, but I think in this case that's a better solution than changing what's being calculated. Note that it might be that such a limited warning disable is insufficient in the long run. This seems like a problem that might arise in other places as we make more use of constexpr. We might instead need to globally disable that warning for some limited range of VS versions. From dcubed at openjdk.java.net Mon Sep 14 16:19:48 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 16:19:48 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> Message-ID: On Mon, 14 Sep 2020 14:27:51 GMT, Roman Kennke wrote: >>> Shenandoah changes are not complete: >>> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual >>> void ShenandoahFinalMarkingTask::work(uint)': >>> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is >>> not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ >>> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is >>> not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl); >>> ^~~~~~~ >>> >>> I will have a 
look into how to resolve this. >>> >>> In order to fix this, I need one of two things: >>> >>> * A way to iterate oops of the global-list. Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors >>> during final-mark because they may be the only remaining reference to an object on the heap. >>> _or_ (even better) >>> >>> * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This >>> way we could mark that object concurrently. >>> >>> >>> Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be >>> scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor. >>> See: https://bugs.openjdk.java.net/browse/JDK-8251451 >>> >>> AFAICT, this may affect ZGC too (not 100% sure about this). >> >> So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last >> reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference >> processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. In other >> words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. The GC should >> not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically. >> Hope this makes sense. 
> >> > Shenandoah changes are not complete: >> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual >> > void ShenandoahFinalMarkingTask::work(uint)': >> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is >> > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ >> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is >> > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl); >> > ^~~~~~~ >> > I will have a look into how to resolve this. >> > In order to fix this, I need one of two things: >> > ``` >> > * A way to iterate oops of the global-list. Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors >> > during final-mark because they may be the only remaining reference to an object on the heap. >> > _or_ (even better) >> > >> > * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This >> > way we could mark that object concurrently. >> > ``` >> > >> > >> > Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be >> > scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor. See: >> > https://bugs.openjdk.java.net/browse/JDK-8251451 AFAICT, this may affect ZGC too (not 100% sure about this). >> >> So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last >> reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference >> processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. 
In other >> words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. The GC should >> not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically. >> Hope this makes sense. > > Yes, definitely. I came to the same conclusion. Thank you! > With the patch amended: > http://cr.openjdk.java.net/~rkennke/8247281-shenandoah.patch > > I'm good with the change. @rkennke - I've commited the changes in your webrev as https://github.com/openjdk/jdk/pull/135/commits/bbf8dbd09bdf5c1c77c67cc637fbc10fe72d4894. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 16:19:47 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 16:19:47 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v3] In-Reply-To: References: Message-ID: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: rkennke CR - changes from Roman for Shenandoah. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/135/files - new: https://git.openjdk.java.net/jdk/pull/135/files/750fe771..bbf8dbd0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=01-02 Stats: 7 lines in 1 file changed: 0 ins; 7 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 16:50:05 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 16:50:05 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> Message-ID: On Mon, 14 Sep 2020 16:16:53 GMT, Daniel D. Daugherty wrote: >>> > Shenandoah changes are not complete: >>> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual >>> > void ShenandoahFinalMarkingTask::work(uint)': >>> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is >>> > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~ >>> > /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is >>> > not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl); >>> > ^~~~~~~ >>> > I will have a look into how to resolve this. >>> > In order to fix this, I need one of two things: >>> > ``` >>> > * A way to iterate oops of the global-list. 
Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors >>> > during final-mark because they may be the only remaining reference to an object on the heap. >>> > _or_ (even better) >>> > >>> > * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This >>> > way we could mark that object concurrently. >>> > ``` >>> > >>> > >>> > Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be >>> > scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor. See: >>> > https://bugs.openjdk.java.net/browse/JDK-8251451 AFAICT, this may affect ZGC too (not 100% sure about this). >>> >>> So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last >>> reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference >>> processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. In other >>> words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. The GC should >>> not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically. >>> Hope this makes sense. >> >> Yes, definitely. I came to the same conclusion. Thank you! >> With the patch amended: >> http://cr.openjdk.java.net/~rkennke/8247281-shenandoah.patch >> >> I'm good with the change. > > @rkennke - I've commited the changes in your webrev as > https://github.com/openjdk/jdk/pull/135/commits/bbf8dbd09bdf5c1c77c67cc637fbc10fe72d4894. @dholmes-ora and @fisk - I've taken a first pass at creating a CSR: JDK-8253121 migrate ObjectMonitor::_object to OopStorage https://bugs.openjdk.java.net/browse/JDK-8253121 Please look it over and feel free to edit as needed. 
Since I don't do CSR's often, what I've done might be all wrong. :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 17:13:08 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 17:13:08 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v3] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: <1rj4zO-L65NEG1ZyUdi3YyJR3A6AOTeb5cBsmVOiJ4E=.40e51e90-4008-46ea-a451-526144312035@github.com> On Mon, 14 Sep 2020 01:52:04 GMT, David Holmes wrote: >> Thanks for confirmation. > > I don't see anything in the HPROF format description that claims this is a strong root. At a minimum this seems to be a > behavioural change that would warrant a CSR request. This also seems to be something that the serviceability folk > should be made aware of and have a chance to comment on. I've taken a first pass at creating a CSR: JDK-8253121 migrate ObjectMonitor::_object to OopStorage https://bugs.openjdk.java.net/browse/JDK-8253121 ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 17:13:08 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 17:13:08 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v3] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Mon, 14 Sep 2020 02:00:05 GMT, David Holmes wrote: >> Thanks for confirmation. > > From the spec I'm not clear on exactly what JVMTI_HEAP_REFERENCE_MONITOR is intended to be. Serviceability folk should > be giving some input here though. 
I've taken a first pass at creating a CSR: JDK-8253121 migrate ObjectMonitor::_object to OopStorage https://bugs.openjdk.java.net/browse/JDK-8253121 ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From iklam at openjdk.java.net Mon Sep 14 17:19:59 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Sep 2020 17:19:59 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 15:42:19 GMT, Goetz Lindenmaier wrote: >> It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at >> fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for >> the hash code in `JVMFlagLookup::_hashes`. > > LGTM, Goetz. I agree with Kim that it's better to disable the warning. ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From shade at redhat.com Mon Sep 14 17:27:36 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 14 Sep 2020 19:27:36 +0200 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On 9/14/20 6:05 PM, Kim Barrett wrote: > I think changing to using u2 isn't the right solution though. This throws > away a lot of bits in the hash-code calculation, potentially making it less > effective. I think this accuracy concern does not apply here, because it does not actually change the computation. This is because polynomial hash codes enjoy the modulo distributivity, computation uses unsigned (non-negative) values, and the fact that the result is finally stored in u2, cutting out whatever upper bits hash code had accumulated. 
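A minimal sketch of that claim, assuming `u2` is a 16-bit unsigned type (modelled here as `uint16_t`) and `unsigned int` is 32 bits (modelled as `uint32_t`). The function names are hypothetical and this is not the linked mod-hashcode.cpp nor the actual JVMFlagLookup code; it only shows that truncating to 16 bits after every step gives the same value as truncating once at the end.

```cpp
#include <cassert>
#include <cstdint>

// Polynomial hash accumulated in 32 bits and truncated to 16 bits at the end.
// Unsigned wraparound in the 32-bit accumulator is well defined.
inline uint16_t hash_wide(const char* s) {
  uint32_t h = 0;
  while (*s != 0) {
    h = 31u * h + static_cast<unsigned char>(*s);
    s++;
  }
  return static_cast<uint16_t>(h);
}

// Same polynomial hash, truncated to 16 bits after every step.
inline uint16_t hash_narrow(const char* s) {
  uint16_t h = 0;
  while (*s != 0) {
    h = static_cast<uint16_t>(31u * h + static_cast<unsigned char>(*s));
    s++;
  }
  return h;
}

// Self-check over a few sample flag-like names (arbitrary examples).
inline void self_check() {
  assert(hash_wide("") == hash_narrow(""));
  assert(hash_wide("UseCompressedOops") == hash_narrow("UseCompressedOops"));
  assert(hash_wide("UseRDPCForConstantTableBase") ==
         hash_narrow("UseRDPCForConstantTableBase"));
}
```

The two variants agree because 2^16 divides 2^32, so reducing modulo 2^32 along the way does not change the final result modulo 2^16.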
A bit more rigorously:

M = 2^(sizeof(u2)*8) = 2^16
K = 2^(sizeof(unsigned int)*8) = 2^32 // assume int is 32 bit

L1: note M and K are powers of two and M divides K (K = M^2), so (x % K) % M = (x % M) here
(Translation: keeping the low 32 bits, and then the low 16 bits of that, is equivalent to keeping the low 16 bits right away)

new_hashcode = sum_i( (31^k * s[i]) % M) % M // k = n-1-i for a string of length n
old_hashcode = sum_i( (31^k * s[i]) % K) % M
             = sum_i(((31^k * s[i]) % K) % M) % M // modulo distributivity
             = sum_i( (31^k * s[i]) % M) % M // by L1
             = new_hashcode

Or try this ;) https://cr.openjdk.java.net/~shade/8253089/mod-hashcode.cpp

> I think a better workaround is to locally suppress the warning,
> which I think is from the multiplication here:
>
> 61 h = 31*h + (unsigned int) *s;
>
> So I suggest wrapping that line with targeted warning suppression, i.e.
> using PRAGMA_DIAG_PUSH/POP and PRAGMA_DISABLE_MSVC_WARNING(4307) with an
> appropriate comment. I don't love cluttering code with that kind of thing,
> but I think in this case that's a better solution than changing what's
> being calculated.

I think math thankfully saves us from cluttering the code with compiler pragmas.

-- Thanks, -Aleksey

From shade at redhat.com Mon Sep 14 17:43:35 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 14 Sep 2020 19:43:35 +0200 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID:

On 9/14/20 7:27 PM, Aleksey Shipilev wrote:
> On 9/14/20 6:05 PM, Kim Barrett wrote:
>> I think changing to using u2 isn't the right solution though. This throws
>> away a lot of bits in the hash-code calculation, potentially making it less
>> effective.
>
> I think this accuracy concern does not apply here, because it does not actually change the computation.
>
> [...]
>
> Or try this ;)
> https://cr.openjdk.java.net/~shade/8253089/mod-hashcode.cpp

Even more: jvmFlagLookup.o files before and after are bit-to-bit identical.
(I objdumped to look at .rodata section to see _flag_lookup_table is there). -- Thanks, -Aleksey From dcubed at openjdk.java.net Mon Sep 14 18:39:03 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:39:03 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> Message-ID: On Mon, 14 Sep 2020 18:33:17 GMT, Daniel D. Daugherty wrote: >> @dholmes-ora and @fisk - I've taken a first pass at creating a CSR: >> JDK-8253121 migrate ObjectMonitor::_object to OopStorage >> https://bugs.openjdk.java.net/browse/JDK-8253121 >> >> Please look it over and feel free to edit as needed. Since I don't do >> CSR's often, what I've done might be all wrong. :-) > > @kimbarrett: > >> src/hotspot/share/oops/weakHandle.cpp >> 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : >> 37 _obj(storage->allocate()) { >> 38 assert(obj != NULL, "no need to create weak null oop"); >> >> Please format this differently so the ctor-init-list is more easily >> distinguished from the body. I don't care that much which of the several >> alternatives is used. > > After discussion with Erik, I changed the indent on L37 from two space to four spaces. @kimbarrett > src/hotspot/share/runtime/objectMonitor.cpp > 244 // Check that object() and set_object() are called from the right context: > 245 static void check_object_context() { > > This seems like checking we would normally only do in a debug build. Is this > really supposed to be done in product builds too? (It's written to support > that, just wondering if that's really what we want.) 
Maybe these aren't > called very often so it doesn't matter? I also see that guarantee (rather > than assert) is used a fair amount in this and related code. I've changed check_object_context() to only be defined and called when ASSERT is defined. I've also changed the guarantee() calls to assert() calls. I've done a couple of Mach5 Tier[1-8] test cycles on this code so I'm no longer worried about this code or its callers in release bits. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:39:02 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:39:02 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> Message-ID: <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> On Mon, 14 Sep 2020 16:47:27 GMT, Daniel D. Daugherty wrote: >> @rkennke - I've commited the changes in your webrev as >> https://github.com/openjdk/jdk/pull/135/commits/bbf8dbd09bdf5c1c77c67cc637fbc10fe72d4894. > > @dholmes-ora and @fisk - I've taken a first pass at creating a CSR: > JDK-8253121 migrate ObjectMonitor::_object to OopStorage > https://bugs.openjdk.java.net/browse/JDK-8253121 > > Please look it over and feel free to edit as needed. Since I don't do > CSR's often, what I've done might be all wrong. :-) @kimbarrett: > src/hotspot/share/oops/weakHandle.cpp > 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : > 37 _obj(storage->allocate()) { > 38 assert(obj != NULL, "no need to create weak null oop"); > > Please format this differently so the ctor-init-list is more easily > distinguished from the body. I don't care that much which of the several > alternatives is used. 
After discussion with Erik, I changed the indent on L37 from two space to four spaces. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:39:03 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:39:03 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> Message-ID: On Mon, 14 Sep 2020 18:35:29 GMT, Daniel D. Daugherty wrote: >> @kimbarrett: >> >>> src/hotspot/share/oops/weakHandle.cpp >>> 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : >>> 37 _obj(storage->allocate()) { >>> 38 assert(obj != NULL, "no need to create weak null oop"); >>> >>> Please format this differently so the ctor-init-list is more easily >>> distinguished from the body. I don't care that much which of the several >>> alternatives is used. >> >> After discussion with Erik, I changed the indent on L37 from two space to four spaces. > > @kimbarrett > >> src/hotspot/share/runtime/objectMonitor.cpp >> 244 // Check that object() and set_object() are called from the right context: >> 245 static void check_object_context() { >> >> This seems like checking we would normally only do in a debug build. Is this >> really supposed to be done in product builds too? (It's written to support >> that, just wondering if that's really what we want.) Maybe these aren't >> called very often so it doesn't matter? I also see that guarantee (rather >> than assert) is used a fair amount in this and related code. > > I've changed check_object_context() to only be defined and called > when ASSERT is defined. I've also changed the guarantee() calls > to assert() calls. 
> > I've done a couple of Mach5 Tier[1-8] test cycles on this code so I'm > no longer worried about this code or its callers in release bits. @kimbarrett > src/hotspot/share/runtime/objectMonitor.cpp > 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); > 250 if (self->is_Java_thread()) { > > Maybe instead > > if (self->is_Java_thread()) { > ... > } else { > guarantee(self->is_VM_thread(), "must be"); > } I've made this refactoring change, tweaked the comments above the block a bit and switched from guarantee() to assert(). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:42:33 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:42:33 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> Message-ID: On Mon, 14 Sep 2020 18:36:28 GMT, Daniel D. Daugherty wrote: >> @kimbarrett >> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> 244 // Check that object() and set_object() are called from the right context: >>> 245 static void check_object_context() { >>> >>> This seems like checking we would normally only do in a debug build. Is this >>> really supposed to be done in product builds too? (It's written to support >>> that, just wondering if that's really what we want.) Maybe these aren't >>> called very often so it doesn't matter? I also see that guarantee (rather >>> than assert) is used a fair amount in this and related code. >> >> I've changed check_object_context() to only be defined and called >> when ASSERT is defined. I've also changed the guarantee() calls >> to assert() calls. 
>> >> I've done a couple of Mach5 Tier[1-8] test cycles on this code so I'm >> no longer worried about this code or its callers in release bits. > > @kimbarrett > >> src/hotspot/share/runtime/objectMonitor.cpp >> 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); >> 250 if (self->is_Java_thread()) { >> >> Maybe instead >> >> if (self->is_Java_thread()) { >> ... >> } else { >> guarantee(self->is_VM_thread(), "must be"); >> } > > I've made this refactoring change, tweaked the comments above > the block a bit and switched from guarantee() to assert(). @kimbarrett > src/hotspot/share/runtime/synchronizer.cpp > 1548 guarantee(m->object_peek() == NULL, "invariant"); > > Later: But see previous comment. Some of this might be relevat later though. > > object_peek() seemed like the wrong operation here. I thought this was > attempting to verify that the underlying WeakHandle has been released. But > peeking doesn't ensure that. Oh, but we don't actually release() the > WeakHandle when we "free" an OM. We're just pushing the OM on the free > list. Which means the GC will continue to examine the associated OopStorage > entry (and discover it's NULL). So there's some cost to not releasing the > WeakHandle when putting an OM on the free list. Of course, it means there > won't be any allocations when taking off the free list; I'm not sure which > way is better. > > But this makes me go back and wonder about whether object_peek() should have > the _object.is_null() check. After creation it seems like that should never > be true. m->object_peek() == NULL is the right check at that location. om_release() is called when we are returning an ObjectMonitor to a free list. At that point, it should never be associated with an object. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:47:08 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:47:08 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> Message-ID: <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> On Mon, 14 Sep 2020 18:39:45 GMT, Daniel D. Daugherty wrote: >> @kimbarrett >> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be"); >>> 250 if (self->is_Java_thread()) { >>> >>> Maybe instead >>> >>> if (self->is_Java_thread()) { >>> ... >>> } else { >>> guarantee(self->is_VM_thread(), "must be"); >>> } >> >> I've made this refactoring change, tweaked the comments above >> the block a bit and switched from guarantee() to assert(). > > @kimbarrett > >> src/hotspot/share/runtime/synchronizer.cpp >> 1548 guarantee(m->object_peek() == NULL, "invariant"); >> >> Later: But see previous comment. Some of this might be relevat later though. >> >> object_peek() seemed like the wrong operation here. I thought this was >> attempting to verify that the underlying WeakHandle has been released. But >> peeking doesn't ensure that. Oh, but we don't actually release() the >> WeakHandle when we "free" an OM. We're just pushing the OM on the free >> list. Which means the GC will continue to examine the associated OopStorage >> entry (and discover it's NULL). So there's some cost to not releasing the >> WeakHandle when putting an OM on the free list. 
Of course, it means there >> won't be any allocations when taking off the free list; I'm not sure which >> way is better. >> >> But this makes me go back and wonder about whether object_peek() should have >> the _object.is_null() check. After creation it seems like that should never >> be true. > > m->object_peek() == NULL is the right check at that location. om_release() > is called when we are returning an ObjectMonitor to a free list. At that point, > it should never be associated with an object. @kimbarrett > src/hotspot/share/runtime/synchronizer.cpp > > The old code in chk_in_use_entry seems wrong. It checked for a null > object() and recorded that as an error. But then it went on and attempted > to use it as if it was not null. That's been fixed by the change. However, > the change no longer treats a null as an error. Probably this is because > it's weak, and so could have become null. But is that really possible for > an "in use" monitor? In the old code, an ObjectMonitor's object field should never be NULL when that ObjectMonitor is on an in-use list. We'll get the logging message and then a crash. I used to have guarantee(n->object() != NULL, ...) in there, but Robbin convinced me that was a waste because we'll just crash on the use of the NULL pointer and that was good enough. As Erik has already explained, now that we use a weak handle, the object can be GC'ed before the deflater thread comes along and removes the idle ObjectMonitor from the in-use list. 
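The point about weak references can be illustrated with a loose analogy in standard C++. This is only a sketch under stated assumptions: `std::weak_ptr` stands in for HotSpot's WeakHandle, and `Obj`, `Monitor`, `object_is_null` and `count_cleared` are hypothetical names that do not appear in the patch. It shows that an entry can still sit on an "in-use" list after the object it refers to has gone away, so code scanning that list has to tolerate a null referent instead of treating it as an error.

```cpp
#include <memory>
#include <vector>

struct Obj { int id; };

// Hypothetical stand-in for an ObjectMonitor that refers to its
// object weakly instead of keeping it alive.
struct Monitor {
  std::weak_ptr<Obj> object;

  // True once the referent has been reclaimed (or was never set).
  bool object_is_null() const { return object.expired(); }
};

// Count monitors on the in-use list whose object is already gone.
// With a strong reference this count would always be zero.
static int count_cleared(const std::vector<Monitor>& in_use) {
  int cleared = 0;
  for (const Monitor& m : in_use) {
    if (m.object_is_null()) {
      cleared++;
    }
  }
  return cleared;
}
```

In this analogy, dropping the last `std::shared_ptr<Obj>` plays the role of the GC clearing the weak handle, and a later pass over the list plays the role of the deflater thread.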
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:55:38 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:55:38 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> Message-ID: On Mon, 14 Sep 2020 18:44:30 GMT, Daniel D. Daugherty wrote: >> @kimbarrett >> >>> src/hotspot/share/runtime/synchronizer.cpp >>> 1548 guarantee(m->object_peek() == NULL, "invariant"); >>> >>> Later: But see previous comment. Some of this might be relevat later though. >>> >>> object_peek() seemed like the wrong operation here. I thought this was >>> attempting to verify that the underlying WeakHandle has been released. But >>> peeking doesn't ensure that. Oh, but we don't actually release() the >>> WeakHandle when we "free" an OM. We're just pushing the OM on the free >>> list. Which means the GC will continue to examine the associated OopStorage >>> entry (and discover it's NULL). So there's some cost to not releasing the >>> WeakHandle when putting an OM on the free list. Of course, it means there >>> won't be any allocations when taking off the free list; I'm not sure which >>> way is better. >>> >>> But this makes me go back and wonder about whether object_peek() should have >>> the _object.is_null() check. After creation it seems like that should never >>> be true. >> >> m->object_peek() == NULL is the right check at that location. 
om_release() >> is called when we are returning an ObjectMonitor to a free list. At that point, >> it should never be associated with an object. > > @kimbarrett > >> src/hotspot/share/runtime/synchronizer.cpp >> >> The old code in chk_in_use_entry seems wrong. It checked for a null >> object() and recorded that as an error. But then it went on and attempted >> to use it as if it was not null. That's been fixed by the change. However, >> the change no longer treats a null as an error. Probably this is because >> it's weak, and so could have become null. But is that really possible for >> an "in use" monitor? > > In the old code, an ObjectMonitor's object field should never be NULL when > that ObjectMonitor is on an in-use list. We'll get the logging message and > then a crash. I used to have guarantee(n->object() != NULL, ...) in there, > but Robbin convinced me that was a waste because we'll just crash on > the use of the NULL pointer and that was good enough. > > As Erik has already explained, now that we use a weak handle, the object > can be GC'ed before the deflater thread comes along and removes the > idle ObjectMonitor from the in-use list. @kimbarrett - I believe I've addressed your comments in this push: https://github.com/openjdk/jdk/pull/135/commits/9fa2bed109d6fe352c610b26ada650226ce9cec4 ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 18:55:38 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 18:55:38 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. 
Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: kimbarrett CR - made minor changes to address Kim's code review. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/135/files - new: https://git.openjdk.java.net/jdk/pull/135/files/bbf8dbd0..9fa2bed1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=02-03 Stats: 16 lines in 2 files changed: 11 ins; 3 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 19:02:17 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 19:02:17 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> Message-ID: On Mon, 14 Sep 2020 18:51:31 GMT, Daniel D. Daugherty wrote: >> @kimbarrett >> >>> src/hotspot/share/runtime/synchronizer.cpp >>> >>> The old code in chk_in_use_entry seems wrong. It checked for a null >>> object() and recorded that as an error. But then it went on and attempted >>> to use it as if it was not null. 
That's been fixed by the change. However, >>> the change no longer treats a null as an error. Probably this is because >>> it's weak, and so could have become null. But is that really possible for >>> an "in use" monitor? >> >> In the old code, an ObjectMonitor's object field should never be NULL when >> that ObjectMonitor is on an in-use list. We'll get the logging message and >> then a crash. I used to have guarantee(n->object() != NULL, ...) in there, >> but Robbin convinced me that was a waste because we'll just crash on >> the use of the NULL pointer and that was good enough. >> >> As Erik has already explained, now that we use a weak handle, the object >> can be GC'ed before the deflater thread comes along and removes the >> idle ObjectMonitor from the in-use list. > > @kimbarrett - I believe I've addressed your comments in this push: > https://github.com/openjdk/jdk/pull/135/commits/9fa2bed109d6fe352c610b26ada650226ce9cec4 @coleenp, @rkennke, and @kimbarrett - I believe all of the changes that you have requested have been made. Please confirm by re-reviewing. @dholmes-ora - I don't think you asked for any specific code changes. I've taken a first pass at creating a CSR: JDK-8253121 migrate ObjectMonitor::_object to OopStorage https://bugs.openjdk.java.net/browse/JDK-8253121 Please look it over and feel free to edit as needed. Since I don't do CSR's often, what I've done might be all wrong. 
:-) ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From kim.barrett at oracle.com Mon Sep 14 19:29:47 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 14 Sep 2020 15:29:47 -0400 Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> Message-ID: > On Sep 14, 2020, at 2:39 PM, Daniel D.Daugherty wrote: > > On Mon, 14 Sep 2020 18:33:17 GMT, Daniel D. Daugherty wrote: > >> @kimbarrett: >> >>> src/hotspot/share/oops/weakHandle.cpp >>> 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) : >>> 37 _obj(storage->allocate()) { >>> 38 assert(obj != NULL, "no need to create weak null oop"); >>> >>> Please format this differently so the ctor-init-list is more easily >>> distinguished from the body. I don't care that much which of the several >>> alternatives is used. >> >> After discussion with Erik, I changed the indent on L37 from two space to four spaces. > > @kimbarrett Thanks. That's one of the "several alternatives" I was alluding to. From rkennke at openjdk.java.net Mon Sep 14 20:07:53 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 20:07:53 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> Message-ID: On Mon, 14 Sep 2020 18:59:06 GMT, Daniel D.
Daugherty wrote: > @coleenp, @rkennke, and @kimbarrett - I believe all of the changes that you have requested > have been made. Please confirm by re-reviewing. It looks good to me now! I've also built with Shenandoah GC and run our sanity tests (TEST=hotspot_gc_shenandoah), and all looks good. Thank you! ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Mon Sep 14 20:18:38 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 20:18:38 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 18:55:38 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > kimbarrett CR - made minor changes to address Kim's code review. src/hotspot/share/gc/shared/space.inline.hpp line 176: > 174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || > 175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), > 176: "these are the only valid states during a mark sweep"); Is this change related?
Also, when moving the assert into the else block it will become always-true because of space->scanned_block_is_obj(cur_obj), or am I missing something? ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 20:24:12 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 20:24:12 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 20:10:50 GMT, Roman Kennke wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> kimbarrett CR - made minor changes to address Kim's code review. > > src/hotspot/share/gc/shared/space.inline.hpp line 176: > >> 174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || >> 175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), >> 176: "these are the only valid states during a mark sweep"); > > Is this change related? Also, when moving the assert into the else block it will become always-true because of > space->scanned_block_is_obj(cur_obj), or am I missing something? See this comment from Coleen and the replies: https://github.com/openjdk/jdk/pull/135#discussion_r487300636 Please let me know if that resolved this comment for you. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 20:24:11 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 20:24:11 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2] In-Reply-To: References: <3Br1W5wlBiJiBpIMA8Pm7HaLB0nFBNAUyoGYoJb4lc0=.20b26eea-5157-45c2-9a43-76ed9b554514@github.com> <5ktowf9ho39sRCq22E56krjEeTmOLzagSyR9UULnJNs=.acf484c6-c6d2-440b-acc4-52e12750c52e@github.com> <74w2VsorZYr6qJxvXN5yM1o6Dw1J0bzeEihOmV_eHCs=.dd020fbd-1f1a-469c-b9d7-dbc313770d66@github.com> <6C-kpuZszd3T_WGL0HbPq70g9QiqEzs2fQ2v71FC2T4=.64508cfd-0baf-48e7-a829-b2ddb7c52c66@github.com> Message-ID: <9hEXk57qPOk4VvO3fuTowg2Z_Xuw3WCnnW8xaZvLpFU=.50a740a7-fd65-42ca-ba06-ba1e76922ba1@github.com> On Mon, 14 Sep 2020 20:05:06 GMT, Roman Kennke wrote: >> @coleenp, @rkennke, and @kimbarrett - I believe all of the changes that you have requested >> have been made. Please confirm by re-reviewing. >> >> @dholmes-ora - I don't think you asked for any specific code changes. I've taken a first pass >> at creating a CSR: >> JDK-8253121 migrate ObjectMonitor::_object to OopStorage >> https://bugs.openjdk.java.net/browse/JDK-8253121 >> >> Please look it over and feel free to edit as needed. Since I don't do >> CSR's often, what I've done might be all wrong. :-) > >> @coleenp, @rkennke, and @kimbarrett - I believe all of the changes that you have requested >> have been made. Please confirm by re-reviewing. > > It looks good to me now! I've also build with Shenandoah GC, and run our sanity-tests (TEST=hotspot_gc_shenandoah) and > all looks good. Thank you! @rkennke - Thanks for confirming that Shenandoah now builds with the changes that you provided. Thanks for sending those. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Mon Sep 14 20:28:30 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 20:28:30 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 20:21:23 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/gc/shared/space.inline.hpp line 176: >> >>> 174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || >>> 175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), >>> 176: "these are the only valid states during a mark sweep"); >> >> Is this change related? Also, when moving the assert into the else block it will become always-true because of >> space->scanned_block_is_obj(cur_obj), or am I missing something? > > See this comment from Coleen and the replies: > https://github.com/openjdk/jdk/pull/135#discussion_r487300636 > > Please let me know if that resolved this comment for you. Sorry, no. Maybe it's too late here and I shall think about it tomorrow morning instead ;-) Or maybe you can explain it again in the context of that change. How's the assert even relevant when moved in the else-branch? ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 20:36:42 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 20:36:42 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 20:25:39 GMT, Roman Kennke wrote: >> See this comment from Coleen and the replies: >> https://github.com/openjdk/jdk/pull/135#discussion_r487300636 >> >> Please let me know if that resolved this comment for you. > > Sorry, no. Maybe it's too late here and I shall think about it tomorrow morning instead ;-) Or maybe you can explain it > again in the context of that change. 
How's the assert even relevant when moved in the else-branch? Sorry, I confused myself switching between this review and a preliminary review thread. Here's the original code: 165 while (cur_obj < scan_limit) { 166 assert(!space->scanned_block_is_obj(cur_obj) || 167 oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || 168 oop(cur_obj)->mark_raw().has_bias_pattern(), 169 "these are the only valid states during a mark sweep"); 170 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { and here's the code after it was moved and rewritten: 173 } else { 174 assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || 175 oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), 176 "these are the only valid states during a mark sweep"); The assert() worked fine in its original location while the ObjectMonitor held a regular oop, but after that was changed into a weak handle, that location became exposed to the fact that the object reference could be GC'ed. The original code assumes that the ObjectMonitor oop reference is stable and unchanging, and that's no longer the case with the weak handle. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 20:51:06 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 20:51:06 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> On Mon, 14 Sep 2020 20:34:13 GMT, Daniel D. Daugherty wrote: >> Sorry, no. Maybe it's too late here and I shall think about it tomorrow morning instead ;-) Or maybe you can explain it >> again in the context of that change. How's the assert even relevant when moved in the else-branch? > > Sorry, I confused myself switching between this review and a > preliminary review thread.
> > Here's the original code: > > 165 while (cur_obj < scan_limit) { > 166 assert(!space->scanned_block_is_obj(cur_obj) || > 167 oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || > 168 oop(cur_obj)->mark_raw().has_bias_pattern(), > 169 "these are the only valid states during a mark sweep"); > 170 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { > > and here's the code after it was moved and rewritten: > > 173 } else { > 174 assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || > 175 oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), > 176 "these are the only valid states during a mark sweep"); > > > Where the assert() was worked fine when the ObjectMonitor had a regular oop, > but after it was changed into a weak handle, that location became exposed to > the fact that the object reference could be GC'ed. The original code assumes > that the ObjectMonitor oop reference is stable and unchanging and that's no > longer the case with the weak handle. I found my original preliminary code review comment and Erik's reply: > src/hotspot/share/gc/shared/space.inline.hpp > old L166: assert(!space->scanned_block_is_obj(cur_obj) || > old L167: oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || > old L168: oop(cur_obj)->mark_raw().has_bias_pattern(), > old L169: "these are the only valid states during a mark sweep"); > This assert was before the if-statement at the top of the while-loop. 
> > new L174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || > new L175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), > new L176: "these are the only valid states during a mark sweep"); > The assert is now in the else branch of the following if-statement: > > L166 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { > > The new assert() drops this check: > > oop(cur_obj)->mark_raw().is_marked() > > and adds this check: > > oop(cur_obj)->mark_raw().has_monitor() > > Dropping the "is_marked()" makes sense since the new location > of the assert is in the else branch of "oop(cur_obj)->is_gc_marked()". > The addition of the "has_monitor()" check is puzzling. Why was this > added and why wasn't it needed in the old assert()? > > In fact, I'm not sure why this change is here at all. This is an artifact of the monitor now being weak. Since there was previously always a strong root to all inflated monitors, there were never any dead objects in the heap, that still had pointers in the mark word to the monitor. The change to weak now implies that we suddenly have dead objects in the heap, that in the markWord point out their monitor. GC code that iterates through consecutive objects one by one, will see these now dead objects with monitors. The assert changes reflect that. Before it was unexpected and would assert on that. Now I moved the assertion to the case when the object is alive instead. We have no business asserting what should be in the markWord of dead objects. I hope it makes sense now! 
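The reasoning above can be sketched with a deliberately simplified model. Everything in it is hypothetical (the `Mark` enum, `old_check`, `count_old_check_failures`); these are not the HotSpot types. It assumes that a live object has had its mark word replaced by the marked pattern during the mark phase, so only a dead object can still show a monitor pointer when the space is walked:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the mark word states named in the assert.
enum class Mark { Unlocked, Biased, Marked, HasMonitor };

// The predicate of the original assert, which was applied to every
// scanned object at the top of the loop.
bool old_check(Mark m) {
  return m == Mark::Marked || m == Mark::Unlocked || m == Mark::Biased;
}

// Walks the objects one by one, the way the space iteration does, and
// counts how many of them would have tripped the original assert.
int count_old_check_failures(const std::vector<Mark>& heap) {
  int failures = 0;
  for (Mark m : heap) {
    if (!old_check(m)) {
      failures++;
    }
  }
  return failures;
}
```

With a strong root, every object that owns a monitor is live and therefore shows up as `Marked`, so the walk never sees `HasMonitor`. With a weak reference, a dead object can be left behind with `HasMonitor` in its mark word, and the old predicate fails on it.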
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From coleenp at openjdk.java.net Mon Sep 14 20:51:06 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 14 Sep 2020 20:51:06 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> Message-ID: On Mon, 14 Sep 2020 20:45:29 GMT, Daniel D. Daugherty wrote: >> Sorry, I confused myself switching between this review and a >> preliminary review thread. >> >> Here's the original code: >> >> 165 while (cur_obj < scan_limit) { >> 166 assert(!space->scanned_block_is_obj(cur_obj) || >> 167 oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || >> 168 oop(cur_obj)->mark_raw().has_bias_pattern(), >> 169 "these are the only valid states during a mark sweep"); >> 170 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { >> >> and here's the code after it was moved and rewritten: >> >> 173 } else { >> 174 assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || >> 175 oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), >> 176 "these are the only valid states during a mark sweep"); >> >> >> Where the assert() was worked fine when the ObjectMonitor had a regular oop, >> but after it was changed into a weak handle, that location became exposed to >> the fact that the object reference could be GC'ed. The original code assumes >> that the ObjectMonitor oop reference is stable and unchanging and that's no >> longer the case with the weak handle. 
> > I found my original preliminary code review comment and Erik's reply: > >> src/hotspot/share/gc/shared/space.inline.hpp >> old L166: assert(!space->scanned_block_is_obj(cur_obj) || >> old L167: oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || >> old L168: oop(cur_obj)->mark_raw().has_bias_pattern(), >> old L169: "these are the only valid states during a mark sweep"); >> This assert was before the if-statement at the top of the while-loop. >> >> new L174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || >> new L175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), >> new L176: "these are the only valid states during a mark sweep"); >> The assert is now in the else branch of the following if-statement: >> >> L166 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { >> >> The new assert() drops this check: >> >> oop(cur_obj)->mark_raw().is_marked() >> >> and adds this check: >> >> oop(cur_obj)->mark_raw().has_monitor() >> >> Dropping the "is_marked()" makes sense since the new location >> of the assert is in the else branch of "oop(cur_obj)->is_gc_marked()". >> The addition of the "has_monitor()" check is puzzling. Why was this >> added and why wasn't it needed in the old assert()? >> >> In fact, I'm not sure why this change is here at all. > > This is an artifact of the monitor now being weak. Since there was previously always a strong root > to all inflated monitors, there were never any dead objects in the heap, that still had pointers in the > mark word to the monitor. The change to weak now implies that we suddenly have dead objects > in the heap, that in the markWord point out their monitor. GC code that iterates through consecutive > objects one by one, will see these now dead objects with monitors. The assert changes reflect that. > Before it was unexpected and would assert on that. 
Now I moved the assertion to the case when the > object is alive instead. We have no business asserting what should be in the markWord of dead objects. > > I hope it makes sense now! This code seems like something that doesn't belong here anymore. This code assumed synchronous scanning of oops in ObjectMonitor and scanning memory regions, and that's no longer the case with OopStorage. I think this assert should be removed. It exports some implementation detail of now completely unrelated code in order to do a very specific check. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 20:54:14 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 20:54:14 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> Message-ID: <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> On Mon, 14 Sep 2020 20:48:10 GMT, Coleen Phillimore wrote: >> I found my original preliminary code review comment and Erik's reply: >> >>> src/hotspot/share/gc/shared/space.inline.hpp >>> old L166: assert(!space->scanned_block_is_obj(cur_obj) || >>> old L167: oop(cur_obj)->mark_raw().is_marked() || oop(cur_obj)->mark_raw().is_unlocked() || >>> old L168: oop(cur_obj)->mark_raw().has_bias_pattern(), >>> old L169: "these are the only valid states during a mark sweep"); >>> This assert was before the if-statement at the top of the while-loop. 
>>> >>> new L174: assert(!space->scanned_block_is_obj(cur_obj) || oop(cur_obj)->mark_raw().is_unlocked() || >>> new L175: oop(cur_obj)->mark_raw().has_bias_pattern() || oop(cur_obj)->mark_raw().has_monitor(), >>> new L176: "these are the only valid states during a mark sweep"); >>> The assert is now in the else branch of the following if-statement: >>> >>> L166 if (space->scanned_block_is_obj(cur_obj) && oop(cur_obj)->is_gc_marked()) { >>> >>> The new assert() drops this check: >>> >>> oop(cur_obj)->mark_raw().is_marked() >>> >>> and adds this check: >>> >>> oop(cur_obj)->mark_raw().has_monitor() >>> >>> Dropping the "is_marked()" makes sense since the new location >>> of the assert is in the else branch of "oop(cur_obj)->is_gc_marked()". >>> The addition of the "has_monitor()" check is puzzling. Why was this >>> added and why wasn't it needed in the old assert()? >>> >>> In fact, I'm not sure why this change is here at all. >> >> This is an artifact of the monitor now being weak. Since there was previously always a strong root >> to all inflated monitors, there were never any dead objects in the heap, that still had pointers in the >> mark word to the monitor. The change to weak now implies that we suddenly have dead objects >> in the heap, that in the markWord point out their monitor. GC code that iterates through consecutive >> objects one by one, will see these now dead objects with monitors. The assert changes reflect that. >> Before it was unexpected and would assert on that. Now I moved the assertion to the case when the >> object is alive instead. We have no business asserting what should be in the markWord of dead objects. >> >> I hope it makes sense now! > > This code seems like something that doesn't belong here anymore. This code assumed synchronous scanning of oops in > ObjectMonitor and scanning memory regions, and that's no longer the case with OopStorage. I think this assert should be > removed. 
It exports some implementation detail of now completely unrelated code in order to do a very specific check. @fisk - please chime in here... ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Mon Sep 14 20:54:14 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Sep 2020 20:54:14 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> Message-ID: On Mon, 14 Sep 2020 20:51:34 GMT, Daniel D. Daugherty wrote: >> This code seems like something that doesn't belong here anymore. This code assumed synchronous scanning of oops in >> ObjectMonitor and scanning memory regions, and that's no longer the case with OopStorage. I think this assert should be >> removed. It exports some implementation detail of now completely unrelated code in order to do a very specific check. > > @fisk - please chime in here... I agree. Also, the assert becomes true somewhat obviously because of its first clause, which is already guaranteed because of its placement in the surrounding else-branch (unless something really weird happens). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From leonid.kuskov at oracle.com Mon Sep 14 21:00:14 2020 From: leonid.kuskov at oracle.com (leonid.kuskov at oracle.com) Date: Mon, 14 Sep 2020 14:00:14 -0700 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> Message-ID: Hi Martin, This issue looks rather like a test problem. I've filed the ticket JCK-7314897.
Best Regards, Leonid On 9/7/20 3:23 AM, Doerr, Martin wrote: > Hi Leonid, > > the errors were observed in many more vm/jvmti/Get... tests like the following ones: > > vm/jvmti/GetAllThreads/gath001/gath00101/gath00101.html > vm/jvmti/GetAvailableProcessors/gaps001/gaps00101/gaps00101.html > vm/jvmti/GetClassModifiers/gcmo001/gcmo00102/gcmo00102.html > vm/jvmti/GetClassMethods/gcmt001/gcmt00102/gcmt00102.html > vm/jvmti/GetBytecodes/gbyc001/gbyc00102/gbyc00102.html > vm/jvmti/GetCapabilities/gcap001/gcap00101/gcap00101.html > vm/jvmti/GetClassLoader/gclo001/gclo00101/gclo00101.html > > We run them with fastdbg builds every night and we have seen the errors almost every day. > > Best regards, > Martin > > >> -----Original Message----- >> From: David Holmes >> Sent: Dienstag, 1. September 2020 07:07 >> To: leonid.kuskov at oracle.com; Doerr, Martin ; >> serviceability-dev at openjdk.java.net; hotspot-runtime- >> dev at openjdk.java.net >> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build >> >> Hi Leonid, >> >> On 1/09/2020 10:42 am, leonid.kuskov at oracle.com wrote: >>> Hi, >>> >>> It's a known issue that was reported by Arno Zeller >>> (arno.zeller at sap.com) in the middle of June. The test >>> jvmti/GetAllStackTraces/gast001/gast00105/gast00105.html failed with the >>> same stack trace despite the fix ( JCK-7022500 lprintf in >>> jvmti/support.c is not MT-Safe) Please file a JCK's issue with details >>> to reproduce the failure. >> Interesting. The fix is supposed to make things thread-safe by using a >> RawMonitor to ensure only one thread can use lprintf at a time. I missed >> that in my initial analysis. But something is going wrong. >> >> Thanks, >> David >> >>> Thanks, >>> Leonid >>> >>> On 8/31/20 3:37 PM, David Holmes wrote: >>> >>>> On 1/09/2020 3:00 am, Doerr, Martin wrote: >>>>> Hi David, >>>>> >>>>> thanks for analyzing it. We need to exclude the test for now. >>>> Can you file a JCK bug? 
I can file one on our internal JCK Jira but >>>> I'm not sure what the right process is in this case. >>>> >>>> Thanks, >>>> David >>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Montag, 31. August 2020 04:34 >>>>>> To: Doerr, Martin ; serviceability- >>>>>> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug >>>>>> build >>>>>> >>>>>> Hi Martin, >>>>>> >>>>>> On 29/08/2020 3:53 am, Doerr, Martin wrote: >>>>>>> Hi, >>>>>>> >>>>>>> we have seen the following fatal error more than 50 times since >>>>>>> 2020-05-25 in various JCK tests vm/jvmti. >>>>>>> >>>>>>> fatal error: String conversion failure: [check] ExitLock destroyed >>>>>>> >>>>>>> --> [check] ExitLock exited >>>>>>> >>>>>>> (followed by garbage output) >>>>>>> >>>>>>> 8166358: Re-enable String verification in >>>>>>> java_lang_String::create_from_str() >>>>>>> >>>>>>> was pushed at that date which introduced the call to fatal. >>>>>>> >>>>>>> Stack (example from linuxppc64le, but also observed on x86 and >>>>>>> aarch64): >>>>>>> V [libjvm.so+0xee242c] java_lang_String::create_from_str(char >> const*, >>>>>>> Thread*) [clone .part.158]+0x51c >>>>>>> V [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char >>>>>>> const*, Thread*)+0x40 >>>>>>> V [libjvm.so+0x1026a30] jni_NewStringUTF+0x1e0 >>>>>>> C [libjckjvmti.so+0x3ce4c] logWrite+0x5c >>>>>>> C [libjckjvmti.so+0x3cd20] lprintf+0x170 >>>>>>> C [libjckjvmti.so+0x485b8] gast00104_agent_proc+0x254 >>>>>>> V [libjvm.so+0x1218f0c] >> JvmtiAgentThread::call_start_function()+0x24c >>>>>>> V [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c >>>>>>> V [libjvm.so+0x19418a0] Thread::call_run()+0x160 >>>>>>> V [libjvm.so+0x15c9d0c] thread_native_entry(Thread*)+0x18c >>>>>>> C [libpthread.so.0+0x9b48]
start_thread+0x108 >>>>>>> >>>>>>> (Problem could have been there before but without this fatal >> message.) >>>>>>> The messages are generated by: >>>>>>> >>>>>>> tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c >>>>>>> >>>>>>> This looks like a race condition. The message changes while the VM >>>>>>> creates a String object from it. Has anybody seen this before? >>>>>> No but ... >>>>>> >>>>>>> Is it a test problem? I'm not familiar with the lprintf calls in >>>>>>> the test. >>>>>> ... the lprintf is part of the JCK support library (support.c if you >>>>>> have access to sources) and it uses a static buffer for the log >>>>>> messages >>>>>> and so it not thread-safe. This test creates a thread and both it and >>>>>> the main thread call lprintf concurrently. >>>>>> >>>>>> So this is a JCK test/test-library bug that appears to be exposed by >>>>>> the >>>>>> changes made in 8166358. >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Martin >>>>>>> From eosterlund at openjdk.java.net Mon Sep 14 21:05:31 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 14 Sep 2020 21:05:31 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> Message-ID: On Mon, 14 Sep 2020 20:51:58 GMT, Roman Kennke wrote: >> @fisk - please chime in here... > > I agree. Also, the assert becomes true somewhat obviously because of its first clause, which is already guaranteed > because of its placement in the surrounding else-branch (unless something really weird happens). I am also 100% okay with just removing the random assertion. 
We have a whole bunch of other block iteration code in GC code that does not sprinkle random asserts about what patterns of markWords are supposedly good or bad in live/dead objects. And I also don't get what kind of bug this would catch. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 21:09:57 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 21:09:57 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> Message-ID: On Mon, 14 Sep 2020 21:02:51 GMT, Erik Österlund wrote: >> I agree. Also, the assert becomes true somewhat obviously because of its first clause, which is already guaranteed >> because of its placement in the surrounding else-branch (unless something really weird happens). > > I am also 100% okay with just removing the random assertion. We have a whole bunch of other block iteration code in GC > code that does not sprinkle random asserts about what patterns of markWords are supposedly good or bad in live/dead > objects. And I also don't get what kind of bug this would catch. Okay. Since @coleenp, @rkennke and @fisk want that assertion to "hit the road", I'm in the process of removing it... build running now (just 'cause I'm paranoid me...)...
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 21:15:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 21:15:00 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/135/files - new: https://git.openjdk.java.net/jdk/pull/135/files/9fa2bed1..eeb9d761 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=03-04 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Mon Sep 14 21:15:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Sep 2020 21:15:00 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: <7S-JQve0hEaKd5B_ryBa1tnTyIYdr4jJvMWdbJCVACM=.ea27fbfc-89e8-4b9f-b99e-fdae0d579718@github.com> <9_5byOZc3b5-zKH7LFe12s2OKdp6RQ_k4Iq1hZTp32g=.482bd52f-19d0-4d02-b318-f455570a897a@github.com> Message-ID: On Mon, 14 Sep 2020 21:07:23 GMT, Daniel D. Daugherty wrote: >> I am also 100% okay with just removing the random assertion. We have a whole bunch of other block iteration code in GC >> code that does not sprinkle random asserts about what patterns of markWords are supposedly good or bad in live/dead >> objects. And I also don't get what kind of bug this would catch. > > Okay. Since @coleenp, @rkennke and @fisk want that assertion to "hit the road", > I'm in the process of removing it... build running now (just 'cause I'm paranoid me...)... 
This sub-thread should be resolved via https://github.com/openjdk/jdk/pull/135/commits/eeb9d761374ea193a7295fcce3ffe9707a8f0348 ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From iklam at openjdk.java.net Mon Sep 14 21:47:52 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Sep 2020 21:47:52 GMT Subject: RFR: 8253098: Archived full module graph should be disabled if CDS heap cannot be mapped Message-ID: Please review this simple fix -- `MetaspaceShared::use_full_module_graph()` depends on the CDS heap. It should be disabled if the CDS heap cannot be mapped. ------------- Commit messages: - 8253098: Disable archived module graph if CDS heap cannot be mapped Changes: https://git.openjdk.java.net/jdk/pull/158/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=158&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253098 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/158.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/158/head:pull/158 PR: https://git.openjdk.java.net/jdk/pull/158 From ccheung at openjdk.java.net Mon Sep 14 22:27:51 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 14 Sep 2020 22:27:51 GMT Subject: RFR: 8253098: Archived full module graph should be disabled if CDS heap cannot be mapped In-Reply-To: References: Message-ID: <_83bdRbdWT-gWgqq7h9NpB12mGWaSCY1GNqBfnn9Wl4=.f9592d81-cf39-40e8-a13f-19f493d7db99@github.com> On Mon, 14 Sep 2020 21:40:50 GMT, Ioi Lam wrote: > Please review this simple fix -- `MetaspaceShared::use_full_module_graph()` depends on the CDS heap. It should be > disabled if the CDS heap cannot be mapped. Looks good. ------------- Marked as reviewed by ccheung (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/158 From iklam at openjdk.java.net Mon Sep 14 22:28:26 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Sep 2020 22:28:26 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: <4UPtVQYiZjQfUiHe4BAClmg_IU5LqF36-ekIBxbUF-Y=.5eba4ba3-ae0e-4e8c-aebd-14f6bfc714a7@github.com> On Mon, 14 Sep 2020 12:58:35 GMT, Aleksey Shipilev wrote: > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for > the hash code in `JVMFlagLookup::_hashes`. Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From iklam at openjdk.java.net Mon Sep 14 22:28:26 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Sep 2020 22:28:26 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 17:17:27 GMT, Ioi Lam wrote: >> LGTM, Goetz. > > I agree with Kim that it's better to disable the warning. TL/DR; I think the proposed change is acceptable as it doesn't significantly change the bucket size distribution. ==== I added a patch to show the bucket size distribution: http://cr.openjdk.java.net/~iklam/jdk16/8253089-jvmFlagLookup-u2-msvc/buckets_size.patch.txt I then apply the proposed change on top. The contents of the hashtable is changed (I don't know why Aleksey sees no changes; I used output from `gcc -save-temps`). Even the lower 16 bit of the hash may not change, this expression will give different results depending on the upper 16 bits: int bucket_index = (int)(hash % NUM_BUCKETS); Anyway, the bucket size distribution is about the same as before. The max bucket size is about 12. 
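A minimal sketch of why the upper 16 bits matter, assuming `NUM_BUCKETS` is 277 (the number of bucket entries in the generated table); the helper names `bucket_index_u4` and `bucket_index_u2` are made up for illustration and are not from the patch:

```cpp
#include <cstdint>

// Assumed value for illustration; the real constant is defined alongside
// JVMFlagLookup and may differ.
constexpr int NUM_BUCKETS = 277;

// Bucket index when the modulo is taken over the full 32-bit hash.
int bucket_index_u4(uint32_t hash) {
  return (int)(hash % NUM_BUCKETS);
}

// Bucket index when the hash is first truncated to 16 bits (u2).
int bucket_index_u2(uint32_t hash) {
  uint16_t truncated = (uint16_t)hash;
  return (int)(truncated % NUM_BUCKETS);
}
```

Because 277 does not divide 65536, two hashes with the same lower 16 bits can still land in different buckets. For example, 0x00010005 maps to bucket 169 with the full hash but to bucket 5 once truncated, which is why the table contents change even though the stored 16-bit hash values do not.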
http://cr.openjdk.java.net/~iklam/jdk16/8253089-jvmFlagLookup-u2-msvc/jvmFlagLookup.s.old.txt http://cr.openjdk.java.net/~iklam/jdk16/8253089-jvmFlagLookup-u2-msvc/jvmFlagLookup.s.new.txt Scroll down to here: the first 277 entries are the bucket size. _ZL18_flag_lookup_table: .value 2 .value 8 .value 3 .value 5 .value 10 .value 1 .value 7 .value 8 .value 3 .value 3 .value 5 .value 2 .value 5 .value 7 .value 6 .value 2 .value 5 .value 2 .value 6 .value 3 .value 8 .value 1 .value 3 ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From dholmes at openjdk.java.net Tue Sep 15 01:58:12 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 15 Sep 2020 01:58:12 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 21:15:00 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. Marked as reviewed by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Tue Sep 15 02:08:11 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 15 Sep 2020 02:08:11 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Tue, 15 Sep 2020 01:55:39 GMT, David Holmes wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. > > Marked as reviewed by dholmes (Reviewer). Edit: I missed that this was already flagged. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Tue Sep 15 02:32:23 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 15 Sep 2020 02:32:23 GMT Subject: RFR: 8253098: Archived full module graph should be disabled if CDS heap cannot be mapped In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 21:40:50 GMT, Ioi Lam wrote: > Please review this simple fix -- `MetaspaceShared::use_full_module_graph()` depends on the CDS heap. It should be > disabled if the CDS heap cannot be mapped. Looks good. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/158 From iklam at openjdk.java.net Tue Sep 15 02:40:12 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 15 Sep 2020 02:40:12 GMT Subject: Integrated: 8253098: Archived full module graph should be disabled if CDS heap cannot be mapped In-Reply-To: References: Message-ID: <4YU3kllfRn8amG_TF51lsFncJ6XFyID_q9AP53eXEHg=.aec2874f-5c95-44a9-8774-29eb0765eca0@github.com> On Mon, 14 Sep 2020 21:40:50 GMT, Ioi Lam wrote: > Please review this simple fix -- `MetaspaceShared::use_full_module_graph()` depends on the CDS heap. It should be > disabled if the CDS heap cannot be mapped. This pull request has now been integrated. 
Changeset: 70cc7fc1 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/70cc7fc1 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8253098: Archived full module graph should be disabled if CDS heap cannot be mapped Reviewed-by: ccheung, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/158 From shade at openjdk.java.net Tue Sep 15 05:16:53 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 15 Sep 2020 05:16:53 GMT Subject: Integrated: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: <-WUQ_VGA1f01UEZBjGl-FRxkBV9aHkcugP032YD3Pkg=.a7be0c67-4062-41a3-abc9-1fd773990bba@github.com> On Mon, 14 Sep 2020 12:58:35 GMT, Aleksey Shipilev wrote: > It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at > fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for > the hash code in `JVMFlagLookup::_hashes`. This pull request has now been integrated. Changeset: 3f455f09 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/3f455f09 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod 8253089: Windows (MSVC 2017) build fails after JDK-8243208 Reviewed-by: mdoerr, goetz, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/150 From rehn at openjdk.java.net Tue Sep 15 07:39:56 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 15 Sep 2020 07:39:56 GMT Subject: RFR: 8238761: Asynchronous handshakes Message-ID: This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) Since we have several use-cases for them we can have many handshake pending. 
(should be very rare) To be able to handle an arbitrary number of handshakes, this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on the first pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first pointer also via CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake operations matching the filter. The JavaThread itself uses no filter, and any other thread uses the filter of everything except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed, filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake operations to be done out of order. Since the filter determines who executes the operation, and not the invoked method, there is now only one method to call when handshaking one thread. Some comments about the changes: - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both "all JavaThreads do" and "handshake all threads". With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much worse than this. - I added an is_handshake_safe_for for checking whether the current thread is operating on itself or is the handshaker of that thread. - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a handshaker on a JavaThread, e.g. the VM Thread can execute the handshake operation. - Added WB testing method. - Removed VM_HandshakeOneThread; the VM thread uses the same call path as direct handshakes did. - Changed the handshake semaphores to mutexes to be able to handle deadlocks with lock ranking.
- VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a safepoint and half of them after, in many handshake all operations. - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. - Added NoSafepointVerifier; we are thinking about supporting safepoints inside handshakes, but it's not needed at the moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to the NoSafepointVerifier. - Added filtered queue and gtest for it. Passes multiple t1-8 runs. Been through some pre-reviewing. ------------- Commit messages: - Rebase version 1.0 Changes: https://git.openjdk.java.net/jdk/pull/151/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8238761 Stats: 1047 lines in 24 files changed: 693 ins; 150 del; 204 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From martin.doerr at sap.com Tue Sep 15 07:44:15 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 15 Sep 2020 07:44:15 +0000 Subject: Fatal errors when running JCK tests with JDK15/16 debug build In-Reply-To: References: <80c5c750-2a02-9bd9-d0b4-628481c71264@oracle.com> Message-ID: Hi Leonid, I don't have access to JCK issues, but thanks for creating it. Best regards, Martin From: leonid.kuskov at oracle.com Sent: Montag, 14. September 2020 23:00 To: Doerr, Martin ; David Holmes ; serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Cc: Zeller, Arno Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build Hi Martin, This issue looks rather like a test problem. I've filed the ticket JCK-7314897.
Best Regards, Leonid On 9/7/20 3:23 AM, Doerr, Martin wrote: Hi Leonid, the errors were observed in many more vm/jvmti/Get... tests like the following ones: vm/jvmti/GetAllThreads/gath001/gath00101/gath00101.html vm/jvmti/GetAvailableProcessors/gaps001/gaps00101/gaps00101.html vm/jvmti/GetClassModifiers/gcmo001/gcmo00102/gcmo00102.html vm/jvmti/GetClassMethods/gcmt001/gcmt00102/gcmt00102.html vm/jvmti/GetBytecodes/gbyc001/gbyc00102/gbyc00102.html vm/jvmti/GetCapabilities/gcap001/gcap00101/gcap00101.html vm/jvmti/GetClassLoader/gclo001/gclo00101/gclo00101.html We run them with fastdbg builds every night and we have seen the errors almost every day. Best regards, Martin -----Original Message----- From: David Holmes Sent: Dienstag, 1. September 2020 07:07 To: leonid.kuskov at oracle.com; Doerr, Martin ; serviceability-dev at openjdk.java.net; hotspot-runtime- dev at openjdk.java.net Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build Hi Leonid, On 1/09/2020 10:42 am, leonid.kuskov at oracle.com wrote: Hi, It's a known issue that was reported by Arno Zeller (arno.zeller at sap.com) in the middle of June. The test jvmti/GetAllStackTraces/gast001/gast00105/gast00105.html failed with the same stack trace despite the fix ( JCK-7022500 lprintf in jvmti/support.c is not MT-Safe) Please file a JCK's issue with details to reproduce the failure. Interesting. The fix is supposed to make things thread-safe by using a RawMonitor to ensure only one thread can use lprintf at a time. I missed that in my initial analysis. But something is going wrong. Thanks, David Thanks, Leonid On 8/31/20 3:37 PM, David Holmes wrote: On 1/09/2020 3:00 am, Doerr, Martin wrote: Hi David, thanks for analyzing it. We need to exclude the test for now. Can you file a JCK bug? I can file one on our internal JCK Jira but I'm not sure what the right process is in this case. Thanks, David Best regards, Martin -----Original Message----- From: David Holmes Sent: Montag, 31. 
August 2020 04:34
To: Doerr, Martin ; serviceability-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: Fatal errors when running JCK tests with JDK15/16 debug build

Hi Martin,

On 29/08/2020 3:53 am, Doerr, Martin wrote:

Hi,

we have seen the following fatal error more than 50 times since 2020-05-25 in various JCK tests vm/jvmti.

fatal error: String conversion failure: [check] ExitLock destroyed --> [check] ExitLock exited
(followed by garbage output)

8166358: Re-enable String verification in java_lang_String::create_from_str()
was pushed at that date which introduced the call to fatal.

Stack (example from linuxppc64le, but also observed on x86 and aarch64):
V [libjvm.so+0xee242c] java_lang_String::create_from_str(char const*, Thread*) [clone .part.158]+0x51c
V [libjvm.so+0xee2530] java_lang_String::create_oop_from_str(char const*, Thread*)+0x40
V [libjvm.so+0x1026a30] jni_NewStringUTF+0x1e0
C [libjckjvmti.so+0x3ce4c] logWrite+0x5c
C [libjckjvmti.so+0x3cd20] lprintf+0x170
C [libjckjvmti.so+0x485b8] gast00104_agent_proc+0x254
V [libjvm.so+0x1218f0c] JvmtiAgentThread::call_start_function()+0x24c
V [libjvm.so+0x193a8fc] JavaThread::thread_main_inner()+0x32c
V [libjvm.so+0x19418a0] Thread::call_run()+0x160
V [libjvm.so+0x15c9d0c] thread_native_entry(Thread*)+0x18c
C [libpthread.so.0+0x9b48] start_thread+0x108

(Problem could have been there before but without this fatal message.)

The messages are generated by:
tests/vm/jvmti/GetAllStackTraces/gast001/gast00104/gast00104.c

This looks like a race condition. The message changes while the VM creates a String object from it.

Has anybody seen this before?

No but ...

Is it a test problem? I'm not familiar with the lprintf calls in the test.

... the lprintf is part of the JCK support library (support.c if you have access to sources) and it uses a static buffer for the log messages and so it is not thread-safe. This test creates a thread and both it and the main thread call lprintf concurrently.
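
The hazard described here, a logger that formats into one shared static buffer while two threads call it, can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the JCK support.c source: `locked_log`, `g_buffer`, `g_sink` and `run_two_writers` are hypothetical names, a `std::mutex` stands in for the JVMTI RawMonitor mentioned elsewhere in this thread, and appending to `g_sink` stands in for the consumer that reads the buffer (the NewStringUTF call in the stack above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins; none of these names come from the JCK sources.
static char g_buffer[128];                // one shared static message buffer
static std::mutex g_buffer_lock;          // plays the role of the RawMonitor
static std::vector<std::string> g_sink;   // stands in for the buffer's consumer

// Format into the shared buffer and hand it to the consumer while holding
// the lock, so no other thread can rewrite the buffer mid-read.
void locked_log(const char* tag, int n) {
  std::lock_guard<std::mutex> guard(g_buffer_lock);
  std::snprintf(g_buffer, sizeof(g_buffer), "[%s] message %d", tag, n);
  g_sink.emplace_back(g_buffer);
}

// Run two writers concurrently and report whether every recorded message
// is complete and carries one of the two expected prefixes.
bool run_two_writers(int per_thread) {
  g_sink.clear();
  auto worker = [per_thread](const char* tag) {
    for (int i = 0; i < per_thread; ++i) locked_log(tag, i);
  };
  std::thread a(worker, "main");
  std::thread b(worker, "agent");
  a.join();
  b.join();
  if (g_sink.size() != static_cast<std::size_t>(2 * per_thread)) return false;
  for (const std::string& s : g_sink) {
    if (s.rfind("[main] message ", 0) != 0 &&
        s.rfind("[agent] message ", 0) != 0) {
      return false;
    }
  }
  return true;
}
```

In this sketch the lock covers both the formatting and the read of the buffer. If the read happened after the lock was released, the buffer could still change underneath the reader, which is the kind of window that would produce a "String conversion failure" like the one reported.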
So this is a JCK test/test-library bug that appears to be exposed by the changes made in 8166358. Cheers, David ----- Best regards, Martin From kbarrett at openjdk.java.net Tue Sep 15 10:21:15 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 15 Sep 2020 10:21:15 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 18:55:38 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > kimbarrett CR - made minor changes to address Kim's code review. src/hotspot/share/runtime/objectMonitor.cpp line 244: > 242: } > 243: > 244: #ifdef ASSERT There would be less `#ifdef ASSERT` clutter if just the body of check_object_context were conditionalized. Then the calls wouldn't need to be. Your call... ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From kbarrett at openjdk.java.net Tue Sep 15 10:21:12 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 15 Sep 2020 10:21:12 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 21:15:00 GMT, Daniel D. 
Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From jcm at openjdk.java.net Tue Sep 15 10:59:06 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 15 Sep 2020 10:59:06 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v2] In-Reply-To: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: <_2RfxBOE39VhwtDZe2F2qLb52IfF_JiCWwE2cJsEuiM=.01bb1177-808a-45ea-a8bf-3dccfab6ea38@github.com> > Hi > > Moving the review that is based on mercurial repo to github. > The history of conversation is > [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) > Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) > > @dholmes-ora could you please have a look. 
Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: removing unused definition load_class_by_index ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/169/files - new: https://git.openjdk.java.net/jdk/pull/169/files/cfc2d719..1c0786a5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/169.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/169/head:pull/169 PR: https://git.openjdk.java.net/jdk/pull/169 From coleenp at openjdk.java.net Tue Sep 15 12:11:59 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 15 Sep 2020 12:11:59 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: <6BzJKwek4JwQkNWvWZp0SCS78dMRtaPTRa4p_UAC0gc=.acc9d8f4-aaa4-4951-bff5-f6da6c467c90@github.com> On Mon, 14 Sep 2020 21:15:00 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. Thank you for removing it. 
------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/135 From rkennke at openjdk.java.net Tue Sep 15 12:25:04 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 15 Sep 2020 12:25:04 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 21:15:00 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. Looks good to me! ------------- Marked as reviewed by rkennke (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/135 From zgu at openjdk.java.net Tue Sep 15 14:51:45 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 15 Sep 2020 14:51:45 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive Message-ID: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor before thread exits. For NonJavaThread, e.g. 
ConcurrentGCThread, the thread may exit while its "Thread" object stays alive, so its thread stack is still "alive" from NMT's perspective. Once the thread exits, the virtual memory for the thread stack can be reserved again, which confuses NMT.

The solution is to move the thread stack unregistration code to the post_run() method.

-------------

Commit messages:
 - JDK-8252921
 - JDK-8252921

Changes: https://git.openjdk.java.net/jdk/pull/185/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8252921
Stats: 24 lines in 2 files changed: 10 ins; 13 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/185.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185

PR: https://git.openjdk.java.net/jdk/pull/185

From martin.doerr at sap.com Tue Sep 15 14:52:50 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 15 Sep 2020 14:52:50 +0000
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15)
In-Reply-To: <1d6a6087-b82d-96cf-bfb4-87cb03869bd6@oracle.com>
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <38e2d441-11c9-8342-37d5-8030dd06f2f4@oracle.com> <2a8976f7-37e0-03b9-3099-e07464e46512@oracle.com> <5681d640-08c8-3433-0f85-3f23eea69e87@oracle.com> <029b596d-46e5-9fa7-38fd-c34d3a32987b@oracle.com> <54fd4f9c-afef-0819-de2d-b81b25fa6c22@oracle.com> <79bfb73a-20d7-7b85-4a84-dd22b150ed0d@oracle.com> <9fcc131c-dfbd-b943-381b-2ea8d854fcd7@oracle.com> <3b76e50f-fd8f-d61b-272a-27338df99094@oracle.com> <1d6a6087-b82d-96cf-bfb4-87cb03869bd6@oracle.com>
Message-ID:

Hi Dan and Carsten,

I just noticed that this change introduced 2 usages of "support_IRIW_for_not_multiple_copy_atomic_cpu". I think this is incorrect for arm32 which is not multi-copy-atomic, but uses support_IRIW_for_not_multiple_copy_atomic_cpu = false. You probably meant "#ifdef CPU_MULTI_COPY_ATOMIC"?

I haven't studied the access patterns you were trying to fix, but this looks wrong.
Should I create an issue? Would be great if I could assign it to somebody familiar with this new code. Best regards, Martin > -----Original Message----- > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Daniel D. Daugherty > Sent: Dienstag, 2. Juni 2020 21:25 > To: Carsten Varming > Cc: Roman Kennke ; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints > (CR14/v2.14/17-for-jdk15) > > Hi Carsten, > > Thanks for the fast review of the updated comments. > > I filed the following new bug to track the change: > > ??? JDK-8246359 clarify confusing comment in ObjectMonitor::EnterI()'s > ??????????????? race with async deflation > ??? https://bugs.openjdk.java.net/browse/JDK-8153224 > > And I started a review thread for the fix under that new bug ID. > > Dan > > > On 6/2/20 2:13 PM, Carsten Varming wrote: > > Hi Dan, > > > > I like the new comment. Thank you for doing the update. > > > > Carsten > > > > On Tue, Jun 2, 2020 at 1:54 PM Daniel D. Daugherty > > > > wrote: > > > > Hi Carsten, > > > > See replies below... > > > > David, Erik and Robbin, if you folks could also check out the revised > > comment below that would be appreciated. > > > > > > On 6/2/20 9:39 AM, Carsten Varming wrote: > >> Hi Dan, > >> > >> See inline. > >> > >> On Mon, Jun 1, 2020 at 11:32 PM Daniel D. Daugherty > >> >> > wrote: > >> > >> Hi Carsten, > >> > >> Thanks for chiming in on this review thread!! > >> > >> > >> It is my pleasure. You know the code is solid when the discussion > >> is focused on the comments. > > > > So true, so very true! > > > > > >> On 6/1/20 10:41 PM, Carsten Varming wrote: > >>> Hi Dan, > >>> > >>> I like the new protocol, but I had to think about how the > >>> extra increment to _contentions replaced the check on _owner > >>> that I originally?added. > >> > >> Right. The check on _owner was described in detail in the > >> OpenJDK wiki > >> subsection that was called "T-enter Wins By A-B-A". 
It can > >> still be > >> found by going thru the wiki's history links. > >> > >> That subsection was renamed and rewritten and can be found here: > >> > >> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A > syncMonitorDeflation-T- > enterWinsByCancellationViaDEFLATER_MARKERSwap > >> > >> > >>> I am thinking that the increased _contention value is a > >>> little mark left on the ObjectMonitor to signal to the > >>> deflater thread (which must be in the middle of trying to > >>> acquire the object monitor as _owner was set to > >>> DEFLATER_MARKER) that the deflater thread lost the race. > >> > >> That is exactly what the extra increment is being used for. > >> > >> In my reply to David H. that you quoted below, I describe the > >> progression > >> of contention values thru the two possible race scenarios. > >> The progression > >> shows the T-enter thread winning the race and marking the > >> contention field > >> with the extra increment while the T-deflater thread > >> recognizes that it has > >> lost the race and unmarks the contention field with an extra > >> decrement. > >> > >> > >> I noticed that. Looks like David and I were racing and David won. :) > >> > >>> That little mark stays with the object monitor long after > >>> the thread is done with the monitor. > >> > >> The "little mark" stays with the ObjectMonitor after T-enter > >> is done > >> entering until the T-deflater thread recognizes that the > >> async deflation > >> was canceled and does an extra decrement. I don't think I > >> would describe > >> it as "long after". > >> > >> > >> Sorry about the use of "long after". When I think about the > >> correctness of protocols, like the deflation protocol, I end up > >> thinking about sequences of instructions and the relevant > >> interleavings. In that context I often end up using phrases like > >> "long after" and "after" to mean anything after a particular > >> instruction. 
I did not mean to imply anything about the relative > >> speed of the execution of the code. > > > > It's okay. I do something similar in the transaction diagrams that > > I use to work out timing issues: ... > > > > The only point that I was trying to make is that the T-deflate thread > > is responsible for cleaning up the extra mark and it's committed to > > the code path that will result in the cleanup. Yes, there may be a > > between the time that T-deflate recognizes that async > > deflation was canceled and when T-deflate does the extra decrement, > > but I don't see any harm in it. > > > > > >>> It might be worth adding a comment to the code explaining > >>> that after the increment, the _contention field can only be > >>> set to 0 by a corresponding decrement in the async deflater > >>> thread, ensuring that the > >>> Atomic::cmpxchg(&mid->_contentions, (jint)0, -max_jint)?on > >>> line 2166 fails. In particular, the comment: > >>> +. // .... We bump contentions an > >>> + // extra time to prevent the async deflater thread from > >>> temporarily > >>> + // changing it to -max_jint and back to zero (no flicker > >>> to confuse > >>> + // is_being_async_deflated() > >>> confused me as after the deflater thread sets _contentions > >>> to -max_jint, the?deflater thread has won the race and the > >>> object monitor is about to be deflated. > >> > >> For context, here's the code and comment being discussed: > >> > >>> 527 if (AsyncDeflateIdleMonitors && > >>> 528 try_set_owner_from(DEFLATER_MARKER, Self) == > DEFLATER_MARKER) { > >>> 529 // Cancelled the in-progress async deflation. We bump > >>> contentions an > >>> 530 // extra time to prevent the async deflater thread from > >>> temporarily > >>> 531 // changing it to -max_jint and back to zero (no flicker > >>> to confuse > >>> 532 // is_being_async_deflated()). The async deflater thread > >>> will > >>> 533 // decrement contentions after it recognizes that the async > >>> 534 // deflation was cancelled. 
> >>> 535 add_to_contentions(1); > >> > >> This part of the new comment: > >> > >> ?532???? // ...? The async deflater thread will > >> ?533???? // decrement contentions after it recognizes that > >> the async > >> ?534???? // deflation was cancelled. > >> > >> makes it clear that the async deflater thread does the > >> corresponding decrement > >> to the increment done by the T-enter thread so that covers > >> this part of your > >> comment above: > >> > >> ??? the _contention field can only be set to 0 by a > >> corresponding decrement > >> ??? in the async deflater thread > >> > >> This part of the new comment: > >> > >> ?529???? // ...? We bump contentions an > >> ?530???? // extra time to prevent the async deflater thread > >> from temporarily > >> ?531???? // changing it to -max_jint and back to zero (no > >> flicker to confuse > >> ?532???? // is_being_async_deflated()). > >> > >> makes it clear that we're keeping make-contentions-negative > >> part of the > >> async deflation protocol from happening so that covers this > >> part of your > >> comment above: > >> > >> ??? ensuring that the Atomic::cmpxchg(&mid->_contentions, > >> (jint)0, -max_jint) > >> ??? on line 2166 fails. > >> > >> This part of your comment above makes it clear where the > >> confusion arises: > >> > >> ??? confused me as after the deflater thread sets > >> _contentions to -max_jint, > >> ??? the deflater thread has won the race and the object > >> monitor is about to > >> ??? be deflated. > >> > >> Your original algorithm is a three-part async deflation protocol: > >> > >> Part 1 - set owner field to DEFLATER marker > >> Part 2 - make a zero contentions field -max_jint > >> Part 3 - check to see if the owner field is still DEFLATER_MARKER > >> > >> If part 3 fails, then the contentions field that is currently > >> negative > >> has max_jint added to it to complete the bail out process. > >> It's that > >> third part that makes the contentions field flicker from: > >> > >> ??? 
0 -> -max_jint -> 0 > >> > >> And the extra contentions increment in the new two part > >> protocol solves > >> that flicker and allows us to treat (contentions < 0) as a > >> linearization > >> point. > >> > >> Please let me know if this clarifies your concern. > >> > >> > >> I am no?longer confused, but the cause of my confusion is still > >> present in the comment. > >> > >> This group knows about the three part algorithm, but when the > >> code is pushed there is no representation of the three part > >> algorithm in the code or repository. > > > > That's a really good point and a side effect of my living with this > > code for a very long time... > > > > > >> I forgot the details of the algorithm and read the latest version > >> of the code to figure out what the flickering was about. As you > >> would expect, I found that there is no way the code can cause the > >> flicker mentioned. That made me worried. I started to question > >> myself: What can?cause the behavior that is described in the > >> comments? What am I missing? As a result, I think it is best if > >> we keep the flickering to ourselves and update the comment to > >> describe that because _owner was DEFLATER_MARKER the deflation > >> thread must be in the middle of the protocol for deflating the > >> object monitor, and in particular, incrementing _contentions > >> ensures the failure of the final CAS in the deflation protocol > >> (final in the protocol implemented in the code). > > > > The above is a more clear expression of your concerns and I agree. > > > > > >> To be clear: > >> > >> > 529 // Cancelled the in-progress async deflation. > >> > >> I would expend this comment by mentioning that the deflator > >> thread cannot win the last part of the 2-part deflation protocol > >> as 0 < _contentions (pre-condition to this method). 
> >> > >> > We bump contentions an > >> > 530 // extra time to prevent the async deflater thread from > >> temporarily > >> > 531 // changing it to -max_jint and back to zero (no flicker to > >> confuse > >> > 532 // is_being_async_deflated()). > >> > >> I would replace this part with something along the lines of: We > >> bump contentions an extra time to prevent the deflator thread > >> from winning the last part of the (2-part) deflation protocol > >> after this thread decrements _contentions as part of the release > >> of the object monitor. > >> > >> > The async deflater thread will > >> > 533 // decrement contentions after it recognizes that the async > >> > 534 // deflation was cancelled. > >> > >> I would keep this part. > > > > So here's my rewrite of the code and comment block: > > > > ? if (AsyncDeflateIdleMonitors && > > ????? try_set_owner_from(DEFLATER_MARKER, Self) == > DEFLATER_MARKER) { > > ??? // Cancelled the in-progress async deflation by changing owner > > from > > ??? // DEFLATER_MARKER to Self. As part of the contended enter > > protocol, > > ??? // contentions was incremented to a positive value before EnterI() > > ??? // was called and that prevents the deflater thread from > > winning the > > ??? // last part of the 2-part async deflation protocol. After > > EnterI() > > ??? // returns to enter(), contentions is decremented because the > > caller > > ??? // now owns the monitor. We bump contentions an extra time here to > > ??? // prevent the deflater thread from winning the last part of the > > ??? // 2-part async deflation protocol after the regular decrement > > ??? // occurs in enter(). The deflater thread will decrement > > contentions > > ??? // after it recognizes that the async deflation was cancelled. > > ??? add_to_contentions(1); > > > > I've made this change to both places in EnterI() that had the original > > confusing comment. > > > > Please let me know if this rewrite works for everyone. 
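
The bookkeeping discussed in this thread (regular increment, extra increment on cancellation, regular decrement, deferred decrement by the deflater) can be replayed deterministically. The following single-threaded C++ sketch is a heavily simplified model, not HotSpot code: `MonitorModel`, `OwnerState` and `replay_cancellation` are hypothetical names, and the "restore owner" step used to detect cancellation is an assumption about how the deflater notices that it lost the race. It records the value of contentions after each update, for both orders in which the deflater's decrement can land.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical, simplified model of the two fields involved in the race.
enum OwnerState : int { NO_OWNER = 0, DEFLATER_MARKER = 1, T_ENTER = 2 };

struct MonitorModel {
  std::atomic<int> owner{NO_OWNER};
  std::atomic<int> contentions{0};
};

// Replays one interleaving of the cancellation race and returns the value
// of contentions after each update.
std::vector<int> replay_cancellation(bool deflater_decrements_first) {
  MonitorModel m;
  std::vector<int> trace;
  auto record = [&] { trace.push_back(m.contentions.load()); };

  // T-deflate, part 1: claim the idle monitor with DEFLATER_MARKER.
  int expected = NO_OWNER;
  m.owner.compare_exchange_strong(expected, DEFLATER_MARKER);

  // T-enter: regular increment done before the contended enter path.
  m.contentions.fetch_add(1);
  record();

  // T-deflate, part 2: try 0 -> -INT_MAX; fails because contentions is 1.
  int zero = 0;
  bool deflated = m.contentions.compare_exchange_strong(zero, -INT_MAX);

  // T-enter: swapping DEFLATER_MARKER for itself cancels the deflation.
  int marker = DEFLATER_MARKER;
  bool cancelled = m.owner.compare_exchange_strong(marker, T_ENTER);

  // T-deflate: assumed detection step. If the owner is no longer
  // DEFLATER_MARKER, someone cancelled and a deferred decrement is owed.
  int still_marker = DEFLATER_MARKER;
  bool restored = m.owner.compare_exchange_strong(still_marker, NO_OWNER);
  bool owes_decrement = !deflated && !restored;

  if (deflater_decrements_first && owes_decrement) {
    m.contentions.fetch_sub(1);  // deferred decrement lands early
    record();
  }
  if (cancelled) {
    m.contentions.fetch_add(1);  // extra increment after cancelling
    record();
  }
  m.contentions.fetch_sub(1);    // regular decrement once the monitor is owned
  record();
  if (!deflater_decrements_first && owes_decrement) {
    m.contentions.fetch_sub(1);  // deferred decrement lands last
    record();
  }
  return trace;
}
```

In both orders the modelled contentions value starts at 1, ends at 0, and never drops below zero, so a check of the form `contentions < 0` would not see a transient negative value in this model.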
> > > > Since I've already pushed 8153224, I'll file a new bug to push this > > clarification once we're all in agreement here. > > > > Dan > > > > > >> > >> I hope this helps, > >> Carsten > >> > >>> Otherwise, the code looks great. I am looking forward to > >>> seeing in the repo. > >> > >> Thanks! The code should be there soon. > >> > >> Dan > >> > >> > >>> > >>> Carsten > >>> > >>> On Mon, Jun 1, 2020 at 8:32 PM Daniel D. Daugherty > >>> >>> > wrote: > >>> > >>> Hi David, > >>> > >>> On 6/1/20 7:58 PM, David Holmes wrote: > >>> > Hi Dan, > >>> > > >>> > Sorry for the delay. > >>> > >>> No worries. It's always worth waiting for your code > >>> review in general > >>> and, with the complexity of this project, it's on my > >>> must-do list! > >>> > >>> > >>> > > >>> > On 28/05/2020 3:20 am, Daniel D. Daugherty wrote: > >>> >> Greetings, > >>> >> > >>> >> Erik O. had an idea for changing the three part async > >>> deflation protocol > >>> >> into a two part async deflation protocol where the > >>> second part (setting > >>> >> the contentions field to -max_jint) is a > >>> linearization point. I've taken > >>> >> Erik's proposal (which was relative to > >>> CR12/v2.12/15-for-jdk15), merged > >>> >> it with CR13/v2.13/16-for-jdk15, and made a few minor > >>> tweaks. > >>> >> > >>> >> I have attached the change list from CR13 to CR14 and > >>> I've also added a > >>> >> link to the CR13-to-CR14-changes file to the webrevs > >>> so it should be > >>> >> easy > >>> >> to find. > >>> >> > >>> >> Main bug URL: > >>> >> > >>> >> ???? JDK-8153224 Monitor deflation prolong safepoints > >>> >> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >> > >>> >> The project is currently baselined on jdk-15+24. 
> >>> >> > >>> >> Here's the full webrev URL for those folks that want > >>> to see all of the > >>> >> current Async Monitor Deflation code in one go (v2.14 > >>> full): > >>> >> > >>> >> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- > jdk15+24.v2.14.full/ > >>> > >>> >> > >>> >> > >>> >> Some folks might want to see just what has changed > >>> since the last review > >>> >> cycle so here's a webrev for that (v2.14 inc): > >>> >> > >>> >> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- > jdk15+24.v2.14.inc/ > >>> > >>> > > >>> > > >>> > src/hotspot/share/runtime/synchronizer.cpp > >>> > > >>> > I'm having a little trouble keeping the _contentions > >>> relationships in > >>> > my head. In particular with this change I can't quite > >>> grok the: > >>> > > >>> > // Deferred decrement for the JT EnterI() that > >>> cancelled the async > >>> > deflation. > >>> > mid->add_to_contentions(-1); > >>> > > >>> > change. I kind of get EnterI() does an extra increment > >>> and the > >>> > deflator thread does the above matching decrement. But > >>> given the two > >>> > changes can happen in any order I'm not sure what the > >>> possible visible > >>> > values for _contentions will be and how that might > >>> affect other code > >>> > inspecting it? > >>> > >>> I have a sub-section in the OpenJDK wiki dedicated to > >>> this particular race: > >>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A > syncMonitorDeflation-T- > enterWinsByCancellationViaDEFLATER_MARKERSwap > >>> > >>> In order for this race condition to manifest, the > >>> T-enter thread has to > >>> successfully swap the owner field's DEFLATER_MARKER > >>> value for Self. That > >>> swap will eventually cause the T-deflate thread to > >>> realize that the async > >>> deflation that it started has been canceled. 
> >>> > >>> The diagram shows the progression of contentions values: > >>> > >>> - ObjectMonitor box 1 shows contentions == 1 because > >>> T-enter incremented > >>> ?? the contentions field > >>> > >>> - ObjectMonitor box 2 shows contentions == 2 because > >>> EnterI() did the > >>> ?? extra increment. > >>> > >>> - ObjectMonitor box 3 shows contentions == 1 because > >>> T-enter did the > >>> ?? regular contentions decrement. > >>> > >>> - ObjectMonitor box 4 shows contentions == 0 because > >>> T-deflate did the > >>> ?? extra contentions decrement. > >>> > >>> Now it is possible for T-deflate to do the extra > >>> decrement before T-enter > >>> does the extra increment. If I were to add another > >>> diagram to show that > >>> variant of the race, that progression of contentions > >>> values would be: > >>> > >>> - ObjectMonitor box 1 shows contentions == 1 because > >>> T-enter incremented > >>> ?? the contentions field > >>> > >>> - ObjectMonitor box 2 shows contentions == 0 because > >>> T-deflate did the > >>> ?? extra contentions decrement. > >>> > >>> - ObjectMonitor box 3 shows contentions == 1 because > >>> EnterI() did the > >>> ?? extra increment. > >>> > >>> - ObjectMonitor box 4 shows contentions == 0 because > >>> T-enter did the > >>> ?? regular contentions decrement. > >>> > >>> Notice that in this second scenario the contentions > >>> field never goes > >>> negative so there's nothing to confuse a potential caller of > >>> is_being_async_deflated(): > >>> > >>> inline bool ObjectMonitor::is_being_async_deflated() { > >>> ?? return AsyncDeflateIdleMonitors && contentions() < 0; > >>> } > >>> > >>> It is not possible for T-deflate's extra decrement of > >>> the contentions > >>> field to make the contentions field negative. 
That > >>> decrement only happens > >>> when T-deflate detects that the async deflation has been > >>> canceled and > >>> async deflation can only be canceled after T-enter has > >>> already made the > >>> contentions field > 0. > >>> > >>> Please let me know if this resolves your concern about: > >>> > >>> > // Deferred decrement for the JT EnterI() that > >>> cancelled the async > >>> > deflation. > >>> > mid->add_to_contentions(-1); > >>> > >>> I'm not planning to update the OpenJDK wiki to add a > >>> second variant of > >>> the cancellation race. Please let me know if that is okay. > >>> > >>> > > >>> > But otherwise the changes in this version seem good > >>> and overall the > >>> > protocol seems simpler. > >>> > >>> This sounds like a thumbs up, but I'm looking for > >>> something more definitive. > >>> > >>> > >>> > I'm still going to spend some more time going over the > >>> complete webrev > >>> > to get a fuller sense of things. > >>> > >>> As always, if you find something after I've pushed, > >>> we'll deal with it. > >>> > >>> Thanks for your many re-reviews for this project!! > >>> > >>> Dan > >>> > >>> > >>> > > >>> > Thanks, > >>> > David > >>> > > >>> >> > >>> >> > >>> >> The OpenJDK wiki has been updated for v2.14. > >>> >> > >>> >> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> >> > >>> >> The jdk-15+24 based v2.14 version of the patch has > >>> gone thru Mach5 > >>> >> Tier[1-5] > >>> >> testing with no related failures; Mach5 Tier[67] are > >>> running now and > >>> >> so far > >>> >> have no related failures. I'll kick off Mach5 Tier8 > >>> after the other > >>> >> tiers > >>> >> have finished since Mach5 is a bit busy right now. > >>> >> > >>> >> I'm also running my usual inflation stress testing on > >>> Linux-X64 and > >>> >> macOSX > >>> >> and so far there are no issues. > >>> >> > >>> >> Thanks, in advance, for any questions, comments or > >>> suggestions. 
> >>> >> > >>> >> Dan > >>> >> > >>> >> > >>> >> On 5/21/20 2:53 PM, Daniel D. Daugherty wrote: > >>> >>> Greetings, > >>> >>> > >>> >>> I have made changes to the Async Monitor Deflation > >>> code in response to > >>> >>> the CR12/v2.12/15-for-jdk15 code review cycle. > >>> Thanks to David H. and > >>> >>> Erik O. for their OpenJDK reviews in the v2.12 round! > >>> >>> > >>> >>> I have attached the change list from CR12 to CR13 > >>> and I've also added a > >>> >>> link to the CR12-to-CR13-changes file to the webrevs > >>> so it should be > >>> >>> easy > >>> >>> to find. > >>> >>> > >>> >>> Main bug URL: > >>> >>> > >>> >>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>> >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>> > >>> >>> The project is currently baselined on jdk-15+24. > >>> >>> > >>> >>> Here's the full webrev URL for those folks that want > >>> to see all of the > >>> >>> current Async Monitor Deflation code in one go > >>> (v2.13 full): > >>> >>> > >>> >>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- > jdk15%2b24.v2.13.full/ > >>> > >>> >>> > >>> >>> > >>> >>> Some folks might want to see just what has changed > >>> since the last > >>> >>> review > >>> >>> cycle so here's a webrev for that (v2.13 inc): > >>> >>> > >>> >>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- > jdk15%2b24.v2.13.inc/ > >>> > >>> >>> > >>> >>> > >>> >>> > >>> >>> The OpenJDK wiki is currently at v2.13 and might > >>> require minor > >>> >>> tweaks for v2.12 > >>> >>> and v2.13. Yes, I need to make yet another crawl > >>> thru review of it... > >>> >>> > >>> >>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> >>> > >>> >>> The jdk-15+24 based v2.13 version of the patch is > >>> going thru the usual > >>> >>> Mach5 testing right now. It is also going thru my > >>> usual inflation > >>> >>> stress > >>> >>> testing on Linux-X64 and macOSX. 
> >>> >>> > >>> >>> Thanks, in advance, for any questions, comments or > >>> suggestions. > >>> >>> > >>> >>> Dan > >>> >>> > >>> >>> On 5/14/20 5:40 PM, Daniel D. Daugherty wrote: > >>> >>>> Greetings, > >>> >>>> > >>> >>>> I have made changes to the Async Monitor Deflation > >>> code in response to > >>> >>>> the CR11/v2.11/14-for-jdk15 code review cycle. > >>> Thanks to David H., > >>> >>>> Erik O., > >>> >>>> and Robbin for their OpenJDK reviews in the v2.11 > >>> round! > >>> >>>> > >>> >>>> I have attached the change list from CR11 to CR12 > >>> and I've also > >>> >>>> added a > >>> >>>> link to the CR11-to-CR12-changes file to the > >>> webrevs so it should > >>> >>>> be easy > >>> >>>> to find. > >>> >>>> > >>> >>>> Main bug URL: > >>> >>>> > >>> >>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>> >>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>> > >>> >>>> The project is currently baselined on jdk-15+23. > >>> >>>> > >>> >>>> Here's the full webrev URL for those folks that > >>> want to see all of the > >>> >>>> current Async Monitor Deflation code in one go > >>> (v2.12 full): > >>> >>>> > >>> >>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- > jdk15%2b23.v2.12.full/ > >>> > >>> >>>> > >>> >>>> > >>> >>>> Some folks might want to see just what has changed > >>> since the last > >>> >>>> review > >>> >>>> cycle so here's a webrev for that (v2.12 inc): > >>> >>>> > >>> >>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- > jdk15%2b23.v2.12.inc/ > >>> > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> The OpenJDK wiki is currently at v2.11 and might > >>> require minor > >>> >>>> tweaks for v2.12: > >>> >>>> > >>> >>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> >>>> > >>> >>>> The jdk-15+23 based v2.12 version of the patch is > >>> going thru the usual > >>> >>>> Mach5 testing right now. 
> >>> >>>> > >>> >>>> Thanks, in advance, for any questions, comments or > >>> suggestions. > >>> >>>> > >>> >>>> Dan > >>> >>>> > >>> >>>> > >>> >>>> On 5/7/20 1:08 PM, Daniel D. Daugherty wrote: > >>> >>>>> Greetings, > >>> >>>>> > >>> >>>>> I have made changes to the Async Monitor Deflation > >>> code in > >>> >>>>> response to > >>> >>>>> the CR10/v2.10/13-for-jdk15 code review cycle and > >>> DaCapo-h2 perf > >>> >>>>> testing. > >>> >>>>> Thanks to Erik O., Robbin and David H. for their > >>> OpenJDK reviews > >>> >>>>> in the > >>> >>>>> v2.10 round! Thanks to Eric C. for his help in > >>> isolating the > >>> >>>>> DaCapo-h2 > >>> >>>>> performance regression. > >>> >>>>> > >>> >>>>> With the removal of ref_counting and the > >>> ObjectMonitorHandle > >>> >>>>> class, the > >>> >>>>> Async Monitor Deflation project is now closer to > >>> Carsten's original > >>> >>>>> prototype. While ref_counting gave us > >>> ObjectMonitor* safety > >>> >>>>> enforced by > >>> >>>>> code, I saw a ~22.8% slow down with > >>> -XX:-AsyncDeflateIdleMonitors > >>> >>>>> ("off" > >>> >>>>> mode). The slow down with "on" mode > >>> -XX:+AsyncDeflateIdleMonitors > >>> >>>>> is ~17%. > >>> >>>>> > >>> >>>>> I have attached the change list from CR10 to CR11 > >>> instead of > >>> >>>>> putting it in > >>> >>>>> the body of this email. I've also added a link to the > >>> >>>>> CR10-to-CR11-changes > >>> >>>>> file to the webrevs so it should be easy to find. > >>> >>>>> > >>> >>>>> Main bug URL: > >>> >>>>> > >>> >>>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>> > >>> >>>>> The project is currently baselined on jdk-15+21. 
> >>> >>>>> > >>> >>>>> Here's the full webrev URL for those folks that > >>> want to see all of > >>> >>>>> the > >>> >>>>> current Async Monitor Deflation code in one go > >>> (v2.11 full): > >>> >>>>> > >>> >>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- > jdk15%2b21.v2.11.full/ > >>> > >>> >>>>> > >>> >>>>> > >>> >>>>> Some folks might want to see just what has changed > >>> since the last > >>> >>>>> review > >>> >>>>> cycle so here's a webrev for that (v2.11 inc): > >>> >>>>> > >>> >>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- > jdk15%2b21.v2.11.inc/ > >>> > >>> >>>>> > >>> >>>>> > >>> >>>>> Because of the removal of ref_counting and the > >>> ObjectMonitorHandle > >>> >>>>> class, the > >>> >>>>> incremental webrev is a bit noisier than I would > >>> have preferred. > >>> >>>>> > >>> >>>>> > >>> >>>>> The OpenJDK wiki has NOT YET been updated for this > >>> round of changes: > >>> >>>>> > >>> >>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> >>>>> > >>> >>>>> The jdk-15+21 based v2.11 version of the patch has > >>> been thru Mach5 > >>> >>>>> tier[1-6] > >>> >>>>> testing on Oracle's usual set of platforms. Mach5 > >>> tier[78] are > >>> >>>>> still running. > >>> >>>>> I'm running the v2.11 patch through my usual set > >>> of stress testing on > >>> >>>>> Linux-X64 and macOSX. > >>> >>>>> > >>> >>>>> I'm planning to do a SPECjbb2015, DaCapo-h2 and > >>> volano round on the > >>> >>>>> CR11/v2.11/14-for-jdk15 bits. > >>> >>>>> > >>> >>>>> Thanks, in advance, for any questions, comments or > >>> suggestions. > >>> >>>>> > >>> >>>>> Dan > >>> >>>>> > >>> >>>>> > >>> >>>>> On 2/26/20 5:22 PM, Daniel D. Daugherty wrote: > >>> >>>>>> Greetings, > >>> >>>>>> > >>> >>>>>> I have made changes to the Async Monitor > >>> Deflation code in > >>> >>>>>> response to > >>> >>>>>> the CR9/v2.09/12-for-jdk14 code review cycle. > >>> Thanks to Robbin > >>> >>>>>> and Erik O. 
> >>> >>>>>> for their comments in this round! > >>> >>>>>> > >>> >>>>>> With the extraction and push of > >>> {8235931,8236035,8235795} to > >>> >>>>>> JDK15, the > >>> >>>>>> Async Monitor Deflation code is back to "just" > >>> async deflation > >>> >>>>>> changes! > >>> >>>>>> > >>> >>>>>> I have attached the change list from CR9 to CR10 > >>> instead of > >>> >>>>>> putting it in > >>> >>>>>> the body of this email. I've also added a link to > >>> the > >>> >>>>>> CR9-to-CR10-changes > >>> >>>>>> file to the webrevs so it should be easy to find. > >>> >>>>>> > >>> >>>>>> Main bug URL: > >>> >>>>>> > >>> >>>>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>> > >>> >>>>>> The project is currently baselined on jdk-15+11. > >>> >>>>>> > >>> >>>>>> Here's the full webrev URL for those folks that > >>> want to see all > >>> >>>>>> of the > >>> >>>>>> current Async Monitor Deflation code in one go > >>> (v2.10 full): > >>> >>>>>> > >>> >>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- > jdk15+11.v2.10.full/ > >>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> Some folks might want to see just what has > >>> changed since the last > >>> >>>>>> review > >>> >>>>>> cycle so here's a webrev for that (v2.10 inc): > >>> >>>>>> > >>> >>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- > jdk15+11.v2.10.inc/ > >>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> Since we backed out the > >>> HandshakeAfterDeflateIdleMonitors option > >>> >>>>>> and the > >>> >>>>>> C2 ref_count changes and updated the copyright > >>> years, the "inc" > >>> >>>>>> webrev has > >>> >>>>>> a bit more noise in it than usual. Sorry about that! 
> >>> >>>>>>
> >>> >>>>>> The OpenJDK wiki has been updated for this round of changes:
> >>> >>>>>>
> >>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
> >>> >>>>>>
> >>> >>>>>> The jdk-15+11 based v2.10 version of the patch has been thru
> >>> >>>>>> Mach5 tier[1-7] testing on Oracle's usual set of platforms.
> >>> >>>>>> Mach5 tier8 is still running. I'm running the v2.10 patch
> >>> >>>>>> through my usual set of stress testing on Linux-X64 and macOSX.
> >>> >>>>>>
> >>> >>>>>> I'm planning to do a SPECjbb2015 round on the
> >>> >>>>>> CR10/v2.10/13-for-jdk15 bits.
> >>> >>>>>>
> >>> >>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>> >>>>>>
> >>> >>>>>> Dan
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>> On 2/4/20 9:41 AM, Daniel D. Daugherty wrote:
> >>> >>>>>>> Greetings,
> >>> >>>>>>>
> >>> >>>>>>> This project is no longer targeted to JDK14 so this is NOT an
> >>> >>>>>>> urgent code review request.
> >>> >>>>>>>
> >>> >>>>>>> I've extracted the following three fixes from the Async Monitor
> >>> >>>>>>> Deflation project code:
> >>> >>>>>>>
> >>> >>>>>>>     JDK-8235931 add OM_CACHE_LINE_SIZE and use smaller size on
> >>> >>>>>>>                 SPARCv9 and X64
> >>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8235931
> >>> >>>>>>>
> >>> >>>>>>>     JDK-8236035 refactor ObjectMonitor::set_owner() and _owner
> >>> >>>>>>>                 field setting
> >>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8236035
> >>> >>>>>>>
> >>> >>>>>>>     JDK-8235795 replace monitor list
> >>> >>>>>>>                 mux{Acquire,Release}(&gListLock) with spin locks
> >>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
> >>> >>>>>>>
> >>> >>>>>>> Each of these has been reviewed separately and will be pushed to
> >>> >>>>>>> JDK15 in the near future (possibly by the end of this week). Of
> >>> >>>>>>> course, there were improvements during these review cycles and
> >>> >>>>>>> the purpose of this e-mail is to provide updated webrevs for
> >>> >>>>>>> this fix (CR9/v2.09/12-for-jdk14) within the revised context
> >>> >>>>>>> provided by {8235931, 8236035, 8235795}.
> >>> >>>>>>>
> >>> >>>>>>> Main bug URL:
> >>> >>>>>>>
> >>> >>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
> >>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
> >>> >>>>>>>
> >>> >>>>>>> The project is currently baselined on jdk-14+34.
> >>> >>>>>>>
> >>> >>>>>>> Here's the full webrev URL for those folks that want to see all
> >>> >>>>>>> of the current Async Monitor Deflation code along with
> >>> >>>>>>> {8235931, 8236035, 8235795} in one go (v2.09b full):
> >>> >>>>>>>
> >>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.full/
> >>> >>>>>>>
> >>> >>>>>>> Compare the open.patch file in 12-for-jdk14.v2.09.full and
> >>> >>>>>>> 12-for-jdk14.v2.09b.full using your favorite file
> >>> >>>>>>> comparison/merge tool to see how Async Monitor Deflation
> >>> >>>>>>> evolved due to {8235931, 8236035, 8235795}.
> >>> >>>>>>>
> >>> >>>>>>> Some folks might want to see just the Async Monitor Deflation
> >>> >>>>>>> code on top of {8235931, 8236035, 8235795} so here's a webrev
> >>> >>>>>>> for that (v2.09b inc):
> >>> >>>>>>>
> >>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.inc/
> >>> >>>>>>>
> >>> >>>>>>> These webrevs have gone thru several Mach5 Tier[1-8] runs along
> >>> >>>>>>> with my usual stress testing and SPECjbb2015 testing and there
> >>> >>>>>>> aren't any surprises relative to CR9/v2.09/12-for-jdk14.
> >>> >>>>>>> > >>> >>>>>>> Thanks, in advance, for any questions, comments > >>> or suggestions. > >>> >>>>>>> > >>> >>>>>>> Dan > >>> >>>>>>> > >>> >>>>>>> > >>> >>>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote: > >>> >>>>>>>> Greetings, > >>> >>>>>>>> > >>> >>>>>>>> I have made changes to the Async Monitor > >>> Deflation code in > >>> >>>>>>>> response to > >>> >>>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. > >>> Thanks to David > >>> >>>>>>>> H., Robbin > >>> >>>>>>>> and Erik O. for their comments! > >>> >>>>>>>> > >>> >>>>>>>> This project is no longer targeted to JDK14 so > >>> this is NOT an > >>> >>>>>>>> urgent code > >>> >>>>>>>> review request. The primary purpose of this > >>> webrev is simply to > >>> >>>>>>>> close the > >>> >>>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to > >>> let folks see > >>> >>>>>>>> how I resolved > >>> >>>>>>>> the code review comments from that round. > >>> >>>>>>>> > >>> >>>>>>>> Most of the comments in the > >>> CR8/v2.08/11-for-jdk14 code review > >>> >>>>>>>> cycle were > >>> >>>>>>>> on the monitor list changes so I'm going to > >>> take a look at > >>> >>>>>>>> extracting those > >>> >>>>>>>> changes into a standalone patch. Switching from > >>> >>>>>>>> Thread::muxAcquire(&gListLock) > >>> >>>>>>>> and Thread::muxRelease(&gListLock) to finer > >>> grained internal > >>> >>>>>>>> spin locks needs > >>> >>>>>>>> to be thoroughly reviewed and the best way to > >>> do that is > >>> >>>>>>>> separately from the > >>> >>>>>>>> Async Monitor Deflation changes. Thanks to > >>> Coleen for > >>> >>>>>>>> suggesting doing this > >>> >>>>>>>> extraction earlier. > >>> >>>>>>>> > >>> >>>>>>>> I have attached the change list from CR8 to CR9 > >>> instead of > >>> >>>>>>>> putting it in > >>> >>>>>>>> the body of this email. I've also added a link > >>> to the > >>> >>>>>>>> CR8-to-CR9-changes > >>> >>>>>>>> file to the webrevs so it should be easy to find. 
> >>> >>>>>>>> > >>> >>>>>>>> Main bug URL: > >>> >>>>>>>> > >>> >>>>>>>> JDK-8153224 Monitor deflation prolong safepoints > >>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>> > >>> >>>>>>>> The project is currently baselined on jdk-14+26. > >>> >>>>>>>> > >>> >>>>>>>> Here's the full webrev URL for those folks that > >>> want to see all > >>> >>>>>>>> of the > >>> >>>>>>>> current Async Monitor Deflation code in one go > >>> (v2.09 full): > >>> >>>>>>>> > >>> >>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > jdk14.v2.09.full/ > >>> > >>> >>>>>>>> > >>> >>>>>>>> > >>> >>>>>>>> Some folks might want to see just what has > >>> changed since the > >>> >>>>>>>> last review > >>> >>>>>>>> cycle so here's a webrev for that (v2.09 inc): > >>> >>>>>>>> > >>> >>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > jdk14.v2.09.inc/ > >>> > >>> >>>>>>>> > >>> >>>>>>>> > >>> >>>>>>>> The OpenJDK wiki has NOT yet been updated for > >>> this round of > >>> >>>>>>>> changes: > >>> >>>>>>>> > >>> >>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>> > >>> >>>>>>>> > >>> >>>>>>>> The jdk-14+26 based v2.09 version of the patch > >>> has been thru > >>> >>>>>>>> Mach5 tier[1-7] > >>> >>>>>>>> testing on Oracle's usual set of platforms. > >>> Mach5 tier8 is > >>> >>>>>>>> still running. > >>> >>>>>>>> A slightly older version of the v2.09 patch has > >>> also been > >>> >>>>>>>> through my usual > >>> >>>>>>>> set of stress testing on Linux-X64 and macOSX > >>> with the addition > >>> >>>>>>>> of Robbin's > >>> >>>>>>>> "MoCrazy 1024" test running in parallel on > >>> Linux-X64 with the > >>> >>>>>>>> other tests in > >>> >>>>>>>> my lab. The "MoCrazy 1024" has been going for > > >>> 5 days and > >>> >>>>>>>> 6700+ iterations > >>> >>>>>>>> without any failures. 
> >>> >>>>>>>> > >>> >>>>>>>> I'm planning to do a SPECjbb2015 round on the > >>> >>>>>>>> CR9/v2.09/12-for-jdk14 bits. > >>> >>>>>>>> > >>> >>>>>>>> Thanks, in advance, for any questions, comments > >>> or suggestions. > >>> >>>>>>>> > >>> >>>>>>>> Dan > >>> >>>>>>>> > >>> >>>>>>>> > >>> >>>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote: > >>> >>>>>>>>> Greetings, > >>> >>>>>>>>> > >>> >>>>>>>>> I have made changes to the Async Monitor > >>> Deflation code in > >>> >>>>>>>>> response to > >>> >>>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. > >>> Thanks to David > >>> >>>>>>>>> H., Robbin > >>> >>>>>>>>> and Erik O. for their comments! > >>> >>>>>>>>> > >>> >>>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, > >>> 2019 and the > >>> >>>>>>>>> Async Monitor > >>> >>>>>>>>> Deflation project needs to push before Nov. > >>> 12, 2019 in order > >>> >>>>>>>>> to allow > >>> >>>>>>>>> for sufficient bake time for such a big > >>> change. Nov. 12 is > >>> >>>>>>>>> _next_ Tuesday > >>> >>>>>>>>> so we have 8 days from today to finish this > >>> code review cycle > >>> >>>>>>>>> and push > >>> >>>>>>>>> this code for JDK14. > >>> >>>>>>>>> > >>> >>>>>>>>> Carsten and Roman! Time for you guys to chime > >>> in again on the > >>> >>>>>>>>> code reviews. > >>> >>>>>>>>> > >>> >>>>>>>>> I have attached the change list from CR7 to > >>> CR8 instead of > >>> >>>>>>>>> putting it in > >>> >>>>>>>>> the body of this email. I've also added a link > >>> to the > >>> >>>>>>>>> CR7-to-CR8-changes > >>> >>>>>>>>> file to the webrevs so it should be easy to find. > >>> >>>>>>>>> > >>> >>>>>>>>> Main bug URL: > >>> >>>>>>>>> > >>> >>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints > >>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>> > >>> >>>>>>>>> The project is currently baselined on jdk-14+21. 
> >>> >>>>>>>>> > >>> >>>>>>>>> Here's the full webrev URL for those folks > >>> that want to see > >>> >>>>>>>>> all of the > >>> >>>>>>>>> current Async Monitor Deflation code in one go > >>> (v2.08 full): > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- > jdk14.v2.08.full > >>> > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> >>>>>>>>> Some folks might want to see just what has > >>> changed since the > >>> >>>>>>>>> last review > >>> >>>>>>>>> cycle so here's a webrev for that (v2.08 inc): > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- > jdk14.v2.08.inc/ > >>> > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> >>>>>>>>> The OpenJDK wiki did not need any changes for > >>> this round: > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> >>>>>>>>> The jdk-14+21 based v2.08 version of the patch > >>> has been thru > >>> >>>>>>>>> Mach5 tier[1-8] > >>> >>>>>>>>> testing on Oracle's usual set of platforms. It > >>> has also been > >>> >>>>>>>>> through my usual > >>> >>>>>>>>> set of stress testing on Linux-X64, macOSX and > >>> Solaris-X64 > >>> >>>>>>>>> with the addition > >>> >>>>>>>>> of Robbin's "MoCrazy 1024" test running in > >>> parallel with the > >>> >>>>>>>>> other tests in > >>> >>>>>>>>> my lab. Some testing is still running, but so > >>> far there are no > >>> >>>>>>>>> new regressions. > >>> >>>>>>>>> > >>> >>>>>>>>> I have not yet done a SPECjbb2015 round on the > >>> >>>>>>>>> CR8/v2.08/11-for-jdk14 bits. > >>> >>>>>>>>> > >>> >>>>>>>>> Thanks, in advance, for any questions, > >>> comments or suggestions. > >>> >>>>>>>>> > >>> >>>>>>>>> Dan > >>> >>>>>>>>> > >>> >>>>>>>>> > >>> >>>>>>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote: > >>> >>>>>>>>>> Greetings, > >>> >>>>>>>>>> > >>> >>>>>>>>>> The Async Monitor Deflation project is > >>> reaching the end game. 
> >>> >>>>>>>>>> I have no > >>> >>>>>>>>>> changes planned for the project at this time > >>> so all that is > >>> >>>>>>>>>> left is code > >>> >>>>>>>>>> review and any changes that results from > >>> those reviews. > >>> >>>>>>>>>> > >>> >>>>>>>>>> Carsten and Roman! Time for you guys to chime > >>> in again on the > >>> >>>>>>>>>> code reviews. > >>> >>>>>>>>>> > >>> >>>>>>>>>> I have attached the list of fixes from CR6 to > >>> CR7 instead of > >>> >>>>>>>>>> putting it > >>> >>>>>>>>>> in the main body of this email. > >>> >>>>>>>>>> > >>> >>>>>>>>>> Main bug URL: > >>> >>>>>>>>>> > >>> >>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints > >>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>> > >>> >>>>>>>>>> The project is currently baselined on jdk-14+19. > >>> >>>>>>>>>> > >>> >>>>>>>>>> Here's the full webrev URL for those folks > >>> that want to see > >>> >>>>>>>>>> all of the > >>> >>>>>>>>>> current Async Monitor Deflation code in one > >>> go (v2.07 full): > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- > jdk14.v2.07.full > >>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> Some folks might want to see just what has > >>> changed since the > >>> >>>>>>>>>> last review > >>> >>>>>>>>>> cycle so here's a webrev for that (v2.07 inc): > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- > jdk14.v2.07.inc/ > >>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> The OpenJDK wiki has been updated to match the > >>> >>>>>>>>>> CR7/v2.07/10-for-jdk14 changes: > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> > >>> >>>>>>>>>> The jdk-14+18 based v2.07 version of the > >>> patch has been thru > >>> >>>>>>>>>> Mach5 tier[1-8] > >>> >>>>>>>>>> testing on Oracle's usual set of platforms. 
> >>> >>>>>>>>>> It has also been through my usual set of stress testing on
> >>> >>>>>>>>>> Linux-X64, macOSX and Solaris-X64 with the addition of Robbin's
> >>> >>>>>>>>>> "MoCrazy 1024" test running in parallel with the other tests in
> >>> >>>>>>>>>> my lab.
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> The jdk-14+19 based v2.07 version of the patch has been thru
> >>> >>>>>>>>>> Mach5 tier[1-3] testing on Oracle's usual set of platforms.
> >>> >>>>>>>>>> Mach5 tier[4-8] are in process.
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora
> >>> >>>>>>>>>> Performance lab using their tuned SPECjbb2015 Linux-X64 G1
> >>> >>>>>>>>>> configs:
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> - "base" is jdk-14+18
> >>> >>>>>>>>>> - "v2.07" is the latest version and includes C2
> >>> >>>>>>>>>>   inc_om_ref_count() support on LP64 X64 and the new
> >>> >>>>>>>>>>   HandshakeAfterDeflateIdleMonitors option
> >>> >>>>>>>>>> - "off" is with -XX:-AsyncDeflateIdleMonitors specified
> >>> >>>>>>>>>> - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors
> >>> >>>>>>>>>>   specified
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>      hbIR          hbIR
> >>> >>>>>>>>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime  config
> >>> >>>>>>>>>> ---------------  ---------  --------  -------------  -------  ---------------
> >>> >>>>>>>>>>        34282.00   30635.90  28831.30       20969.20  3841.30  base
> >>> >>>>>>>>>>        34282.00   30973.00  29345.80       21025.20  3964.10  v2.07
> >>> >>>>>>>>>>        34282.00   31105.60  29174.30       21074.00  3931.30  v2.07_handshake
> >>> >>>>>>>>>>        34282.00   30789.70  27151.60       19839.10  3850.20  v2.07_off
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> - The Aurora Perf comparison tool reports:
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>     Comparison              max-jOPS              critical-jOPS
> >>> >>>>>>>>>>     ----------------------  --------------------  --------------------
> >>> >>>>>>>>>>     base vs 2.07            +1.78% (s, p=0.000)   +0.27% (ns, p=0.790)
> >>> >>>>>>>>>>     base vs 2.07_handshake  +1.19% (s, p=0.007)   +0.58% (ns, p=0.536)
> >>> >>>>>>>>>>     base vs 2.07_off        -5.83% (ns, p=0.394)  -5.39% (ns, p=0.347)
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>     (s) - significant  (ns) - not-significant
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> - For historical comparison, the Aurora Perf comparison tool
> >>> >>>>>>>>>>   reported for v2.06 with a baseline of jdk-13+31:
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>     Comparison              max-jOPS              critical-jOPS
> >>> >>>>>>>>>>     ----------------------  --------------------  --------------------
> >>> >>>>>>>>>>     base vs 2.06            -0.32% (ns, p=0.345)  +0.71% (ns, p=0.646)
> >>> >>>>>>>>>>     base vs 2.06_off        +0.49% (ns, p=0.292)  -1.21% (ns, p=0.481)
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>     (s) - significant  (ns) - not-significant
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> Dan
> >>> >>>>>>>>>>
> >>> >>>>>>>>>>
> >>> >>>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
> >>> >>>>>>>>>>> Greetings,
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> The Async Monitor Deflation project has rebased to JDK14 so
> >>> >>>>>>>>>>> it's time for our first code review in that new context!!
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> I've been focused on changing the monitor list management code
> >>> >>>>>>>>>>> to be lock-free in order to make SPECjbb2015 happier.
Of course > >>> >>>>>>>>>>> with a change > >>> >>>>>>>>>>> like that, it takes a while to chase down > >>> all the new and > >>> >>>>>>>>>>> wonderful > >>> >>>>>>>>>>> races. At this point, I have the code back > >>> to the same > >>> >>>>>>>>>>> stability that > >>> >>>>>>>>>>> I had with CR5/v2.05/8-for-jdk13. > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> To lay the ground work for this round of > >>> review, I pushed > >>> >>>>>>>>>>> the following > >>> >>>>>>>>>>> two fixes to jdk/jdk earlier today: > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> ??? JDK-8230184 rename, whitespace, indent > >>> and comments > >>> >>>>>>>>>>> changes in preparation > >>> >>>>>>>>>>> ? ? ??????????? for lock free Monitor lists > >>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > 8230184 > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> ??? JDK-8230317 > >>> serviceability/sa/ClhsdbPrintStatics.java > >>> >>>>>>>>>>> fails after 8230184 > >>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > 8230317 > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> I have attached the list of fixes from CR5 > >>> to CR6 instead of > >>> >>>>>>>>>>> putting > >>> >>>>>>>>>>> in the main body of this email. > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > 8153224 > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> The project is currently baselined on > >>> jdk-14+11 plus the > >>> >>>>>>>>>>> fixes for > >>> >>>>>>>>>>> JDK-8230184 and JDK-8230317. 
> >>> >>>>>>>>>>> > >>> >>>>>>>>>>> Here's the full webrev URL for those folks > >>> that want to see > >>> >>>>>>>>>>> all of the > >>> >>>>>>>>>>> current Async Monitor Deflation code in one > >>> go (v2.06 full): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.06.full/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> The primary focus of this review cycle is on > >>> the lock-free > >>> >>>>>>>>>>> Monitor List > >>> >>>>>>>>>>> management changes so here's a webrev for > >>> just that patch > >>> >>>>>>>>>>> (v2.06c): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.06c.inc/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> The secondary focus of this review cycle is > >>> on the bug fixes > >>> >>>>>>>>>>> that have > >>> >>>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so > >>> here's a webrev for > >>> >>>>>>>>>>> just that > >>> >>>>>>>>>>> patch (v2.06b): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.06b.inc/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> The third and final bucket for this review > >>> cycle is the > >>> >>>>>>>>>>> rename, whitespace, > >>> >>>>>>>>>>> indent and comments changes made in > >>> preparation for lock > >>> >>>>>>>>>>> free Monitor list > >>> >>>>>>>>>>> management. Almost all of that was extracted > >>> into > >>> >>>>>>>>>>> JDK-8230184 for the > >>> >>>>>>>>>>> baseline so this bucket now has just a few > >>> comment changes > >>> >>>>>>>>>>> relative to > >>> >>>>>>>>>>> CR5/v2.05/8-for-jdk13. 
Here's a webrev for > >>> the remainder > >>> >>>>>>>>>>> (v2.06a): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.06a.inc/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> Some folks might want to see just what has > >>> changed since the > >>> >>>>>>>>>>> last review > >>> >>>>>>>>>>> cycle so here's a webrev for that (v2.06 inc): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.06.inc/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> Last, but not least, some folks might want > >>> to see the code > >>> >>>>>>>>>>> before the > >>> >>>>>>>>>>> addition of lock-free Monitor List > >>> management so here's a > >>> >>>>>>>>>>> webrev for > >>> >>>>>>>>>>> that (v2.00 -> v2.05): > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > jdk14.v2.05.inc/ > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> The OpenJDK wiki will need minor updates to > >>> match the CR6 > >>> >>>>>>>>>>> changes: > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> but that should only be changes to describe > >>> per-thread list > >>> >>>>>>>>>>> async monitor > >>> >>>>>>>>>>> deflation being done by the ServiceThread. > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> (I did update the OpenJDK wiki for the CR5 > >>> changes back on > >>> >>>>>>>>>>> 2019.08.14) > >>> >>>>>>>>>>> > >>> >>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-8] > >>> >>>>>>>>>>> testing on > >>> >>>>>>>>>>> Oracle's usual set of platforms. It has also > >>> been through my > >>> >>>>>>>>>>> usual set > >>> >>>>>>>>>>> of stress testing on Linux-X64, macOSX and > >>> Solaris-X64. 
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora Performance lab
> >>> >>>>>>>>>>> using their tuned SPECjbb2015 Linux-X64 G1 configs. This was using
> >>> >>>>>>>>>>> this patch baselined on jdk-13+31 (for stability):
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>>      hbIR          hbIR
> >>> >>>>>>>>>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime  config
> >>> >>>>>>>>>>> ---------------  ---------  --------  -------------  -------  -------------
> >>> >>>>>>>>>>>        34282.00   28837.20  27905.20       19817.40  3658.10  base
> >>> >>>>>>>>>>>        34965.70   29798.80  27814.90       19959.00  3514.60  v2.06d
> >>> >>>>>>>>>>>        34282.00   29100.70  28042.50       19577.00  3701.90  v2.06d_off
> >>> >>>>>>>>>>>        34282.00   29218.50  27562.80       19397.30  3657.60  v2.06d_ocache
> >>> >>>>>>>>>>>        34965.70   29838.30  26512.40       19170.60  3569.90  v2.05
> >>> >>>>>>>>>>>        34282.00   28926.10  27734.00       19835.10  3588.40  v2.05_off
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors specified and
> >>> >>>>>>>>>>> the "ocache" config is with 128 byte cache line sizes instead of 64 byte
> >>> >>>>>>>>>>> cache line sizes. "v2.06d" is the last set of changes that I made before
> >>> >>>>>>>>>>> those changes were distributed into the "v2.06a", "v2.06b" and "v2.06c"
> >>> >>>>>>>>>>> buckets for this review cycle.
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> Dan
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>>
> >>> >>>>>>>>>>> On 7/11/19 3:49 PM, Daniel D.
Daugherty wrote: > >>> >>>>>>>>>>>> Greetings, > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> I've been focused on chasing down and > >>> fixing the rare test > >>> >>>>>>>>>>>> failures > >>> >>>>>>>>>>>> that only pop up rarely. So this round is > >>> primarily fixes > >>> >>>>>>>>>>>> for races > >>> >>>>>>>>>>>> with a few additional fixes that came from > >>> Karen's review > >>> >>>>>>>>>>>> of CR4. > >>> >>>>>>>>>>>> Thanks Karen! > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> I have attached the list of fixes from CR4 > >>> to CR5 instead > >>> >>>>>>>>>>>> of putting > >>> >>>>>>>>>>>> in the main body of this email. > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> The project is currently baselined on > >>> jdk-13+29. This will > >>> >>>>>>>>>>>> likely be > >>> >>>>>>>>>>>> the last JDK13 baseline for this project > >>> and I'll roll to > >>> >>>>>>>>>>>> the JDK14 > >>> >>>>>>>>>>>> (jdk/jdk) repo soon... 
> >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> Here's the full webrev URL: > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- > jdk13.full/ > >>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> Here's the incremental webrev URL: > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- > jdk13.inc/ > >>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> I have not yet checked the OpenJDK wiki to > >>> see if it needs > >>> >>>>>>>>>>>> any updates > >>> >>>>>>>>>>>> to match the CR5 changes: > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 > >>> changes back on > >>> >>>>>>>>>>>> 2019.06.26) > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-3] > >>> >>>>>>>>>>>> testing on > >>> >>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>> tier[4-6] is running > >>> >>>>>>>>>>>> now and > >>> >>>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off > >>> the usual stress > >>> >>>>>>>>>>>> testing > >>> >>>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as > >>> those machines > >>> >>>>>>>>>>>> become available. > >>> >>>>>>>>>>>> Since I haven't made any performance > >>> changes in this round, > >>> >>>>>>>>>>>> I'll only > >>> >>>>>>>>>>>> be running SPECjbb2015 to gather the latest > >>> >>>>>>>>>>>> monitorinflation logs. > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> Next up: > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>>> - We're still seeing 4-5% lower performance > >>> with > >>> >>>>>>>>>>>> SPECjbb2015 on > >>> >>>>>>>>>>>> ? Linux-X64 and we've determined that some > >>> of that comes from > >>> >>>>>>>>>>>> ? contention on the gListLock. So I'm going > >>> to investigate > >>> >>>>>>>>>>>> removing > >>> >>>>>>>>>>>> ? the gListLock. 
> >>> >>>>>>>>>>>>   Yes, another lock-free set of changes is coming!
> >>> >>>>>>>>>>>> - Of course, going lock-free often causes new races and new failures,
> >>> >>>>>>>>>>>>   so that's a good reason to isolate those changes in their
> >>> >>>>>>>>>>>>   own round (and not hold up CR5/v2.05/8-for-jdk13 anymore).
> >>> >>>>>>>>>>>> - I finally have a potential fix for the Win* failure with
> >>> >>>>>>>>>>>>   gc/g1/humongousObjects/TestHumongousClassLoader.java
> >>> >>>>>>>>>>>>   but I haven't run it through Mach5 yet so it'll be in the next round.
> >>> >>>>>>>>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm seeing some
> >>> >>>>>>>>>>>>   monitor related failures there. I suspect that I need to go take a
> >>> >>>>>>>>>>>>   look at the C2 RTM macro assembler code and look for things that might
> >>> >>>>>>>>>>>>   conflict with Async Monitor Deflation. If you're interested in that kind
> >>> >>>>>>>>>>>>   of issue, then see the macroAssembler_x86.cpp sanity check that I
> >>> >>>>>>>>>>>>   added in this round!
> >>> >>>>>>>>>>>>
> >>> >>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>> >>>>>>>>>>>>
> >>> >>>>>>>>>>>> Dan
> >>> >>>>>>>>>>>>
> >>> >>>>>>>>>>>>
> >>> >>>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
> >>> >>>>>>>>>>>>> Greetings,
> >>> >>>>>>>>>>>>>
> >>> >>>>>>>>>>>>> I have a fix for an issue that came up during performance testing.
> >>> >>>>>>>>>>>>> Many thanks to Robbin for diagnosing the issue in his SPECjbb2015
> >>> >>>>>>>>>>>>> experiments.
> >>> >>>>>>>>>>>>>
> >>> >>>>>>>>>>>>> Here's the list of changes from CR3 to CR4.
The list is a bit > >>> >>>>>>>>>>>>> verbose due to the complexity of the > >>> issue, but the changes > >>> >>>>>>>>>>>>> themselves are not that big. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Functional: > >>> >>>>>>>>>>>>> ? - Change > >>> SafepointSynchronize::is_cleanup_needed() from > >>> >>>>>>>>>>>>> calling > >>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() to > >>> calling > >>> >>>>>>>>>>>>> > >>> ObjectSynchronizer::is_safepoint_deflation_needed(): > >>> >>>>>>>>>>>>> ??? - is_safepoint_deflation_needed() > >>> returns the result of > >>> >>>>>>>>>>>>> monitors_used_above_threshold() for > >>> safepoint based > >>> >>>>>>>>>>>>> ????? monitor deflation > >>> (!AsyncDeflateIdleMonitors). > >>> >>>>>>>>>>>>> ??? - For AsyncDeflateIdleMonitors, it > >>> only returns true if > >>> >>>>>>>>>>>>> ????? there is a special deflation > >>> request, e.g., System.gc() > >>> >>>>>>>>>>>>> ????? - This solves a bug where there are > >>> a bunch of Cleanup > >>> >>>>>>>>>>>>> ??????? safepoints that simply request > >>> async deflation which > >>> >>>>>>>>>>>>> ??????? keeps the async JavaThreads from > >>> making progress on > >>> >>>>>>>>>>>>> ??????? their async deflation work. > >>> >>>>>>>>>>>>> ? - Add AsyncDeflationInterval diagnostic > >>> option. > >>> >>>>>>>>>>>>> Description: > >>> >>>>>>>>>>>>> ????? Async deflate idle monitors every so > >>> many > >>> >>>>>>>>>>>>> milliseconds when > >>> >>>>>>>>>>>>> MonitorUsedDeflationThreshold is exceeded > >>> (0 is off). > >>> >>>>>>>>>>>>> ? - Replace > >>> >>>>>>>>>>>>> > >>> ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with > >>> >>>>>>>>>>>>> > >>> ObjectSynchronizer::is_async_deflation_needed(): > >>> >>>>>>>>>>>>> ??? - is_async_deflation_needed() returns > >>> true when > >>> >>>>>>>>>>>>> is_async_cleanup_requested() is true or when > >>> >>>>>>>>>>>>> monitors_used_above_threshold() is true > >>> (but no more > >>> >>>>>>>>>>>>> often than > >>> >>>>>>>>>>>>> AsyncDeflationInterval). 
> >>> >>>>>>>>>>>>> ??? - if AsyncDeflateIdleMonitors > >>> Service_lock->wait() now > >>> >>>>>>>>>>>>> waits for > >>> >>>>>>>>>>>>> ????? at most GuaranteedSafepointInterval > >>> millis: > >>> >>>>>>>>>>>>> ????? - This allows > >>> is_async_deflation_needed() to be > >>> >>>>>>>>>>>>> checked at > >>> >>>>>>>>>>>>> ??????? the same interval as > >>> GuaranteedSafepointInterval. > >>> >>>>>>>>>>>>> ??????? (default is 1000 millis/1 second) > >>> >>>>>>>>>>>>> ????? - Once is_async_deflation_needed() > >>> has returned > >>> >>>>>>>>>>>>> true, it > >>> >>>>>>>>>>>>> ??????? generally cannot return true for > >>> >>>>>>>>>>>>> AsyncDeflationInterval. > >>> >>>>>>>>>>>>> ??????? This is to prevent async deflation > >>> from swamping the > >>> >>>>>>>>>>>>> ServiceThread. > >>> >>>>>>>>>>>>> ? - The ServiceThread still handles async > >>> deflation of the > >>> >>>>>>>>>>>>> global > >>> >>>>>>>>>>>>> ??? in-use list and now it also marks > >>> JavaThreads for > >>> >>>>>>>>>>>>> async deflation > >>> >>>>>>>>>>>>> ??? of their in-use lists. > >>> >>>>>>>>>>>>> ??? - The ServiceThread will check for > >>> async deflation > >>> >>>>>>>>>>>>> work every > >>> >>>>>>>>>>>>> GuaranteedSafepointInterval. > >>> >>>>>>>>>>>>> ??? - A safepoint can still cause the > >>> ServiceThread to > >>> >>>>>>>>>>>>> check for > >>> >>>>>>>>>>>>> ????? async deflation work via > >>> is_async_deflation_requested. > >>> >>>>>>>>>>>>> ? - Refactor code from > >>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() into > >>> >>>>>>>>>>>>> monitors_used_above_threshold() and remove > >>> >>>>>>>>>>>>> is_cleanup_needed(). > >>> >>>>>>>>>>>>> ? - In addition to System.gc(), the > >>> VM_Exit VM op and the > >>> >>>>>>>>>>>>> final > >>> >>>>>>>>>>>>> ??? VMThread safepoint now set the > >>> >>>>>>>>>>>>> is_special_deflation_requested > >>> >>>>>>>>>>>>> ??? 
flag to reduce the in-use monitor > >>> population that is > >>> >>>>>>>>>>>>> reported by > >>> >>>>>>>>>>>>> > >>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Test update: > >>> >>>>>>>>>>>>> ? - > >>> test/hotspot/gtest/oops/test_markOop.cpp is updated to > >>> >>>>>>>>>>>>> work with > >>> >>>>>>>>>>>>> AsyncDeflateIdleMonitors. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Collateral: > >>> >>>>>>>>>>>>> ? - Add/clarify/update some logging messages. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Cleanup: > >>> >>>>>>>>>>>>> ? - Updated comments based on Karen's code > >>> review. > >>> >>>>>>>>>>>>> ? - Change 'special cleanup' -> 'special > >>> deflation' and > >>> >>>>>>>>>>>>> ??? 'async cleanup' -> 'async deflation'. > >>> >>>>>>>>>>>>> ??? - comment and function name changes > >>> >>>>>>>>>>>>> ? - Clarify MonitorUsedDeflationThreshold > >>> description; > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> The project is currently baselined on > >>> jdk-13+22. 
> >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Here's the full webrev URL: > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- > jdk13.full/ > >>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Here's the incremental webrev URL: > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- > jdk13.inc/ > >>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> I have not updated the OpenJDK wiki to > >>> reflect the CR4 > >>> >>>>>>>>>>>>> changes: > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> The wiki doesn't say a whole lot about the > >>> async deflation > >>> >>>>>>>>>>>>> invocation > >>> >>>>>>>>>>>>> mechanism so I have to figure out how to > >>> add that content. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-8] > >>> >>>>>>>>>>>>> testing on > >>> >>>>>>>>>>>>> Oracle's usual set of platforms. My > >>> Solaris-X64 stress kit > >>> >>>>>>>>>>>>> run is > >>> >>>>>>>>>>>>> running now. Kitchensink8H on product, > >>> fastdebug, and > >>> >>>>>>>>>>>>> slowdebug bits > >>> >>>>>>>>>>>>> are running on Linux-X64, MacOSX and > >>> Solaris-X64. I still > >>> >>>>>>>>>>>>> have to run > >>> >>>>>>>>>>>>> my stress kit on Linux-X64. I still have > >>> to run the > >>> >>>>>>>>>>>>> SPECjbb2015 > >>> >>>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX > >>> and Solaris-X64. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Thanks, in advance, for any questions, > >>> comments or > >>> >>>>>>>>>>>>> suggestions. > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> Dan > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. 
Daugherty wrote: > >>> >>>>>>>>>>>>>> Greetings, > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> I had some discussions with Karen about a > >>> race that was > >>> >>>>>>>>>>>>>> in the > >>> >>>>>>>>>>>>>> ObjectMonitor::enter() code in > >>> CR2/v2.02/5-for-jdk13. > >>> >>>>>>>>>>>>>> This race was > >>> >>>>>>>>>>>>>> theoretical and I had no test failures > >>> due to it. The fix > >>> >>>>>>>>>>>>>> is pretty > >>> >>>>>>>>>>>>>> simple: remove the special case code for > >>> async deflation > >>> >>>>>>>>>>>>>> in the > >>> >>>>>>>>>>>>>> ObjectMonitor::enter() function and rely > >>> solely on the > >>> >>>>>>>>>>>>>> ref_count > >>> >>>>>>>>>>>>>> for ObjectMonitor::enter() protection. > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> During those discussions Karen also > >>> floated the idea of > >>> >>>>>>>>>>>>>> using the > >>> >>>>>>>>>>>>>> ref_count field instead of the > >>> contentions field for the > >>> >>>>>>>>>>>>>> Async > >>> >>>>>>>>>>>>>> Monitor Deflation protocol. I decided to > >>> go ahead and > >>> >>>>>>>>>>>>>> code up that > >>> >>>>>>>>>>>>>> change and I have run it through the > >>> usual stress and > >>> >>>>>>>>>>>>>> Mach5 testing > >>> >>>>>>>>>>>>>> with no issues. It's also known as v2.03 > >>> (for those for > >>> >>>>>>>>>>>>>> with the > >>> >>>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for > >>> those with webrev > >>> >>>>>>>>>>>>>> URLs). > >>> >>>>>>>>>>>>>> Sorry for all the names... > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> The project is currently baselined on > >>> jdk-13+18. 
> >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> Here's the full webrev URL: > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- > jdk13.full/ > >>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> Here's the incremental webrev URL: > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- > jdk13.inc/ > >>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> I have also updated the OpenJDK wiki to > >>> reflect the CR3 > >>> >>>>>>>>>>>>>> changes: > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-8] > >>> >>>>>>>>>>>>>> testing on > >>> >>>>>>>>>>>>>> Oracle's usual set of platforms. My > >>> Solaris-X64 stress > >>> >>>>>>>>>>>>>> kit run had > >>> >>>>>>>>>>>>>> no issues. Kitchensink8H on product, > >>> fastdebug, and > >>> >>>>>>>>>>>>>> slowdebug bits > >>> >>>>>>>>>>>>>> had no failures on Linux-X64; MacOSX > >>> fastdebug and > >>> >>>>>>>>>>>>>> slowdebug and > >>> >>>>>>>>>>>>>> Solaris-X64 release had the usual "Too > >>> large time diff" > >>> >>>>>>>>>>>>>> complaints. > >>> >>>>>>>>>>>>>> 12 hour Inflate2 runs on product, > >>> fastdebug and slowdebug > >>> >>>>>>>>>>>>>> bits on > >>> >>>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no > >>> failures. My > >>> >>>>>>>>>>>>>> Linux-X64 > >>> >>>>>>>>>>>>>> stress kit is running right now. > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> I've done the SPECjbb2015 baseline and > >>> CR3 runs. I need > >>> >>>>>>>>>>>>>> to gather > >>> >>>>>>>>>>>>>> the results and analyze them. > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>> comments or > >>> >>>>>>>>>>>>>> suggestions. 
> >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> Dan > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty > >>> wrote: > >>> >>>>>>>>>>>>>>> Greetings, > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> I have a small but important bug fix for > >>> the Async > >>> >>>>>>>>>>>>>>> Monitor Deflation > >>> >>>>>>>>>>>>>>> project ready to go. It's also known as > >>> v2.02 (for those > >>> >>>>>>>>>>>>>>> for with the > >>> >>>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for > >>> those with > >>> >>>>>>>>>>>>>>> webrev URLs). Sorry > >>> >>>>>>>>>>>>>>> for all the names... > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two > >>> days ago so that > >>> >>>>>>>>>>>>>>> baseline patch > >>> >>>>>>>>>>>>>>> is out of our hair. > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> The project is currently baselined on > >>> jdk-13+17. 
> >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> Here's the full webrev URL: > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- > jdk13.full/ > >>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> Here's the incremental webrev URL > >>> (JDK-8153224): > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- > jdk13.inc/ > >>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> I still have to update the OpenJDK wiki > >>> to reflect the > >>> >>>>>>>>>>>>>>> CR2 changes: > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-6] > >>> >>>>>>>>>>>>>>> testing on > >>> >>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>> tier[7-8] is > >>> >>>>>>>>>>>>>>> running now. > >>> >>>>>>>>>>>>>>> My stress kit is running on Solaris-X64 > >>> now. > >>> >>>>>>>>>>>>>>> Kitchensink8H is running > >>> >>>>>>>>>>>>>>> now on product, fastdebug, and slowdebug > >>> bits on > >>> >>>>>>>>>>>>>>> Linux-X64, MacOSX > >>> >>>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs > >>> are running now > >>> >>>>>>>>>>>>>>> on product, > >>> >>>>>>>>>>>>>>> fastdebug and slowdebug bits on > >>> Linux-X64, MacOSX and > >>> >>>>>>>>>>>>>>> Solaris-X64. > >>> >>>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 > >>> sometime on > >>> >>>>>>>>>>>>>>> Sunday (after > >>> >>>>>>>>>>>>>>> my jdk-13+18 stress run is done). > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 > >>> runs after all the > >>> >>>>>>>>>>>>>>> stress > >>> >>>>>>>>>>>>>>> testing is done. > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>> comments or > >>> >>>>>>>>>>>>>>> suggestions. 
> >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> Dan > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty > >>> wrote: > >>> >>>>>>>>>>>>>>>> Greetings, > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> I finally have CR1 for the Async > >>> Monitor Deflation > >>> >>>>>>>>>>>>>>>> project ready to > >>> >>>>>>>>>>>>>>>> go. It's also known as v2.01 (for those > >>> for with the > >>> >>>>>>>>>>>>>>>> patches) and as > >>> >>>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with > >>> webrev URLs). Sorry > >>> >>>>>>>>>>>>>>>> for all the > >>> >>>>>>>>>>>>>>>> names... > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Main bug URL: > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Baseline bug fixes URL: > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> JDK-8222295 more baseline cleanups from > >>> Async > >>> >>>>>>>>>>>>>>>> Monitor Deflation project > >>> >>>>>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8222295 > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> The project is currently baselined on > >>> jdk-13+15. 
> >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Here's the webrev for the latest > >>> baseline changes > >>> >>>>>>>>>>>>>>>> (JDK-8222295): > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > jdk13.8222295 > >>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 > >>> only): > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > jdk13.full/ > >>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Here's the incremental webrev URL > >>> (JDK-8153224): > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > jdk13.inc/ > >>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> So I'm looking for reviews for both > >>> JDK-8222295 and the > >>> >>>>>>>>>>>>>>>> latest version > >>> >>>>>>>>>>>>>>>> of JDK-8153224... > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> I still have to update the OpenJDK wiki > >>> to reflect the > >>> >>>>>>>>>>>>>>>> CR changes: > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> This version of the patch has been thru > >>> Mach5 tier[1-3] > >>> >>>>>>>>>>>>>>>> testing on > >>> >>>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>> tier[4-6] is > >>> >>>>>>>>>>>>>>>> running now and > >>> >>>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. > >>> My stress kit > >>> >>>>>>>>>>>>>>>> on Solaris-X64 > >>> >>>>>>>>>>>>>>>> is running now. Linux-X64 stress > >>> testing will start on > >>> >>>>>>>>>>>>>>>> Sunday. 
I'm > >>> >>>>>>>>>>>>>>>> planning to do Kitchensink runs, > >>> SPECjbb2015 runs and > >>> >>>>>>>>>>>>>>>> my monitor > >>> >>>>>>>>>>>>>>>> inflation stress tests on Linux-X64, > >>> MacOSX and > >>> >>>>>>>>>>>>>>>> Solaris-X64. > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>> comments or > >>> >>>>>>>>>>>>>>>> suggestions. > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> Dan > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty > >>> wrote: > >>> >>>>>>>>>>>>>>>>> Greetings, > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread > >>> for my port of > >>> >>>>>>>>>>>>>>>>> Carsten's work on: > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>> safepoints > >>> >>>>>>>>>>>>>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that > >>> describes my port: > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> Here's the webrev URL: > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for- > jdk13/ > >>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> Here's a link to Carsten's original > >>> webrev: > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > >>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>>> Earlier versions of this patch have > >>> been through > >>> >>>>>>>>>>>>>>>>> several rounds of > >>> >>>>>>>>>>>>>>>>> preliminary review. Many thanks to > >>> Carsten, Coleen, > >>> >>>>>>>>>>>>>>>>> Robbin, and > >>> >>>>>>>>>>>>>>>>> Roman for their preliminary code > >>> review comments. 
> >>> >>>>>>>>>>>>>>>>> A very special thanks to Robbin and Roman for building and testing
> >>> >>>>>>>>>>>>>>>>> the patch in their own environments (including SPECjbb2015).
> >>> >>>>>>>>>>>>>>>>>
> >>> >>>>>>>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
> >>> >>>>>>>>>>>>>>>>> Oracle's usual set of platforms. Earlier versions have been run
> >>> >>>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
> >>> >>>>>>>>>>>>>>>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
> >>> >>>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
> >>> >>>>>>>>>>>>>>>>> and slowdebug). Earlier versions have run my monitor inflation stress
> >>> >>>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
> >>> >>>>>>>>>>>>>>>>> fastdebug and slowdebug).
> >>> >>>>>>>>>>>>>>>>>
> >>> >>>>>>>>>>>>>>>>> All of the testing done on earlier versions will be redone on the
> >>> >>>>>>>>>>>>>>>>> latest version of the patch.
> >>> >>>>>>>>>>>>>>>>>
> >>> >>>>>>>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>> >>>>>>>>>>>>>>>>>
> >>> >>>>>>>>>>>>>>>>> Dan
> >>> >>>>>>>>>>>>>>>>>
> >>> >>>>>>>>>>>>>>>>> P.S.
> >>> >>>>>>>>>>>>>>>>> One subtest in
> >>> >>>>>>>>>>>>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java
> >>> >>>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying
> >>> >>>>>>>>>>>>>>>>> to characterize/analyze this failure for more than a week now.
At > >>> >>>>>>>>>>>>>>>>> this point I'm convinced that Async > >>> Monitor Deflation > >>> >>>>>>>>>>>>>>>>> is aggravating > >>> >>>>>>>>>>>>>>>>> an existing bug. However, I plan to > >>> have a better > >>> >>>>>>>>>>>>>>>>> handle on that > >>> >>>>>>>>>>>>>>>>> failure before these bits are pushed > >>> to the jdk/jdk repo. > >>> >>>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>>> > >>> >>>>>>>>>>>> > >>> >>>>>>>>>>> > >>> >>>>>>>>>> > >>> >>>>>>>>> > >>> >>>>>>>> > >>> >>>>>>> > >>> >>>>>> > >>> >>>>> > >>> >>>> > >>> >>> > >>> >> > >>> > >> > > From daniel.daugherty at oracle.com Tue Sep 15 14:59:14 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 15 Sep 2020 10:59:14 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15) In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <2a8976f7-37e0-03b9-3099-e07464e46512@oracle.com> <5681d640-08c8-3433-0f85-3f23eea69e87@oracle.com> <029b596d-46e5-9fa7-38fd-c34d3a32987b@oracle.com> <54fd4f9c-afef-0819-de2d-b81b25fa6c22@oracle.com> <79bfb73a-20d7-7b85-4a84-dd22b150ed0d@oracle.com> <9fcc131c-dfbd-b943-381b-2ea8d854fcd7@oracle.com> <3b76e50f-fd8f-d61b-272a-27338df99094@oracle.com> <1d6a6087-b82d-96cf-bfb4-87cb03869bd6@oracle.com> Message-ID: <158ce5cc-3869-737b-b644-a82c962949cc@oracle.com> Hi Martin, I believe that the support_IRIW_for_not_multiple_copy_atomic_cpu stuff came from Erik O. so I'm adding him to this email thread. Yes, please create an issue that describes the problem and we'll figure out who should take the issue... Dan On 9/15/20 10:52 AM, Doerr, Martin wrote: > Hi Dan and Carsten, > > I just noticed that this change introduced 2 usages of "support_IRIW_for_not_multiple_copy_atomic_cpu". 
> I think this is incorrect for arm32 which is not multi-copy-atomic, but uses support_IRIW_for_not_multiple_copy_atomic_cpu = false.
> You probably meant "#ifdef CPU_MULTI_COPY_ATOMIC"?
>
> I haven't studied the access patterns you were trying to fix, but this looks wrong.
> Should I create an issue? Would be great if I could assign it to somebody familiar with this new code.
>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: hotspot-runtime-dev On Behalf Of Daniel D. Daugherty
>> Sent: Tuesday, June 2, 2020 21:25
>> To: Carsten Varming
>> Cc: Roman Kennke; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>> (CR14/v2.14/17-for-jdk15)
>>
>> Hi Carsten,
>>
>> Thanks for the fast review of the updated comments.
>>
>> I filed the following new bug to track the change:
>>
>>     JDK-8246359 clarify confusing comment in ObjectMonitor::EnterI()'s
>>                 race with async deflation
>>     https://bugs.openjdk.java.net/browse/JDK-8246359
>>
>> And I started a review thread for the fix under that new bug ID.
>>
>> Dan
>>
>>
>> On 6/2/20 2:13 PM, Carsten Varming wrote:
>>> Hi Dan,
>>>
>>> I like the new comment. Thank you for doing the update.
>>>
>>> Carsten
>>>
>>> On Tue, Jun 2, 2020 at 1:54 PM Daniel D. Daugherty wrote:
>>> Hi Carsten,
>>>
>>> See replies below...
>>>
>>> David, Erik and Robbin, if you folks could also check out the revised
>>> comment below that would be appreciated.
>>>
>>>
>>> On 6/2/20 9:39 AM, Carsten Varming wrote:
>>>> Hi Dan,
>>>>
>>>> See inline.
>>>>
>>>> On Mon, Jun 1, 2020 at 11:32 PM Daniel D. Daugherty wrote:
>>>>
>>>> Hi Carsten,
>>>>
>>>> Thanks for chiming in on this review thread!!
>>>>
>>>>
>>>> It is my pleasure. You know the code is solid when the discussion
>>>> is focused on the comments.
>>> So true, so very true!
>>> >>> >>>> On 6/1/20 10:41 PM, Carsten Varming wrote: >>>>> Hi Dan, >>>>> >>>>> I like the new protocol, but I had to think about how the >>>>> extra increment to _contentions replaced the check on _owner >>>>> that I originally?added. >>>> Right. The check on _owner was described in detail in the >>>> OpenJDK wiki >>>> subsection that was called "T-enter Wins By A-B-A". It can >>>> still be >>>> found by going thru the wiki's history links. >>>> >>>> That subsection was renamed and rewritten and can be found here: >>>> >>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A >> syncMonitorDeflation-T- >> enterWinsByCancellationViaDEFLATER_MARKERSwap >>>> >>>>> I am thinking that the increased _contention value is a >>>>> little mark left on the ObjectMonitor to signal to the >>>>> deflater thread (which must be in the middle of trying to >>>>> acquire the object monitor as _owner was set to >>>>> DEFLATER_MARKER) that the deflater thread lost the race. >>>> That is exactly what the extra increment is being used for. >>>> >>>> In my reply to David H. that you quoted below, I describe the >>>> progression >>>> of contention values thru the two possible race scenarios. >>>> The progression >>>> shows the T-enter thread winning the race and marking the >>>> contention field >>>> with the extra increment while the T-deflater thread >>>> recognizes that it has >>>> lost the race and unmarks the contention field with an extra >>>> decrement. >>>> >>>> >>>> I noticed that. Looks like David and I were racing and David won. :) >>>> >>>>> That little mark stays with the object monitor long after >>>>> the thread is done with the monitor. >>>> The "little mark" stays with the ObjectMonitor after T-enter >>>> is done >>>> entering until the T-deflater thread recognizes that the >>>> async deflation >>>> was canceled and does an extra decrement. I don't think I >>>> would describe >>>> it as "long after". 
>>>> >>>> >>>> Sorry about the use of "long after". When I think about the >>>> correctness of protocols, like the deflation protocol, I end up >>>> thinking about sequences of instructions and the relevant >>>> interleavings. In that context I often end up using phrases like >>>> "long after" and "after" to mean anything after a particular >>>> instruction. I did not mean to imply anything about the relative >>>> speed of the execution of the code. >>> It's okay. I do something similar in the transaction diagrams that >>> I use to work out timing issues: ... >>> >>> The only point that I was trying to make is that the T-deflate thread >>> is responsible for cleaning up the extra mark and it's committed to >>> the code path that will result in the cleanup. Yes, there may be a >>> between the time that T-deflate recognizes that async >>> deflation was canceled and when T-deflate does the extra decrement, >>> but I don't see any harm in it. >>> >>> >>>>> It might be worth adding a comment to the code explaining >>>>> that after the increment, the _contention field can only be >>>>> set to 0 by a corresponding decrement in the async deflater >>>>> thread, ensuring that the >>>>> Atomic::cmpxchg(&mid->_contentions, (jint)0, -max_jint)?on >>>>> line 2166 fails. In particular, the comment: >>>>> +. // .... We bump contentions an >>>>> + // extra time to prevent the async deflater thread from >>>>> temporarily >>>>> + // changing it to -max_jint and back to zero (no flicker >>>>> to confuse >>>>> + // is_being_async_deflated() >>>>> confused me as after the deflater thread sets _contentions >>>>> to -max_jint, the?deflater thread has won the race and the >>>>> object monitor is about to be deflated. >>>> For context, here's the code and comment being discussed: >>>> >>>>> 527 if (AsyncDeflateIdleMonitors && >>>>> 528 try_set_owner_from(DEFLATER_MARKER, Self) == >> DEFLATER_MARKER) { >>>>> 529 // Cancelled the in-progress async deflation. 
We bump >>>>> contentions an >>>>> 530 // extra time to prevent the async deflater thread from >>>>> temporarily >>>>> 531 // changing it to -max_jint and back to zero (no flicker >>>>> to confuse >>>>> 532 // is_being_async_deflated()). The async deflater thread >>>>> will >>>>> 533 // decrement contentions after it recognizes that the async >>>>> 534 // deflation was cancelled. >>>>> 535 add_to_contentions(1); >>>> This part of the new comment: >>>> >>>> 532 // ... The async deflater thread will >>>> 533 // decrement contentions after it recognizes that >>>> the async >>>> 534 // deflation was cancelled. >>>> >>>> makes it clear that the async deflater thread does the >>>> corresponding decrement >>>> to the increment done by the T-enter thread so that covers >>>> this part of your >>>> comment above: >>>> >>>> the _contention field can only be set to 0 by a >>>> corresponding decrement >>>> in the async deflater thread >>>> >>>> This part of the new comment: >>>> >>>> 529 // ... We bump contentions an >>>> 530 // extra time to prevent the async deflater thread >>>> from temporarily >>>> 531 // changing it to -max_jint and back to zero (no >>>> flicker to confuse >>>> 532 // is_being_async_deflated()). >>>> >>>> makes it clear that we're keeping make-contentions-negative >>>> part of the >>>> async deflation protocol from happening so that covers this >>>> part of your >>>> comment above: >>>> >>>> ensuring that the Atomic::cmpxchg(&mid->_contentions, >>>> (jint)0, -max_jint) >>>> on line 2166 fails. >>>> >>>> This part of your comment above makes it clear where the >>>> confusion arises: >>>> >>>> confused me as after the deflater thread sets >>>> _contentions to -max_jint, >>>> the deflater thread has won the race and the object >>>> monitor is about to >>>> be deflated.
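Editorial sketch: Carsten's point that the extra increment makes the deflater's compare-and-swap on contentions fail can be checked with a small single-threaded replay. This is a model under stated assumptions, not the HotSpot implementation. MonitorModel, RaceResult, replay_cancellation and the integer owner encoding are all hypothetical, and the way the deflater notices the cancellation is simplified to a flag.

```cpp
#include <atomic>
#include <climits>

// Hypothetical stand-ins; none of these are HotSpot declarations.
constexpr int kNoOwner = 0;
constexpr int kDeflaterMarker = -1;  // stands in for DEFLATER_MARKER
constexpr int kSelf = 1;             // stands in for the entering thread
constexpr int kMaxJint = INT_MAX;    // stands in for max_jint

struct MonitorModel {
  std::atomic<int> owner{kNoOwner};
  std::atomic<int> contentions{0};
};

struct RaceResult {
  bool deflater_won;      // did the CAS from 0 to -max_jint succeed?
  int final_contentions;
  int final_owner;
};

// Replays one interleaving of the cancellation race on a single thread.
// extra_bump selects whether the entering thread does the extra increment.
RaceResult replay_cancellation(bool extra_bump) {
  MonitorModel m;

  // Deflater, part 1: claim the unowned monitor with the marker.
  int expected_owner = kNoOwner;
  m.owner.compare_exchange_strong(expected_owner, kDeflaterMarker);

  // Entering thread: regular increment done before EnterI() is called.
  m.contentions.fetch_add(1);

  // Entering thread: swap the marker for itself, cancelling the deflation.
  expected_owner = kDeflaterMarker;
  const bool cancelled =
      m.owner.compare_exchange_strong(expected_owner, kSelf);
  if (cancelled && extra_bump) {
    m.contentions.fetch_add(1);
  }

  // Entering thread: regular decrement after EnterI() returns to enter().
  m.contentions.fetch_sub(1);

  // Deflater, part 2: try to make a zero contentions field negative.
  int expected_contentions = 0;
  const bool deflater_won =
      m.contentions.compare_exchange_strong(expected_contentions, -kMaxJint);

  // Deflater: deferred decrement once it sees the deflation was cancelled.
  // Cancellation detection is simplified to the local flag (an assumption).
  if (!deflater_won && cancelled) {
    m.contentions.fetch_sub(1);
  }
  return {deflater_won, m.contentions.load(), m.owner.load()};
}
```

With extra_bump set to true the compare-and-swap sees 1 rather than 0 and fails, and the counter ends at 0. With extra_bump set to false the model lets the deflater drive contentions to -max_jint while the entering thread still owns the monitor, which is the outcome the extra increment is meant to rule out.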
>>>> >>>> Your original algorithm is a three-part async deflation protocol: >>>> >>>> Part 1 - set owner field to DEFLATER_MARKER >>>> Part 2 - make a zero contentions field -max_jint >>>> Part 3 - check to see if the owner field is still DEFLATER_MARKER >>>> >>>> If part 3 fails, then the contentions field that is currently >>>> negative >>>> has max_jint added to it to complete the bail out process. >>>> It's that >>>> third part that makes the contentions field flicker from: >>>> >>>> 0 -> -max_jint -> 0 >>>> >>>> And the extra contentions increment in the new two part >>>> protocol solves >>>> that flicker and allows us to treat (contentions < 0) as a >>>> linearization >>>> point. >>>> >>>> Please let me know if this clarifies your concern. >>>> >>>> >>>> I am no longer confused, but the cause of my confusion is still >>>> present in the comment. >>>> >>>> This group knows about the three part algorithm, but when the >>>> code is pushed there is no representation of the three part >>>> algorithm in the code or repository. >>> That's a really good point and a side effect of my living with this >>> code for a very long time... >>> >>> >>>> I forgot the details of the algorithm and read the latest version >>>> of the code to figure out what the flickering was about. As you >>>> would expect, I found that there is no way the code can cause the >>>> flicker mentioned. That made me worried. I started to question >>>> myself: What can cause the behavior that is described in the >>>> comments? What am I missing? As a result, I think it is best if >>>> we keep the flickering to ourselves and update the comment to >>>> describe that because _owner was DEFLATER_MARKER the deflation >>>> thread must be in the middle of the protocol for deflating the >>>> object monitor, and in particular, incrementing _contentions >>>> ensures the failure of the final CAS in the deflation protocol >>>> (final in the protocol implemented in the code).
>>> The above is a more clear expression of your concerns and I agree. >>> >>> >>>> To be clear: >>>> >>>> > 529 // Cancelled the in-progress async deflation. >>>> >>>> I would expand this comment by mentioning that the deflator >>>> thread cannot win the last part of the 2-part deflation protocol >>>> as 0 < _contentions (pre-condition to this method). >>>> >>>> > We bump contentions an >>>> > 530 // extra time to prevent the async deflater thread from >>>> temporarily >>>> > 531 // changing it to -max_jint and back to zero (no flicker to >>>> confuse >>>> > 532 // is_being_async_deflated()). >>>> >>>> I would replace this part with something along the lines of: We >>>> bump contentions an extra time to prevent the deflator thread >>>> from winning the last part of the (2-part) deflation protocol >>>> after this thread decrements _contentions as part of the release >>>> of the object monitor. >>>> >>>> > The async deflater thread will >>>> > 533 // decrement contentions after it recognizes that the async >>>> > 534 // deflation was cancelled. >>>> >>>> I would keep this part. >>> So here's my rewrite of the code and comment block: >>> >>> if (AsyncDeflateIdleMonitors && >>> try_set_owner_from(DEFLATER_MARKER, Self) == DEFLATER_MARKER) { >>> // Cancelled the in-progress async deflation by changing owner >>> from >>> // DEFLATER_MARKER to Self. As part of the contended enter >>> protocol, >>> // contentions was incremented to a positive value before EnterI() >>> // was called and that prevents the deflater thread from >>> winning the >>> // last part of the 2-part async deflation protocol. After >>> EnterI() >>> // returns to enter(), contentions is decremented because the >>> caller >>> // now owns the monitor. We bump contentions an extra time here to >>> // prevent the deflater thread from winning the last part of the >>> // 2-part async deflation protocol after the regular decrement >>> // occurs in enter().
The deflater thread will decrement >>> contentions >>> // after it recognizes that the async deflation was cancelled. >>> add_to_contentions(1); >>> >>> I've made this change to both places in EnterI() that had the original >>> confusing comment. >>> >>> Please let me know if this rewrite works for everyone. >>> >>> Since I've already pushed 8153224, I'll file a new bug to push this >>> clarification once we're all in agreement here. >>> >>> Dan >>> >>> >>>> I hope this helps, >>>> Carsten >>>> >>>>> Otherwise, the code looks great. I am looking forward to >>>>> seeing it in the repo. >>>> Thanks! The code should be there soon. >>>> >>>> Dan >>>> >>>> >>>>> Carsten >>>>> >>>>> On Mon, Jun 1, 2020 at 8:32 PM Daniel D. Daugherty >>>>> >>>> > wrote: >>>>> >>>>> Hi David, >>>>> >>>>> On 6/1/20 7:58 PM, David Holmes wrote: >>>>> > Hi Dan, >>>>> > >>>>> > Sorry for the delay. >>>>> >>>>> No worries. It's always worth waiting for your code >>>>> review in general >>>>> and, with the complexity of this project, it's on my >>>>> must-do list! >>>>> >>>>> >>>>> > >>>>> > On 28/05/2020 3:20 am, Daniel D. Daugherty wrote: >>>>> >> Greetings, >>>>> >> >>>>> >> Erik O. had an idea for changing the three part async >>>>> deflation protocol >>>>> >> into a two part async deflation protocol where the >>>>> second part (setting >>>>> >> the contentions field to -max_jint) is a >>>>> linearization point. I've taken >>>>> >> Erik's proposal (which was relative to >>>>> CR12/v2.12/15-for-jdk15), merged >>>>> >> it with CR13/v2.13/16-for-jdk15, and made a few minor >>>>> tweaks. >>>>> >> >>>>> >> I have attached the change list from CR13 to CR14 and >>>>> I've also added a >>>>> >> link to the CR13-to-CR14-changes file to the webrevs >>>>> so it should be >>>>> >> easy >>>>> >> to find. >>>>> >> >>>>> >> Main bug URL: >>>>> >> >>>>> >>
JDK-8153224 Monitor deflation prolong safepoints >>>>> >> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >> >>>>> >> The project is currently baselined on jdk-15+24. >>>>> >> >>>>> >> Here's the full webrev URL for those folks that want >>>>> to see all of the >>>>> >> current Async Monitor Deflation code in one go (v2.14 >>>>> full): >>>>> >> >>>>> >> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- >> jdk15+24.v2.14.full/ >>>>> >> >>>>> >> >>>>> >> Some folks might want to see just what has changed >>>>> since the last review >>>>> >> cycle so here's a webrev for that (v2.14 inc): >>>>> >> >>>>> >> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- >> jdk15+24.v2.14.inc/ >>>>> > >>>>> > >>>>> > src/hotspot/share/runtime/synchronizer.cpp >>>>> > >>>>> > I'm having a little trouble keeping the _contentions >>>>> relationships in >>>>> > my head. In particular with this change I can't quite >>>>> grok the: >>>>> > >>>>> > // Deferred decrement for the JT EnterI() that >>>>> cancelled the async >>>>> > deflation. >>>>> > mid->add_to_contentions(-1); >>>>> > >>>>> > change. I kind of get EnterI() does an extra increment >>>>> and the >>>>> > deflator thread does the above matching decrement. But >>>>> given the two >>>>> > changes can happen in any order I'm not sure what the >>>>> possible visible >>>>> > values for _contentions will be and how that might >>>>> affect other code >>>>> > inspecting it? >>>>> >>>>> I have a sub-section in the OpenJDK wiki dedicated to >>>>> this particular race: >>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A >> syncMonitorDeflation-T- >> enterWinsByCancellationViaDEFLATER_MARKERSwap >>>>> In order for this race condition to manifest, the >>>>> T-enter thread has to >>>>> successfully swap the owner field's DEFLATER_MARKER >>>>> value for Self. 
That >>>>> swap will eventually cause the T-deflate thread to >>>>> realize that the async >>>>> deflation that it started has been canceled. >>>>> >>>>> The diagram shows the progression of contentions values: >>>>> >>>>> - ObjectMonitor box 1 shows contentions == 1 because >>>>> T-enter incremented >>>>> the contentions field >>>>> >>>>> - ObjectMonitor box 2 shows contentions == 2 because >>>>> EnterI() did the >>>>> extra increment. >>>>> >>>>> - ObjectMonitor box 3 shows contentions == 1 because >>>>> T-enter did the >>>>> regular contentions decrement. >>>>> >>>>> - ObjectMonitor box 4 shows contentions == 0 because >>>>> T-deflate did the >>>>> extra contentions decrement. >>>>> >>>>> Now it is possible for T-deflate to do the extra >>>>> decrement before T-enter >>>>> does the extra increment. If I were to add another >>>>> diagram to show that >>>>> variant of the race, that progression of contentions >>>>> values would be: >>>>> >>>>> - ObjectMonitor box 1 shows contentions == 1 because >>>>> T-enter incremented >>>>> the contentions field >>>>> >>>>> - ObjectMonitor box 2 shows contentions == 0 because >>>>> T-deflate did the >>>>> extra contentions decrement. >>>>> >>>>> - ObjectMonitor box 3 shows contentions == 1 because >>>>> EnterI() did the >>>>> extra increment. >>>>> >>>>> - ObjectMonitor box 4 shows contentions == 0 because >>>>> T-enter did the >>>>> regular contentions decrement. >>>>> >>>>> Notice that in this second scenario the contentions >>>>> field never goes >>>>> negative so there's nothing to confuse a potential caller of >>>>> is_being_async_deflated(): >>>>> >>>>> inline bool ObjectMonitor::is_being_async_deflated() { >>>>> return AsyncDeflateIdleMonitors && contentions() < 0; >>>>> } >>>>> >>>>> It is not possible for T-deflate's extra decrement of >>>>> the contentions >>>>> field to make the contentions field negative.
That >>>>> decrement only happens >>>>> when T-deflate detects that the async deflation has been >>>>> canceled and >>>>> async deflation can only be canceled after T-enter has >>>>> already made the >>>>> contentions field > 0. >>>>> >>>>> Please let me know if this resolves your concern about: >>>>> >>>>> > // Deferred decrement for the JT EnterI() that >>>>> cancelled the async >>>>> > deflation. >>>>> > mid->add_to_contentions(-1); >>>>> >>>>> I'm not planning to update the OpenJDK wiki to add a >>>>> second variant of >>>>> the cancellation race. Please let me know if that is okay. >>>>> >>>>> > >>>>> > But otherwise the changes in this version seem good >>>>> and overall the >>>>> > protocol seems simpler. >>>>> >>>>> This sounds like a thumbs up, but I'm looking for >>>>> something more definitive. >>>>> >>>>> >>>>> > I'm still going to spend some more time going over the >>>>> complete webrev >>>>> > to get a fuller sense of things. >>>>> >>>>> As always, if you find something after I've pushed, >>>>> we'll deal with it. >>>>> >>>>> Thanks for your many re-reviews for this project!! >>>>> >>>>> Dan >>>>> >>>>> >>>>> > >>>>> > Thanks, >>>>> > David >>>>> > >>>>> >> >>>>> >> >>>>> >> The OpenJDK wiki has been updated for v2.14. >>>>> >> >>>>> >> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >> >>>>> >> The jdk-15+24 based v2.14 version of the patch has >>>>> gone thru Mach5 >>>>> >> Tier[1-5] >>>>> >> testing with no related failures; Mach5 Tier[67] are >>>>> running now and >>>>> >> so far >>>>> >> have no related failures. I'll kick off Mach5 Tier8 >>>>> after the other >>>>> >> tiers >>>>> >> have finished since Mach5 is a bit busy right now. >>>>> >> >>>>> >> I'm also running my usual inflation stress testing on >>>>> Linux-X64 and >>>>> >> macOSX >>>>> >> and so far there are no issues. >>>>> >> >>>>> >> Thanks, in advance, for any questions, comments or >>>>> suggestions. 
>>>>> >> >>>>> >> Dan >>>>> >> >>>>> >> >>>>> >> On 5/21/20 2:53 PM, Daniel D. Daugherty wrote: >>>>> >>> Greetings, >>>>> >>> >>>>> >>> I have made changes to the Async Monitor Deflation >>>>> code in response to >>>>> >>> the CR12/v2.12/15-for-jdk15 code review cycle. >>>>> Thanks to David H. and >>>>> >>> Erik O. for their OpenJDK reviews in the v2.12 round! >>>>> >>> >>>>> >>> I have attached the change list from CR12 to CR13 >>>>> and I've also added a >>>>> >>> link to the CR12-to-CR13-changes file to the webrevs >>>>> so it should be >>>>> >>> easy >>>>> >>> to find. >>>>> >>> >>>>> >>> Main bug URL: >>>>> >>> >>>>> >>> JDK-8153224 Monitor deflation prolong safepoints >>>>> >>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>> >>>>> >>> The project is currently baselined on jdk-15+24. >>>>> >>> >>>>> >>> Here's the full webrev URL for those folks that want >>>>> to see all of the >>>>> >>> current Async Monitor Deflation code in one go >>>>> (v2.13 full): >>>>> >>> >>>>> >>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for-jdk15%2b24.v2.13.full/ >>>>> >>> >>>>> >>> >>>>> >>> Some folks might want to see just what has changed >>>>> since the last >>>>> >>> review >>>>> >>> cycle so here's a webrev for that (v2.13 inc): >>>>> >>> >>>>> >>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for-jdk15%2b24.v2.13.inc/ >>>>> >>> >>>>> >>> >>>>> >>> >>>>> >>> The OpenJDK wiki is currently at v2.11 and might >>>>> require minor >>>>> >>> tweaks for v2.12 >>>>> >>> and v2.13. Yes, I need to make yet another crawl >>>>> thru review of it... >>>>> >>> >>>>> >>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>> >>>>> >>> The jdk-15+24 based v2.13 version of the patch is >>>>> going thru the usual >>>>> >>> Mach5 testing right now. It is also going thru my >>>>> usual inflation >>>>> >>> stress >>>>> >>> testing on Linux-X64 and macOSX.
>>>>> >>> >>>>> >>> Thanks, in advance, for any questions, comments or >>>>> suggestions. >>>>> >>> >>>>> >>> Dan >>>>> >>> >>>>> >>> On 5/14/20 5:40 PM, Daniel D. Daugherty wrote: >>>>> >>>> Greetings, >>>>> >>>> >>>>> >>>> I have made changes to the Async Monitor Deflation >>>>> code in response to >>>>> >>>> the CR11/v2.11/14-for-jdk15 code review cycle. >>>>> Thanks to David H., >>>>> >>>> Erik O., >>>>> >>>> and Robbin for their OpenJDK reviews in the v2.11 >>>>> round! >>>>> >>>> >>>>> >>>> I have attached the change list from CR11 to CR12 >>>>> and I've also >>>>> >>>> added a >>>>> >>>> link to the CR11-to-CR12-changes file to the >>>>> webrevs so it should >>>>> >>>> be easy >>>>> >>>> to find. >>>>> >>>> >>>>> >>>> Main bug URL: >>>>> >>>> >>>>> >>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>> >>>>> >>>> The project is currently baselined on jdk-15+23. >>>>> >>>> >>>>> >>>> Here's the full webrev URL for those folks that >>>>> want to see all of the >>>>> >>>> current Async Monitor Deflation code in one go >>>>> (v2.12 full): >>>>> >>>> >>>>> >>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- >> jdk15%2b23.v2.12.full/ >>>>> >>>> >>>>> >>>> >>>>> >>>> Some folks might want to see just what has changed >>>>> since the last >>>>> >>>> review >>>>> >>>> cycle so here's a webrev for that (v2.12 inc): >>>>> >>>> >>>>> >>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- >> jdk15%2b23.v2.12.inc/ >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> The OpenJDK wiki is currently at v2.11 and might >>>>> require minor >>>>> >>>> tweaks for v2.12: >>>>> >>>> >>>>> >>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>> >>>>> >>>> The jdk-15+23 based v2.12 version of the patch is >>>>> going thru the usual >>>>> >>>> Mach5 testing right now. 
>>>>> >>>> >>>>> >>>> Thanks, in advance, for any questions, comments or >>>>> suggestions. >>>>> >>>> >>>>> >>>> Dan >>>>> >>>> >>>>> >>>> >>>>> >>>> On 5/7/20 1:08 PM, Daniel D. Daugherty wrote: >>>>> >>>>> Greetings, >>>>> >>>>> >>>>> >>>>> I have made changes to the Async Monitor Deflation >>>>> code in >>>>> >>>>> response to >>>>> >>>>> the CR10/v2.10/13-for-jdk15 code review cycle and >>>>> DaCapo-h2 perf >>>>> >>>>> testing. >>>>> >>>>> Thanks to Erik O., Robbin and David H. for their >>>>> OpenJDK reviews >>>>> >>>>> in the >>>>> >>>>> v2.10 round! Thanks to Eric C. for his help in >>>>> isolating the >>>>> >>>>> DaCapo-h2 >>>>> >>>>> performance regression. >>>>> >>>>> >>>>> >>>>> With the removal of ref_counting and the >>>>> ObjectMonitorHandle >>>>> >>>>> class, the >>>>> >>>>> Async Monitor Deflation project is now closer to >>>>> Carsten's original >>>>> >>>>> prototype. While ref_counting gave us >>>>> ObjectMonitor* safety >>>>> >>>>> enforced by >>>>> >>>>> code, I saw a ~22.8% slow down with >>>>> -XX:-AsyncDeflateIdleMonitors >>>>> >>>>> ("off" >>>>> >>>>> mode). The slow down with "on" mode >>>>> -XX:+AsyncDeflateIdleMonitors >>>>> >>>>> is ~17%. >>>>> >>>>> >>>>> >>>>> I have attached the change list from CR10 to CR11 >>>>> instead of >>>>> >>>>> putting it in >>>>> >>>>> the body of this email. I've also added a link to the >>>>> >>>>> CR10-to-CR11-changes >>>>> >>>>> file to the webrevs so it should be easy to find. >>>>> >>>>> >>>>> >>>>> Main bug URL: >>>>> >>>>> >>>>> >>>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>> >>>>> >>>>> The project is currently baselined on jdk-15+21. 
>>>>> >>>>> >>>>> >>>>> Here's the full webrev URL for those folks that >>>>> want to see all of >>>>> >>>>> the >>>>> >>>>> current Async Monitor Deflation code in one go >>>>> (v2.11 full): >>>>> >>>>> >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- >> jdk15%2b21.v2.11.full/ >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Some folks might want to see just what has changed >>>>> since the last >>>>> >>>>> review >>>>> >>>>> cycle so here's a webrev for that (v2.11 inc): >>>>> >>>>> >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- >> jdk15%2b21.v2.11.inc/ >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Because of the removal of ref_counting and the >>>>> ObjectMonitorHandle >>>>> >>>>> class, the >>>>> >>>>> incremental webrev is a bit noisier than I would >>>>> have preferred. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> The OpenJDK wiki has NOT YET been updated for this >>>>> round of changes: >>>>> >>>>> >>>>> >>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>> >>>>> >>>>> The jdk-15+21 based v2.11 version of the patch has >>>>> been thru Mach5 >>>>> >>>>> tier[1-6] >>>>> >>>>> testing on Oracle's usual set of platforms. Mach5 >>>>> tier[78] are >>>>> >>>>> still running. >>>>> >>>>> I'm running the v2.11 patch through my usual set >>>>> of stress testing on >>>>> >>>>> Linux-X64 and macOSX. >>>>> >>>>> >>>>> >>>>> I'm planning to do a SPECjbb2015, DaCapo-h2 and >>>>> volano round on the >>>>> >>>>> CR11/v2.11/14-for-jdk15 bits. >>>>> >>>>> >>>>> >>>>> Thanks, in advance, for any questions, comments or >>>>> suggestions. >>>>> >>>>> >>>>> >>>>> Dan >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On 2/26/20 5:22 PM, Daniel D. Daugherty wrote: >>>>> >>>>>> Greetings, >>>>> >>>>>> >>>>> >>>>>> I have made changes to the Async Monitor >>>>> Deflation code in >>>>> >>>>>> response to >>>>> >>>>>> the CR9/v2.09/12-for-jdk14 code review cycle. >>>>> Thanks to Robbin >>>>> >>>>>> and Erik O. 
>>>>> >>>>>> for their comments in this round! >>>>> >>>>>> >>>>> >>>>>> With the extraction and push of >>>>> {8235931,8236035,8235795} to >>>>> >>>>>> JDK15, the >>>>> >>>>>> Async Monitor Deflation code is back to "just" >>>>> async deflation >>>>> >>>>>> changes! >>>>> >>>>>> >>>>> >>>>>> I have attached the change list from CR9 to CR10 >>>>> instead of >>>>> >>>>>> putting it in >>>>> >>>>>> the body of this email. I've also added a link to >>>>> the >>>>> >>>>>> CR9-to-CR10-changes >>>>> >>>>>> file to the webrevs so it should be easy to find. >>>>> >>>>>> >>>>> >>>>>> Main bug URL: >>>>> >>>>>> >>>>> >>>>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>> >>>>> >>>>>> The project is currently baselined on jdk-15+11. >>>>> >>>>>> >>>>> >>>>>> Here's the full webrev URL for those folks that >>>>> want to see all >>>>> >>>>>> of the >>>>> >>>>>> current Async Monitor Deflation code in one go >>>>> (v2.10 full): >>>>> >>>>>> >>>>> >>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- >> jdk15+11.v2.10.full/ >>>>> >>>>>> >>>>> >>>>>> >>>>> >>>>>> Some folks might want to see just what has >>>>> changed since the last >>>>> >>>>>> review >>>>> >>>>>> cycle so here's a webrev for that (v2.10 inc): >>>>> >>>>>> >>>>> >>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- >> jdk15+11.v2.10.inc/ >>>>> >>>>>> >>>>> >>>>>> >>>>> >>>>>> Since we backed out the >>>>> HandshakeAfterDeflateIdleMonitors option >>>>> >>>>>> and the >>>>> >>>>>> C2 ref_count changes and updated the copyright >>>>> years, the "inc" >>>>> >>>>>> webrev has >>>>> >>>>>> a bit more noise in it than usual. Sorry about that! 
>>>>> >>>>>> >>>>> >>>>>> The OpenJDK wiki has been updated for this round >>>>> of changes: >>>>> >>>>>> >>>>> >>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>> >>>>> >>>>>> >>>>> >>>>>> The jdk-15+11 based v2.10 version of the patch >>>>> has been thru >>>>> >>>>>> Mach5 tier[1-7] >>>>> >>>>>> testing on Oracle's usual set of platforms. Mach5 >>>>> tier8 is still >>>>> >>>>>> running. >>>>> >>>>>> I'm running the v2.10 patch through my usual set >>>>> of stress >>>>> >>>>>> testing on >>>>> >>>>>> Linux-X64 and macOSX. >>>>> >>>>>> >>>>> >>>>>> I'm planning to do a SPECjbb2015 round on the >>>>> >>>>>> CR10/v2.10/13-for-jdk15 bits. >>>>> >>>>>> >>>>> >>>>>> Thanks, in advance, for any questions, comments >>>>> or suggestions. >>>>> >>>>>> >>>>> >>>>>> Dan >>>>> >>>>>> >>>>> >>>>>> >>>>> >>>>>> On 2/4/20 9:41 AM, Daniel D. Daugherty wrote: >>>>> >>>>>>> Greetings, >>>>> >>>>>>> >>>>> >>>>>>> This project is no longer targeted to JDK14 so >>>>> this is NOT an >>>>> >>>>>>> urgent code >>>>> >>>>>>> review request. >>>>> >>>>>>> >>>>> >>>>>>> I've extracted the following three fixes from >>>>> the Async Monitor >>>>> >>>>>>> Deflation >>>>> >>>>>>> project code: >>>>> >>>>>>> >>>>> >>>>>>> JDK-8235931 add OM_CACHE_LINE_SIZE and use >>>>> smaller size on >>>>> >>>>>>> SPARCv9 and X64 >>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235931 >>>>> >>>>>>> >>>>> >>>>>>> JDK-8236035 refactor >>>>> ObjectMonitor::set_owner() and _owner >>>>> >>>>>>> field setting >>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8236035 >>>>> >>>>>>> >>>>> >>>>>>> JDK-8235795 replace monitor list >>>>> >>>>>>> mux{Acquire,Release}(&gListLock) with spin locks >>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235795 >>>>> >>>>>>> >>>>> >>>>>>> Each of these has been reviewed separately and >>>>> will be pushed to >>>>> >>>>>>> JDK15 >>>>> >>>>>>> in the near future (possibly by the end of this >>>>> week).
Of >>>>> >>>>>>> course, there >>>>> >>>>>>> were improvements during these review cycles and >>>>> the purpose of >>>>> >>>>>>> this >>>>> >>>>>>> e-mail is to provide updated webrevs for this fix >>>>> >>>>>>> (CR9/v2.09/12-for-jdk14) >>>>> >>>>>>> within the revised context provided by {8235931, >>>>> 8236035, 8235795}. >>>>> >>>>>>> >>>>> >>>>>>> Main bug URL: >>>>> >>>>>>> >>>>> >>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>> >>>>> >>>>>>> The project is currently baselined on jdk-14+34. >>>>> >>>>>>> >>>>> >>>>>>> Here's the full webrev URL for those folks that >>>>> want to see all >>>>> >>>>>>> of the >>>>> >>>>>>> current Async Monitor Deflation code along with >>>>> {8235931, >>>>> >>>>>>> 8236035, 8235795} >>>>> >>>>>>> in one go (v2.09b full): >>>>> >>>>>>> >>>>> >>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.full/ >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> Compare the open.patch file in >>>>> 12-for-jdk14.v2.09.full and >>>>> >>>>>>> 12-for-jdk14.v2.09b.full >>>>> >>>>>>> using your favorite file comparison/merge tool >>>>> to see how Async >>>>> >>>>>>> Monitor Deflation >>>>> >>>>>>> evolved due to {8235931, 8236035, 8235795}. >>>>> >>>>>>> >>>>> >>>>>>> Some folks might want to see just the Async >>>>> Monitor Deflation >>>>> >>>>>>> code on top of >>>>> >>>>>>> {8235931, 8236035, 8235795} so here's a webrev >>>>> for that (v2.09b >>>>> >>>>>>> inc): >>>>> >>>>>>> >>>>> >>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for-jdk14.v2.09b.inc/ >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> These webrevs have gone thru several Mach5 >>>>> Tier[1-8] runs along >>>>> >>>>>>> with >>>>> >>>>>>> my usual stress testing and SPECjbb2015 testing >>>>> and there aren't >>>>> >>>>>>> any >>>>> >>>>>>> surprises relative to CR9/v2.09/12-for-jdk14.
>>>>> >>>>>>> >>>>> >>>>>>> Thanks, in advance, for any questions, comments >>>>> or suggestions. >>>>> >>>>>>> >>>>> >>>>>>> Dan >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote: >>>>> >>>>>>>> Greetings, >>>>> >>>>>>>> >>>>> >>>>>>>> I have made changes to the Async Monitor >>>>> Deflation code in >>>>> >>>>>>>> response to >>>>> >>>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. >>>>> Thanks to David >>>>> >>>>>>>> H., Robbin >>>>> >>>>>>>> and Erik O. for their comments! >>>>> >>>>>>>> >>>>> >>>>>>>> This project is no longer targeted to JDK14 so >>>>> this is NOT an >>>>> >>>>>>>> urgent code >>>>> >>>>>>>> review request. The primary purpose of this >>>>> webrev is simply to >>>>> >>>>>>>> close the >>>>> >>>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to >>>>> let folks see >>>>> >>>>>>>> how I resolved >>>>> >>>>>>>> the code review comments from that round. >>>>> >>>>>>>> >>>>> >>>>>>>> Most of the comments in the >>>>> CR8/v2.08/11-for-jdk14 code review >>>>> >>>>>>>> cycle were >>>>> >>>>>>>> on the monitor list changes so I'm going to >>>>> take a look at >>>>> >>>>>>>> extracting those >>>>> >>>>>>>> changes into a standalone patch. Switching from >>>>> >>>>>>>> Thread::muxAcquire(&gListLock) >>>>> >>>>>>>> and Thread::muxRelease(&gListLock) to finer >>>>> grained internal >>>>> >>>>>>>> spin locks needs >>>>> >>>>>>>> to be thoroughly reviewed and the best way to >>>>> do that is >>>>> >>>>>>>> separately from the >>>>> >>>>>>>> Async Monitor Deflation changes. Thanks to >>>>> Coleen for >>>>> >>>>>>>> suggesting doing this >>>>> >>>>>>>> extraction earlier. >>>>> >>>>>>>> >>>>> >>>>>>>> I have attached the change list from CR8 to CR9 >>>>> instead of >>>>> >>>>>>>> putting it in >>>>> >>>>>>>> the body of this email. I've also added a link >>>>> to the >>>>> >>>>>>>> CR8-to-CR9-changes >>>>> >>>>>>>> file to the webrevs so it should be easy to find. 
>>>>> >>>>>>>> >>>>> >>>>>>>> Main bug URL: >>>>> >>>>>>>> >>>>> >>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>> >>>>> >>>>>>>> The project is currently baselined on jdk-14+26. >>>>> >>>>>>>> >>>>> >>>>>>>> Here's the full webrev URL for those folks that >>>>> want to see all >>>>> >>>>>>>> of the >>>>> >>>>>>>> current Async Monitor Deflation code in one go >>>>> (v2.09 full): >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >> jdk14.v2.09.full/ >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> Some folks might want to see just what has >>>>> changed since the >>>>> >>>>>>>> last review >>>>> >>>>>>>> cycle so here's a webrev for that (v2.09 inc): >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >> jdk14.v2.09.inc/ >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> The OpenJDK wiki has NOT yet been updated for >>>>> this round of >>>>> >>>>>>>> changes: >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> The jdk-14+26 based v2.09 version of the patch >>>>> has been thru >>>>> >>>>>>>> Mach5 tier[1-7] >>>>> >>>>>>>> testing on Oracle's usual set of platforms. >>>>> Mach5 tier8 is >>>>> >>>>>>>> still running. >>>>> >>>>>>>> A slightly older version of the v2.09 patch has >>>>> also been >>>>> >>>>>>>> through my usual >>>>> >>>>>>>> set of stress testing on Linux-X64 and macOSX >>>>> with the addition >>>>> >>>>>>>> of Robbin's >>>>> >>>>>>>> "MoCrazy 1024" test running in parallel on >>>>> Linux-X64 with the >>>>> >>>>>>>> other tests in >>>>> >>>>>>>> my lab. The "MoCrazy 1024" has been going for > >>>>> 5 days and >>>>> >>>>>>>> 6700+ iterations >>>>> >>>>>>>> without any failures. 
>>>>> >>>>>>>> >>>>> >>>>>>>> I'm planning to do a SPECjbb2015 round on the >>>>> >>>>>>>> CR9/v2.09/12-for-jdk14 bits. >>>>> >>>>>>>> >>>>> >>>>>>>> Thanks, in advance, for any questions, comments >>>>> or suggestions. >>>>> >>>>>>>> >>>>> >>>>>>>> Dan >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote: >>>>> >>>>>>>>> Greetings, >>>>> >>>>>>>>> >>>>> >>>>>>>>> I have made changes to the Async Monitor >>>>> Deflation code in >>>>> >>>>>>>>> response to >>>>> >>>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. >>>>> Thanks to David >>>>> >>>>>>>>> H., Robbin >>>>> >>>>>>>>> and Erik O. for their comments! >>>>> >>>>>>>>> >>>>> >>>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, >>>>> 2019 and the >>>>> >>>>>>>>> Async Monitor >>>>> >>>>>>>>> Deflation project needs to push before Nov. >>>>> 12, 2019 in order >>>>> >>>>>>>>> to allow >>>>> >>>>>>>>> for sufficient bake time for such a big >>>>> change. Nov. 12 is >>>>> >>>>>>>>> _next_ Tuesday >>>>> >>>>>>>>> so we have 8 days from today to finish this >>>>> code review cycle >>>>> >>>>>>>>> and push >>>>> >>>>>>>>> this code for JDK14. >>>>> >>>>>>>>> >>>>> >>>>>>>>> Carsten and Roman! Time for you guys to chime >>>>> in again on the >>>>> >>>>>>>>> code reviews. >>>>> >>>>>>>>> >>>>> >>>>>>>>> I have attached the change list from CR7 to >>>>> CR8 instead of >>>>> >>>>>>>>> putting it in >>>>> >>>>>>>>> the body of this email. I've also added a link >>>>> to the >>>>> >>>>>>>>> CR7-to-CR8-changes >>>>> >>>>>>>>> file to the webrevs so it should be easy to find. >>>>> >>>>>>>>> >>>>> >>>>>>>>> Main bug URL: >>>>> >>>>>>>>> >>>>> >>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>> >>>>> >>>>>>>>> The project is currently baselined on jdk-14+21. 
>>>>> >>>>>>>>> >>>>> >>>>>>>>> Here's the full webrev URL for those folks >>>>> that want to see >>>>> >>>>>>>>> all of the >>>>> >>>>>>>>> current Async Monitor Deflation code in one go >>>>> (v2.08 full): >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- >> jdk14.v2.08.full >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>>> Some folks might want to see just what has >>>>> changed since the >>>>> >>>>>>>>> last review >>>>> >>>>>>>>> cycle so here's a webrev for that (v2.08 inc): >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- >> jdk14.v2.08.inc/ >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>>> The OpenJDK wiki did not need any changes for >>>>> this round: >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>>> The jdk-14+21 based v2.08 version of the patch >>>>> has been thru >>>>> >>>>>>>>> Mach5 tier[1-8] >>>>> >>>>>>>>> testing on Oracle's usual set of platforms. It >>>>> has also been >>>>> >>>>>>>>> through my usual >>>>> >>>>>>>>> set of stress testing on Linux-X64, macOSX and >>>>> Solaris-X64 >>>>> >>>>>>>>> with the addition >>>>> >>>>>>>>> of Robbin's "MoCrazy 1024" test running in >>>>> parallel with the >>>>> >>>>>>>>> other tests in >>>>> >>>>>>>>> my lab. Some testing is still running, but so >>>>> far there are no >>>>> >>>>>>>>> new regressions. >>>>> >>>>>>>>> >>>>> >>>>>>>>> I have not yet done a SPECjbb2015 round on the >>>>> >>>>>>>>> CR8/v2.08/11-for-jdk14 bits. >>>>> >>>>>>>>> >>>>> >>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or suggestions. >>>>> >>>>>>>>> >>>>> >>>>>>>>> Dan >>>>> >>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote: >>>>> >>>>>>>>>> Greetings, >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> The Async Monitor Deflation project is >>>>> reaching the end game. 
>>>>> >>>>>>>>>> I have no >>>>> >>>>>>>>>> changes planned for the project at this time >>>>> so all that is >>>>> >>>>>>>>>> left is code >>>>> >>>>>>>>>> review and any changes that results from >>>>> those reviews. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Carsten and Roman! Time for you guys to chime >>>>> in again on the >>>>> >>>>>>>>>> code reviews. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> I have attached the list of fixes from CR6 to >>>>> CR7 instead of >>>>> >>>>>>>>>> putting it >>>>> >>>>>>>>>> in the main body of this email. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> The project is currently baselined on jdk-14+19. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Here's the full webrev URL for those folks >>>>> that want to see >>>>> >>>>>>>>>> all of the >>>>> >>>>>>>>>> current Async Monitor Deflation code in one >>>>> go (v2.07 full): >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- >> jdk14.v2.07.full >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Some folks might want to see just what has >>>>> changed since the >>>>> >>>>>>>>>> last review >>>>> >>>>>>>>>> cycle so here's a webrev for that (v2.07 inc): >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- >> jdk14.v2.07.inc/ >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> The OpenJDK wiki has been updated to match the >>>>> >>>>>>>>>> CR7/v2.07/10-for-jdk14 changes: >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> The jdk-14+18 based v2.07 version of the >>>>> patch has been thru >>>>> >>>>>>>>>> Mach5 tier[1-8] >>>>> >>>>>>>>>> testing on Oracle's usual set of platforms. 
>>>>> It has also been >>>>> >>>>>>>>>> through my usual >>>>> >>>>>>>>>> set of stress testing on Linux-X64, macOSX >>>>> and Solaris-X64 >>>>> >>>>>>>>>> with the addition >>>>> >>>>>>>>>> of Robbin's "MoCrazy 1024" test running in >>>>> parallel with the >>>>> >>>>>>>>>> other tests in >>>>> >>>>>>>>>> my lab. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> The jdk-14+19 based v2.07 version of the >>>>> patch has been thru >>>>> >>>>>>>>>> Mach5 tier[1-3] >>>>> >>>>>>>>>> test on Oracle's usual set of platforms. >>>>> Mach5 tier[4-8] are >>>>> >>>>>>>>>> in process. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> I did another round of SPECjbb2015 testing in >>>>> Oracle's Aurora >>>>> >>>>>>>>>> Performance lab >>>>> >>>>>>>>>> using using their tuned SPECjbb2015 Linux-X64 >>>>> G1 configs: >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> - "base" is jdk-14+18 >>>>> >>>>>>>>>> - "v2.07" is the latest version and includes C2 >>>>> >>>>>>>>>> inc_om_ref_count() support >>>>> >>>>>>>>>> ????? on LP64 X64 and the new >>>>> >>>>>>>>>> HandshakeAfterDeflateIdleMonitors option >>>>> >>>>>>>>>> - "off" is with -XX:-AsyncDeflateIdleMonitors >>>>> specified >>>>> >>>>>>>>>> - "handshake" is with >>>>> >>>>>>>>>> -XX:+HandshakeAfterDeflateIdleMonitors specified >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> ???????? hbIR?????????? hbIR >>>>> >>>>>>>>>> (max attempted)? (settled)? max-jOPS >>>>> critical-jOPS runtime >>>>> >>>>>>>>>> ---------------? ---------? -------- >>>>> ------------- ------- >>>>> >>>>>>>>>> ?????????? 34282.00?? 30635.90? 28831.30 >>>>> 20969.20 3841.30 base >>>>> >>>>>>>>>> ?????????? 34282.00?? 30973.00? 29345.80 >>>>> 21025.20 3964.10 v2.07 >>>>> >>>>>>>>>> ?????????? 34282.00?? 31105.60? 29174.30 >>>>> 21074.00 3931.30 >>>>> >>>>>>>>>> v2.07_handshake >>>>> >>>>>>>>>> ?????????? 34282.00?? 30789.70? 27151.60 >>>>> 19839.10 3850.20 >>>>> >>>>>>>>>> v2.07_off >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> - The Aurora Perf comparison tool reports: >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> ??????? Comparison????????????? 
max-jOPS >>>>> critical-jOPS >>>>> >>>>>>>>>> ??????? ---------------------- >>>>> -------------------- >>>>> >>>>>>>>>> -------------------- >>>>> >>>>>>>>>> ??????? base vs 2.07??????????? +1.78% (s, >>>>> p=0.000) +0.27% >>>>> >>>>>>>>>> (ns, p=0.790) >>>>> >>>>>>>>>> ??????? base vs 2.07_handshake? +1.19% (s, >>>>> p=0.007) +0.58% >>>>> >>>>>>>>>> (ns, p=0.536) >>>>> >>>>>>>>>> ??????? base vs 2.07_off??????? -5.83% (ns, >>>>> p=0.394) -5.39% >>>>> >>>>>>>>>> (ns, p=0.347) >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> ??????? (s) - significant? (ns) - not-significant >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> - For historical comparison, the Aurora Perf >>>>> comparision >>>>> >>>>>>>>>> tool >>>>> >>>>>>>>>> ??????? reported for v2.06 with a baseline of >>>>> jdk-13+31: >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> ??????? Comparison????????????? max-jOPS >>>>> critical-jOPS >>>>> >>>>>>>>>> ??????? ---------------------- >>>>> -------------------- >>>>> >>>>>>>>>> -------------------- >>>>> >>>>>>>>>> ??????? base vs 2.06??????????? -0.32% (ns, >>>>> p=0.345) +0.71% >>>>> >>>>>>>>>> (ns, p=0.646) >>>>> >>>>>>>>>> ??????? base vs 2.06_off??????? +0.49% (ns, >>>>> p=0.292) -1.21% >>>>> >>>>>>>>>> (ns, p=0.481) >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> ??????? (s) - significant? (ns) - not-significant >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or suggestions. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Dan >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote: >>>>> >>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The Async Monitor Deflation project has >>>>> rebased to JDK14 so >>>>> >>>>>>>>>>> it's time >>>>> >>>>>>>>>>> for our first code review in that new context!! >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> I've been focused on changing the monitor >>>>> list management >>>>> >>>>>>>>>>> code to be >>>>> >>>>>>>>>>> lock-free in order to make SPECjbb2015 >>>>> happier. 
Of course >>>>> >>>>>>>>>>> with a change >>>>> >>>>>>>>>>> like that, it takes a while to chase down >>>>> all the new and >>>>> >>>>>>>>>>> wonderful >>>>> >>>>>>>>>>> races. At this point, I have the code back >>>>> to the same >>>>> >>>>>>>>>>> stability that >>>>> >>>>>>>>>>> I had with CR5/v2.05/8-for-jdk13. >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> To lay the ground work for this round of >>>>> review, I pushed >>>>> >>>>>>>>>>> the following >>>>> >>>>>>>>>>> two fixes to jdk/jdk earlier today: >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> ??? JDK-8230184 rename, whitespace, indent >>>>> and comments >>>>> >>>>>>>>>>> changes in preparation >>>>> >>>>>>>>>>> ? ? ??????????? for lock free Monitor lists >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8230184 >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> ??? JDK-8230317 >>>>> serviceability/sa/ClhsdbPrintStatics.java >>>>> >>>>>>>>>>> fails after 8230184 >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8230317 >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> I have attached the list of fixes from CR5 >>>>> to CR6 instead of >>>>> >>>>>>>>>>> putting >>>>> >>>>>>>>>>> in the main body of this email. >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8153224 >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The project is currently baselined on >>>>> jdk-14+11 plus the >>>>> >>>>>>>>>>> fixes for >>>>> >>>>>>>>>>> JDK-8230184 and JDK-8230317. 
>>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Here's the full webrev URL for those folks >>>>> that want to see >>>>> >>>>>>>>>>> all of the >>>>> >>>>>>>>>>> current Async Monitor Deflation code in one >>>>> go (v2.06 full): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.06.full/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The primary focus of this review cycle is on >>>>> the lock-free >>>>> >>>>>>>>>>> Monitor List >>>>> >>>>>>>>>>> management changes so here's a webrev for >>>>> just that patch >>>>> >>>>>>>>>>> (v2.06c): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.06c.inc/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The secondary focus of this review cycle is >>>>> on the bug fixes >>>>> >>>>>>>>>>> that have >>>>> >>>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so >>>>> here's a webrev for >>>>> >>>>>>>>>>> just that >>>>> >>>>>>>>>>> patch (v2.06b): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.06b.inc/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The third and final bucket for this review >>>>> cycle is the >>>>> >>>>>>>>>>> rename, whitespace, >>>>> >>>>>>>>>>> indent and comments changes made in >>>>> preparation for lock >>>>> >>>>>>>>>>> free Monitor list >>>>> >>>>>>>>>>> management. Almost all of that was extracted >>>>> into >>>>> >>>>>>>>>>> JDK-8230184 for the >>>>> >>>>>>>>>>> baseline so this bucket now has just a few >>>>> comment changes >>>>> >>>>>>>>>>> relative to >>>>> >>>>>>>>>>> CR5/v2.05/8-for-jdk13. 
Here's a webrev for >>>>> the remainder >>>>> >>>>>>>>>>> (v2.06a): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.06a.inc/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Some folks might want to see just what has >>>>> changed since the >>>>> >>>>>>>>>>> last review >>>>> >>>>>>>>>>> cycle so here's a webrev for that (v2.06 inc): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.06.inc/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Last, but not least, some folks might want >>>>> to see the code >>>>> >>>>>>>>>>> before the >>>>> >>>>>>>>>>> addition of lock-free Monitor List >>>>> management so here's a >>>>> >>>>>>>>>>> webrev for >>>>> >>>>>>>>>>> that (v2.00 -> v2.05): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >> jdk14.v2.05.inc/ >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The OpenJDK wiki will need minor updates to >>>>> match the CR6 >>>>> >>>>>>>>>>> changes: >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> but that should only be changes to describe >>>>> per-thread list >>>>> >>>>>>>>>>> async monitor >>>>> >>>>>>>>>>> deflation being done by the ServiceThread. >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> (I did update the OpenJDK wiki for the CR5 >>>>> changes back on >>>>> >>>>>>>>>>> 2019.08.14) >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-8] >>>>> >>>>>>>>>>> testing on >>>>> >>>>>>>>>>> Oracle's usual set of platforms. It has also >>>>> been through my >>>>> >>>>>>>>>>> usual set >>>>> >>>>>>>>>>> of stress testing on Linux-X64, macOSX and >>>>> Solaris-X64. 
>>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> I did a bunch of SPECjbb2015 testing in >>>>> Oracle's Aurora >>>>> >>>>>>>>>>> Performance lab >>>>> >>>>>>>>>>> using using their tuned SPECjbb2015 >>>>> Linux-X64 G1 configs. >>>>> >>>>>>>>>>> This was using >>>>> >>>>>>>>>>> this patch baselined on jdk-13+31 (for >>>>> stability): >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> ????????? hbIR?????????? hbIR >>>>> >>>>>>>>>>> ???? (max attempted)? (settled)? max-jOPS >>>>> critical-jOPS runtime >>>>> >>>>>>>>>>> ???? ---------------? ---------? -------- >>>>> ------------- ------- >>>>> >>>>>>>>>>> ??????????? 34282.00?? 28837.20? 27905.20 >>>>> 19817.40 3658.10 base >>>>> >>>>>>>>>>> ??????????? 34965.70?? 29798.80? 27814.90 >>>>> 19959.00 3514.60 >>>>> >>>>>>>>>>> v2.06d >>>>> >>>>>>>>>>> ??????????? 34282.00?? 29100.70? 28042.50 >>>>> 19577.00 3701.90 >>>>> >>>>>>>>>>> v2.06d_off >>>>> >>>>>>>>>>> ??????????? 34282.00?? 29218.50? 27562.80 >>>>> 19397.30 3657.60 >>>>> >>>>>>>>>>> v2.06d_ocache >>>>> >>>>>>>>>>> ??????????? 34965.70?? 29838.30? 26512.40 >>>>> 19170.60 3569.90 >>>>> >>>>>>>>>>> v2.05 >>>>> >>>>>>>>>>> ??????????? 34282.00?? 28926.10? 27734.00 >>>>> 19835.10 3588.40 >>>>> >>>>>>>>>>> v2.05_off >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> The "off" configs are with >>>>> -XX:-AsyncDeflateIdleMonitors >>>>> >>>>>>>>>>> specified and >>>>> >>>>>>>>>>> the "ocache" config is with 128 byte cache >>>>> line sizes >>>>> >>>>>>>>>>> instead of 64 byte >>>>> >>>>>>>>>>> cache lines sizes. "v2.06d" is the last set >>>>> of changes that >>>>> >>>>>>>>>>> I made before >>>>> >>>>>>>>>>> those changes were distributed into the >>>>> "v2.06a", "v2.06b" >>>>> >>>>>>>>>>> and "v2.06c" >>>>> >>>>>>>>>>> buckets for this review recycle. >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or suggestions. >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Dan >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> On 7/11/19 3:49 PM, Daniel D. 
Daugherty wrote: >>>>> >>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> I've been focused on chasing down and >>>>> fixing the test >>>>> >>>>>>>>>>>> failures >>>>> >>>>>>>>>>>> that only pop up rarely. So this round is >>>>> primarily fixes >>>>> >>>>>>>>>>>> for races >>>>> >>>>>>>>>>>> with a few additional fixes that came from >>>>> Karen's review >>>>> >>>>>>>>>>>> of CR4. >>>>> >>>>>>>>>>>> Thanks Karen! >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> I have attached the list of fixes from CR4 >>>>> to CR5 instead >>>>> >>>>>>>>>>>> of putting it >>>>> >>>>>>>>>>>> in the main body of this email. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> The project is currently baselined on >>>>> jdk-13+29. This will >>>>> >>>>>>>>>>>> likely be >>>>> >>>>>>>>>>>> the last JDK13 baseline for this project >>>>> and I'll roll to >>>>> >>>>>>>>>>>> the JDK14 >>>>> >>>>>>>>>>>> (jdk/jdk) repo soon...
>>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Here's the full webrev URL: >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- >> jdk13.full/ >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Here's the incremental webrev URL: >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- >> jdk13.inc/ >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> I have not yet checked the OpenJDK wiki to >>>>> see if it needs >>>>> >>>>>>>>>>>> any updates >>>>> >>>>>>>>>>>> to match the CR5 changes: >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 >>>>> changes back on >>>>> >>>>>>>>>>>> 2019.06.26) >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-3] >>>>> >>>>>>>>>>>> testing on >>>>> >>>>>>>>>>>> Oracle's usual set of platforms. Mach5 >>>>> tier[4-6] is running >>>>> >>>>>>>>>>>> now and >>>>> >>>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off >>>>> the usual stress >>>>> >>>>>>>>>>>> testing >>>>> >>>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as >>>>> those machines >>>>> >>>>>>>>>>>> become available. >>>>> >>>>>>>>>>>> Since I haven't made any performance >>>>> changes in this round, >>>>> >>>>>>>>>>>> I'll only >>>>> >>>>>>>>>>>> be running SPECjbb2015 to gather the latest >>>>> >>>>>>>>>>>> monitorinflation logs. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Next up: >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> - We're still seeing 4-5% lower performance >>>>> with >>>>> >>>>>>>>>>>> SPECjbb2015 on >>>>> >>>>>>>>>>>> ? Linux-X64 and we've determined that some >>>>> of that comes from >>>>> >>>>>>>>>>>> ? contention on the gListLock. So I'm going >>>>> to investigate >>>>> >>>>>>>>>>>> removing >>>>> >>>>>>>>>>>> ? the gListLock. 
Yes, another lock free set >>>>> of changes is >>>>> >>>>>>>>>>>> coming! >>>>> >>>>>>>>>>>> - Of course, going lock free often causes >>>>> new races and new >>>>> >>>>>>>>>>>> failures >>>>> >>>>>>>>>>>> ? so that's a good reason for make those >>>>> changes isolated >>>>> >>>>>>>>>>>> in their >>>>> >>>>>>>>>>>> ? own round (and not holding up >>>>> CR5/v2.05/8-for-jdk13 >>>>> >>>>>>>>>>>> anymore). >>>>> >>>>>>>>>>>> - I finally have a potential fix for the >>>>> Win* failure with >>>>> >>>>>>>>>>>> >>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>> >>>>>>>>>>>> ? but I haven't run it through Mach5 yet so >>>>> it'll be in the >>>>> >>>>>>>>>>>> next round. >>>>> >>>>>>>>>>>> - Some RTM tests were recently re-enabled >>>>> in Mach5 and I'm >>>>> >>>>>>>>>>>> seeing some >>>>> >>>>>>>>>>>> ? monitor related failures there. I suspect >>>>> that I need to >>>>> >>>>>>>>>>>> go take a >>>>> >>>>>>>>>>>> ? look at the C2 RTM macro assembler code >>>>> and look for >>>>> >>>>>>>>>>>> things that might >>>>> >>>>>>>>>>>> ? conflict if Async Monitor Deflation. If >>>>> you're interested >>>>> >>>>>>>>>>>> in that kind >>>>> >>>>>>>>>>>> ? of issue, then see the >>>>> macroAssembler_x86.cpp sanity >>>>> >>>>>>>>>>>> check that I >>>>> >>>>>>>>>>>> ? added in this round! >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>> suggestions. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote: >>>>> >>>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> I have a fix for an issue that came up >>>>> during performance >>>>> >>>>>>>>>>>>> testing. >>>>> >>>>>>>>>>>>> Many thanks to Robbin for diagnosing the >>>>> issue in his >>>>> >>>>>>>>>>>>> SPECjbb2015 >>>>> >>>>>>>>>>>>> experiments. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Here's the list of changes from CR3 to >>>>> CR4. 
The list is a bit >>>>> >>>>>>>>>>>>> verbose due to the complexity of the >>>>> issue, but the changes >>>>> >>>>>>>>>>>>> themselves are not that big. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Functional: >>>>> >>>>>>>>>>>>> ? - Change >>>>> SafepointSynchronize::is_cleanup_needed() from >>>>> >>>>>>>>>>>>> calling >>>>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() to >>>>> calling >>>>> >>>>>>>>>>>>> >>>>> ObjectSynchronizer::is_safepoint_deflation_needed(): >>>>> >>>>>>>>>>>>> ??? - is_safepoint_deflation_needed() >>>>> returns the result of >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() for >>>>> safepoint based >>>>> >>>>>>>>>>>>> ????? monitor deflation >>>>> (!AsyncDeflateIdleMonitors). >>>>> >>>>>>>>>>>>> ??? - For AsyncDeflateIdleMonitors, it >>>>> only returns true if >>>>> >>>>>>>>>>>>> ????? there is a special deflation >>>>> request, e.g., System.gc() >>>>> >>>>>>>>>>>>> ????? - This solves a bug where there are >>>>> a bunch of Cleanup >>>>> >>>>>>>>>>>>> ??????? safepoints that simply request >>>>> async deflation which >>>>> >>>>>>>>>>>>> ??????? keeps the async JavaThreads from >>>>> making progress on >>>>> >>>>>>>>>>>>> ??????? their async deflation work. >>>>> >>>>>>>>>>>>> ? - Add AsyncDeflationInterval diagnostic >>>>> option. >>>>> >>>>>>>>>>>>> Description: >>>>> >>>>>>>>>>>>> ????? Async deflate idle monitors every so >>>>> many >>>>> >>>>>>>>>>>>> milliseconds when >>>>> >>>>>>>>>>>>> MonitorUsedDeflationThreshold is exceeded >>>>> (0 is off). >>>>> >>>>>>>>>>>>> ? - Replace >>>>> >>>>>>>>>>>>> >>>>> ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with >>>>> >>>>>>>>>>>>> >>>>> ObjectSynchronizer::is_async_deflation_needed(): >>>>> >>>>>>>>>>>>> ??? - is_async_deflation_needed() returns >>>>> true when >>>>> >>>>>>>>>>>>> is_async_cleanup_requested() is true or when >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() is true >>>>> (but no more >>>>> >>>>>>>>>>>>> often than >>>>> >>>>>>>>>>>>> AsyncDeflationInterval). 
>>>>> >>>>>>>>>>>>> ??? - if AsyncDeflateIdleMonitors >>>>> Service_lock->wait() now >>>>> >>>>>>>>>>>>> waits for >>>>> >>>>>>>>>>>>> ????? at most GuaranteedSafepointInterval >>>>> millis: >>>>> >>>>>>>>>>>>> ????? - This allows >>>>> is_async_deflation_needed() to be >>>>> >>>>>>>>>>>>> checked at >>>>> >>>>>>>>>>>>> ??????? the same interval as >>>>> GuaranteedSafepointInterval. >>>>> >>>>>>>>>>>>> ??????? (default is 1000 millis/1 second) >>>>> >>>>>>>>>>>>> ????? - Once is_async_deflation_needed() >>>>> has returned >>>>> >>>>>>>>>>>>> true, it >>>>> >>>>>>>>>>>>> ??????? generally cannot return true for >>>>> >>>>>>>>>>>>> AsyncDeflationInterval. >>>>> >>>>>>>>>>>>> ??????? This is to prevent async deflation >>>>> from swamping the >>>>> >>>>>>>>>>>>> ServiceThread. >>>>> >>>>>>>>>>>>> ? - The ServiceThread still handles async >>>>> deflation of the >>>>> >>>>>>>>>>>>> global >>>>> >>>>>>>>>>>>> ??? in-use list and now it also marks >>>>> JavaThreads for >>>>> >>>>>>>>>>>>> async deflation >>>>> >>>>>>>>>>>>> ??? of their in-use lists. >>>>> >>>>>>>>>>>>> ??? - The ServiceThread will check for >>>>> async deflation >>>>> >>>>>>>>>>>>> work every >>>>> >>>>>>>>>>>>> GuaranteedSafepointInterval. >>>>> >>>>>>>>>>>>> ??? - A safepoint can still cause the >>>>> ServiceThread to >>>>> >>>>>>>>>>>>> check for >>>>> >>>>>>>>>>>>> ????? async deflation work via >>>>> is_async_deflation_requested. >>>>> >>>>>>>>>>>>> ? - Refactor code from >>>>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() into >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() and remove >>>>> >>>>>>>>>>>>> is_cleanup_needed(). >>>>> >>>>>>>>>>>>> ? - In addition to System.gc(), the >>>>> VM_Exit VM op and the >>>>> >>>>>>>>>>>>> final >>>>> >>>>>>>>>>>>> ??? VMThread safepoint now set the >>>>> >>>>>>>>>>>>> is_special_deflation_requested >>>>> >>>>>>>>>>>>> ??? 
flag to reduce the in-use monitor >>>>> population that is >>>>> >>>>>>>>>>>>> reported by >>>>> >>>>>>>>>>>>> >>>>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Test update: >>>>> >>>>>>>>>>>>> ? - >>>>> test/hotspot/gtest/oops/test_markOop.cpp is updated to >>>>> >>>>>>>>>>>>> work with >>>>> >>>>>>>>>>>>> AsyncDeflateIdleMonitors. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Collateral: >>>>> >>>>>>>>>>>>> ? - Add/clarify/update some logging messages. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Cleanup: >>>>> >>>>>>>>>>>>> ? - Updated comments based on Karen's code >>>>> review. >>>>> >>>>>>>>>>>>> ? - Change 'special cleanup' -> 'special >>>>> deflation' and >>>>> >>>>>>>>>>>>> ??? 'async cleanup' -> 'async deflation'. >>>>> >>>>>>>>>>>>> ??? - comment and function name changes >>>>> >>>>>>>>>>>>> ? - Clarify MonitorUsedDeflationThreshold >>>>> description; >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> The project is currently baselined on >>>>> jdk-13+22. 
>>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Here's the full webrev URL: >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- >> jdk13.full/ >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Here's the incremental webrev URL: >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- >> jdk13.inc/ >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> I have not updated the OpenJDK wiki to >>>>> reflect the CR4 >>>>> >>>>>>>>>>>>> changes: >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> The wiki doesn't say a whole lot about the >>>>> async deflation >>>>> >>>>>>>>>>>>> invocation >>>>> >>>>>>>>>>>>> mechanism so I have to figure out how to >>>>> add that content. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-8] >>>>> >>>>>>>>>>>>> testing on >>>>> >>>>>>>>>>>>> Oracle's usual set of platforms. My >>>>> Solaris-X64 stress kit >>>>> >>>>>>>>>>>>> run is >>>>> >>>>>>>>>>>>> running now. Kitchensink8H on product, >>>>> fastdebug, and >>>>> >>>>>>>>>>>>> slowdebug bits >>>>> >>>>>>>>>>>>> are running on Linux-X64, MacOSX and >>>>> Solaris-X64. I still >>>>> >>>>>>>>>>>>> have to run >>>>> >>>>>>>>>>>>> my stress kit on Linux-X64. I still have >>>>> to run the >>>>> >>>>>>>>>>>>> SPECjbb2015 >>>>> >>>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX >>>>> and Solaris-X64. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>>> suggestions. >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. 
Daugherty wrote: >>>>> >>>>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> I had some discussions with Karen about a >>>>> race that was >>>>> >>>>>>>>>>>>>> in the >>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() code in >>>>> CR2/v2.02/5-for-jdk13. >>>>> >>>>>>>>>>>>>> This race was >>>>> >>>>>>>>>>>>>> theoretical and I had no test failures >>>>> due to it. The fix >>>>> >>>>>>>>>>>>>> is pretty >>>>> >>>>>>>>>>>>>> simple: remove the special case code for >>>>> async deflation >>>>> >>>>>>>>>>>>>> in the >>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() function and rely >>>>> solely on the >>>>> >>>>>>>>>>>>>> ref_count >>>>> >>>>>>>>>>>>>> for ObjectMonitor::enter() protection. >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> During those discussions Karen also >>>>> floated the idea of >>>>> >>>>>>>>>>>>>> using the >>>>> >>>>>>>>>>>>>> ref_count field instead of the >>>>> contentions field for the >>>>> >>>>>>>>>>>>>> Async >>>>> >>>>>>>>>>>>>> Monitor Deflation protocol. I decided to >>>>> go ahead and >>>>> >>>>>>>>>>>>>> code up that >>>>> >>>>>>>>>>>>>> change and I have run it through the >>>>> usual stress and >>>>> >>>>>>>>>>>>>> Mach5 testing >>>>> >>>>>>>>>>>>>> with no issues. It's also known as v2.03 >>>>> (for those >>>>> >>>>>>>>>>>>>> with the >>>>> >>>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for >>>>> those with webrev >>>>> >>>>>>>>>>>>>> URLs). >>>>> >>>>>>>>>>>>>> Sorry for all the names... >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> The project is currently baselined on >>>>> jdk-13+18.
>>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> Here's the full webrev URL: >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- >> jdk13.full/ >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> Here's the incremental webrev URL: >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- >> jdk13.inc/ >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> I have also updated the OpenJDK wiki to >>>>> reflect the CR3 >>>>> >>>>>>>>>>>>>> changes: >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-8] >>>>> >>>>>>>>>>>>>> testing on >>>>> >>>>>>>>>>>>>> Oracle's usual set of platforms. My >>>>> Solaris-X64 stress >>>>> >>>>>>>>>>>>>> kit run had >>>>> >>>>>>>>>>>>>> no issues. Kitchensink8H on product, >>>>> fastdebug, and >>>>> >>>>>>>>>>>>>> slowdebug bits >>>>> >>>>>>>>>>>>>> had no failures on Linux-X64; MacOSX >>>>> fastdebug and >>>>> >>>>>>>>>>>>>> slowdebug and >>>>> >>>>>>>>>>>>>> Solaris-X64 release had the usual "Too >>>>> large time diff" >>>>> >>>>>>>>>>>>>> complaints. >>>>> >>>>>>>>>>>>>> 12 hour Inflate2 runs on product, >>>>> fastdebug and slowdebug >>>>> >>>>>>>>>>>>>> bits on >>>>> >>>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no >>>>> failures. My >>>>> >>>>>>>>>>>>>> Linux-X64 >>>>> >>>>>>>>>>>>>> stress kit is running right now. >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> I've done the SPECjbb2015 baseline and >>>>> CR3 runs. I need >>>>> >>>>>>>>>>>>>> to gather >>>>> >>>>>>>>>>>>>> the results and analyze them. >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>>>> suggestions. 
>>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty >>>>> wrote: >>>>> >>>>>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> I have a small but important bug fix for >>>>> the Async >>>>> >>>>>>>>>>>>>>> Monitor Deflation >>>>> >>>>>>>>>>>>>>> project ready to go. It's also known as >>>>> v2.02 (for those >>>>> >>>>>>>>>>>>>>> for with the >>>>> >>>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for >>>>> those with >>>>> >>>>>>>>>>>>>>> webrev URLs). Sorry >>>>> >>>>>>>>>>>>>>> for all the names... >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two >>>>> days ago so that >>>>> >>>>>>>>>>>>>>> baseline patch >>>>> >>>>>>>>>>>>>>> is out of our hair. >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> The project is currently baselined on >>>>> jdk-13+17. 
>>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> Here's the full webrev URL: >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- >> jdk13.full/ >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> Here's the incremental webrev URL >>>>> (JDK-8153224): >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- >> jdk13.inc/ >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> I still have to update the OpenJDK wiki >>>>> to reflect the >>>>> >>>>>>>>>>>>>>> CR2 changes: >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-6] >>>>> >>>>>>>>>>>>>>> testing on >>>>> >>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 >>>>> tier[7-8] is >>>>> >>>>>>>>>>>>>>> running now. >>>>> >>>>>>>>>>>>>>> My stress kit is running on Solaris-X64 >>>>> now. >>>>> >>>>>>>>>>>>>>> Kitchensink8H is running >>>>> >>>>>>>>>>>>>>> now on product, fastdebug, and slowdebug >>>>> bits on >>>>> >>>>>>>>>>>>>>> Linux-X64, MacOSX >>>>> >>>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs >>>>> are running now >>>>> >>>>>>>>>>>>>>> on product, >>>>> >>>>>>>>>>>>>>> fastdebug and slowdebug bits on >>>>> Linux-X64, MacOSX and >>>>> >>>>>>>>>>>>>>> Solaris-X64. >>>>> >>>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 >>>>> sometime on >>>>> >>>>>>>>>>>>>>> Sunday (after >>>>> >>>>>>>>>>>>>>> my jdk-13+18 stress run is done). >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 >>>>> runs after all the >>>>> >>>>>>>>>>>>>>> stress >>>>> >>>>>>>>>>>>>>> testing is done. >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>>>>> suggestions. 
>>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty >>>>> wrote: >>>>> >>>>>>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> I finally have CR1 for the Async >>>>> Monitor Deflation >>>>> >>>>>>>>>>>>>>>> project ready to >>>>> >>>>>>>>>>>>>>>> go. It's also known as v2.01 (for those >>>>> for with the >>>>> >>>>>>>>>>>>>>>> patches) and as >>>>> >>>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with >>>>> webrev URLs). Sorry >>>>> >>>>>>>>>>>>>>>> for all the >>>>> >>>>>>>>>>>>>>>> names... >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Main bug URL: >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Baseline bug fixes URL: >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> JDK-8222295 more baseline cleanups from >>>>> Async >>>>> >>>>>>>>>>>>>>>> Monitor Deflation project >>>>> >>>>>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8222295 >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> The project is currently baselined on >>>>> jdk-13+15. 
>>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Here's the webrev for the latest >>>>> baseline changes >>>>> >>>>>>>>>>>>>>>> (JDK-8222295): >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >> jdk13.8222295 >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 >>>>> only): >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >> jdk13.full/ >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Here's the incremental webrev URL >>>>> (JDK-8153224): >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >> jdk13.inc/ >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> So I'm looking for reviews for both >>>>> JDK-8222295 and the >>>>> >>>>>>>>>>>>>>>> latest version >>>>> >>>>>>>>>>>>>>>> of JDK-8153224... >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> I still have to update the OpenJDK wiki >>>>> to reflect the >>>>> >>>>>>>>>>>>>>>> CR changes: >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> This version of the patch has been thru >>>>> Mach5 tier[1-3] >>>>> >>>>>>>>>>>>>>>> testing on >>>>> >>>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 >>>>> tier[4-6] is >>>>> >>>>>>>>>>>>>>>> running now and >>>>> >>>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. >>>>> My stress kit >>>>> >>>>>>>>>>>>>>>> on Solaris-X64 >>>>> >>>>>>>>>>>>>>>> is running now. Linux-X64 stress >>>>> testing will start on >>>>> >>>>>>>>>>>>>>>> Sunday. I'm >>>>> >>>>>>>>>>>>>>>> planning to do Kitchensink runs, >>>>> SPECjbb2015 runs and >>>>> >>>>>>>>>>>>>>>> my monitor >>>>> >>>>>>>>>>>>>>>> inflation stress tests on Linux-X64, >>>>> MacOSX and >>>>> >>>>>>>>>>>>>>>> Solaris-X64. 
>>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>>>>>> suggestions. >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty >>>>> wrote: >>>>> >>>>>>>>>>>>>>>>> Greetings, >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread >>>>> for my port of >>>>> >>>>>>>>>>>>>>>>> Carsten's work on: >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>> safepoints >>>>> >>>>>>>>>>>>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that >>>>> describes my port: >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Here's the webrev URL: >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for- >> jdk13/ >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Here's a link to Carsten's original >>>>> webrev: >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Earlier versions of this patch have >>>>> been through >>>>> >>>>>>>>>>>>>>>>> several rounds of >>>>> >>>>>>>>>>>>>>>>> preliminary review. Many thanks to >>>>> Carsten, Coleen, >>>>> >>>>>>>>>>>>>>>>> Robbin, and >>>>> >>>>>>>>>>>>>>>>> Roman for their preliminary code >>>>> review comments. A >>>>> >>>>>>>>>>>>>>>>> very special >>>>> >>>>>>>>>>>>>>>>> thanks to Robbin and Roman for >>>>> building and testing >>>>> >>>>>>>>>>>>>>>>> the patch in >>>>> >>>>>>>>>>>>>>>>> their own environments (including >>>>> specJBB2015). 
>>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> This version of the patch has been >>>>> thru Mach5 >>>>> >>>>>>>>>>>>>>>>> tier[1-8] testing on >>>>> >>>>>>>>>>>>>>>>> Oracle's usual set of platforms. >>>>> Earlier versions have >>>>> >>>>>>>>>>>>>>>>> been run >>>>> >>>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 >>>>> and Solaris-X64 >>>>> >>>>>>>>>>>>>>>>> servers >>>>> >>>>>>>>>>>>>>>>> (product, fastdebug, >>>>> slowdebug).Earlier versions have >>>>> >>>>>>>>>>>>>>>>> run Kitchensink >>>>> >>>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and >>>>> Solaris-X64 >>>>> >>>>>>>>>>>>>>>>> (product, fastdebug >>>>> >>>>>>>>>>>>>>>>> and slowdebug). Earlier versions have >>>>> run my monitor >>>>> >>>>>>>>>>>>>>>>> inflation stress >>>>> >>>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, >>>>> Linux-X64 and >>>>> >>>>>>>>>>>>>>>>> Solaris-X64 (product, >>>>> >>>>>>>>>>>>>>>>> fastdebug and slowdebug). >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> All of the testing done on earlier >>>>> versions will be >>>>> >>>>>>>>>>>>>>>>> redone on the >>>>> >>>>>>>>>>>>>>>>> latest version of the patch. >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>> comments or >>>>> >>>>>>>>>>>>>>>>> suggestions. >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> Dan >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> P.S. >>>>> >>>>>>>>>>>>>>>>> One subtest in >>>>> >>>>>>>>>>>>>>>>> >>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>> >>>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on >>>>> Win* only. I've >>>>> >>>>>>>>>>>>>>>>> been trying >>>>> >>>>>>>>>>>>>>>>> to characterize/analyze this failure >>>>> for more than a >>>>> >>>>>>>>>>>>>>>>> week now. At >>>>> >>>>>>>>>>>>>>>>> this point I'm convinced that Async >>>>> Monitor Deflation >>>>> >>>>>>>>>>>>>>>>> is aggravating >>>>> >>>>>>>>>>>>>>>>> an existing bug. 
However, I plan to >>>>> have a better >>>>> >>>>>>>>>>>>>>>>> handle on that >>>>> >>>>>>>>>>>>>>>>> failure before these bits are pushed >>>>> to the jdk/jdk repo. >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>>> >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>> >>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>>>> >>> >>>>> >> >>>>> From martin.doerr at sap.com Tue Sep 15 16:32:20 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 15 Sep 2020 16:32:20 +0000 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15) In-Reply-To: <158ce5cc-3869-737b-b644-a82c962949cc@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <2a8976f7-37e0-03b9-3099-e07464e46512@oracle.com> <5681d640-08c8-3433-0f85-3f23eea69e87@oracle.com> <029b596d-46e5-9fa7-38fd-c34d3a32987b@oracle.com> <54fd4f9c-afef-0819-de2d-b81b25fa6c22@oracle.com> <79bfb73a-20d7-7b85-4a84-dd22b150ed0d@oracle.com> <9fcc131c-dfbd-b943-381b-2ea8d854fcd7@oracle.com> <3b76e50f-fd8f-d61b-272a-27338df99094@oracle.com> <1d6a6087-b82d-96cf-bfb4-87cb03869bd6@oracle.com> <158ce5cc-3869-737b-b644-a82c962949cc@oracle.com> Message-ID: Thank you, Dan! I've created https://bugs.openjdk.java.net/browse/JDK-8253183 Feel free to modify/assign. Best regards, Martin > -----Original Message----- > From: Daniel D. Daugherty > Sent: Tuesday, September 15, 2020 16:59 > To: Doerr, Martin ; Carsten Varming > ; Erik Österlund > Cc: Roman Kennke ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints > (CR14/v2.14/17-for-jdk15) > > Hi Martin, > > I believe that the support_IRIW_for_not_multiple_copy_atomic_cpu stuff > came from Erik O. so I'm adding him to this email thread.
> > Yes, please create an issue that describes the problem and we'll > figure out who should take the issue... > > Dan > > > On 9/15/20 10:52 AM, Doerr, Martin wrote: > > Hi Dan and Carsten, > > > > I just noticed that this change introduced 2 usages of > "support_IRIW_for_not_multiple_copy_atomic_cpu". > > I think this is incorrect for arm32 which is not multi-copy-atomic, but uses > support_IRIW_for_not_multiple_copy_atomic_cpu = false. > > You probably meant "#ifdef CPU_MULTI_COPY_ATOMIC"? > > > > I haven't studied the access patterns you were trying to fix, but this looks > wrong. > > Should I create an issue? Would be great if I could assign it to somebody > familiar with this new code. > > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: hotspot-runtime-dev >> bounces at openjdk.java.net> On Behalf Of Daniel D. Daugherty > >> Sent: Tuesday, June 2, 2020 21:25 > >> To: Carsten Varming > >> Cc: Roman Kennke ; hotspot-runtime-dev at openjdk.java.net > >> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints > >> (CR14/v2.14/17-for-jdk15) > >> > >> Hi Carsten, > >> > >> Thanks for the fast review of the updated comments. > >> > >> I filed the following new bug to track the change: > >> > >> JDK-8246359 clarify confusing comment in ObjectMonitor::EnterI()'s > >> race with async deflation > >> https://bugs.openjdk.java.net/browse/JDK-8246359 > >> > >> And I started a review thread for the fix under that new bug ID. > >> > >> Dan > >> > >> > >> On 6/2/20 2:13 PM, Carsten Varming wrote: > >>> Hi Dan, > >>> > >>> I like the new comment. Thank you for doing the update. > >>> > >>> Carsten > >>> > >>> On Tue, Jun 2, 2020 at 1:54 PM Daniel D. Daugherty > >>> > > >> wrote: > >>> Hi Carsten, > >>> > >>> See replies below... > >>> > >>> David, Erik and Robbin, if you folks could also check out the revised > >>> comment below that would be appreciated.
> >>> > >>> > >>> On 6/2/20 9:39 AM, Carsten Varming wrote: > >>>> Hi Dan, > >>>> > >>>> See inline. > >>>> > >>>> On Mon, Jun 1, 2020 at 11:32 PM Daniel D. Daugherty > >>>> >>>> > wrote: > >>>> > >>>> Hi Carsten, > >>>> > >>>> Thanks for chiming in on this review thread!! > >>>> > >>>> > >>>> It is my pleasure. You know the code is solid when the discussion > >>>> is focused on the comments. > >>> So true, so very true! > >>> > >>> > >>>> On 6/1/20 10:41 PM, Carsten Varming wrote: > >>>>> Hi Dan, > >>>>> > >>>>> I like the new protocol, but I had to think about how the > >>>>> extra increment to _contentions replaced the check on _owner > >>>>> that I originally?added. > >>>> Right. The check on _owner was described in detail in the > >>>> OpenJDK wiki > >>>> subsection that was called "T-enter Wins By A-B-A". It can > >>>> still be > >>>> found by going thru the wiki's history links. > >>>> > >>>> That subsection was renamed and rewritten and can be found > here: > >>>> > >>>> > >> > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A > >> syncMonitorDeflation-T- > >> enterWinsByCancellationViaDEFLATER_MARKERSwap > >>>> > >>>>> I am thinking that the increased _contention value is a > >>>>> little mark left on the ObjectMonitor to signal to the > >>>>> deflater thread (which must be in the middle of trying to > >>>>> acquire the object monitor as _owner was set to > >>>>> DEFLATER_MARKER) that the deflater thread lost the race. > >>>> That is exactly what the extra increment is being used for. > >>>> > >>>> In my reply to David H. that you quoted below, I describe the > >>>> progression > >>>> of contention values thru the two possible race scenarios. > >>>> The progression > >>>> shows the T-enter thread winning the race and marking the > >>>> contention field > >>>> with the extra increment while the T-deflater thread > >>>> recognizes that it has > >>>> lost the race and unmarks the contention field with an extra > >>>> decrement. 
> >>>> > >>>> > >>>> I noticed that. Looks like David and I were racing and David won. :) > >>>> > >>>>> That little mark stays with the object monitor long after > >>>>> the thread is done with the monitor. > >>>> The "little mark" stays with the ObjectMonitor after T-enter > >>>> is done > >>>> entering until the T-deflater thread recognizes that the > >>>> async deflation > >>>> was canceled and does an extra decrement. I don't think I > >>>> would describe > >>>> it as "long after". > >>>> > >>>> > >>>> Sorry about the use of "long after". When I think about the > >>>> correctness of protocols, like the deflation protocol, I end up > >>>> thinking about sequences of instructions and the relevant > >>>> interleavings. In that context I often end up using phrases like > >>>> "long after" and "after" to mean anything after a particular > >>>> instruction. I did not mean to imply anything about the relative > >>>> speed of the execution of the code. > >>> It's okay. I do something similar in the transaction diagrams that > >>> I use to work out timing issues: ... > >>> > >>> The only point that I was trying to make is that the T-deflate thread > >>> is responsible for cleaning up the extra mark and it's committed to > >>> the code path that will result in the cleanup. Yes, there may be a > >>> delay between the time that T-deflate recognizes that async > >>> deflation was canceled and when T-deflate does the extra decrement, > >>> but I don't see any harm in it. > >>> > >>> > >>>>> It might be worth adding a comment to the code explaining > >>>>> that after the increment, the _contention field can only be > >>>>> set to 0 by a corresponding decrement in the async deflater > >>>>> thread, ensuring that the > >>>>> Atomic::cmpxchg(&mid->_contentions, (jint)0, -max_jint) on > >>>>> line 2166 fails. In particular, the comment: > >>>>> +. // ....
We bump contentions an > >>>>> + // extra time to prevent the async deflater thread from > >>>>> temporarily > >>>>> + // changing it to -max_jint and back to zero (no flicker > >>>>> to confuse > >>>>> + // is_being_async_deflated() > >>>>> confused me as after the deflater thread sets _contentions > >>>>> to -max_jint, the?deflater thread has won the race and the > >>>>> object monitor is about to be deflated. > >>>> For context, here's the code and comment being discussed: > >>>> > >>>>> 527 if (AsyncDeflateIdleMonitors && > >>>>> 528 try_set_owner_from(DEFLATER_MARKER, Self) == > >> DEFLATER_MARKER) { > >>>>> 529 // Cancelled the in-progress async deflation. We bump > >>>>> contentions an > >>>>> 530 // extra time to prevent the async deflater thread from > >>>>> temporarily > >>>>> 531 // changing it to -max_jint and back to zero (no flicker > >>>>> to confuse > >>>>> 532 // is_being_async_deflated()). The async deflater thread > >>>>> will > >>>>> 533 // decrement contentions after it recognizes that the async > >>>>> 534 // deflation was cancelled. > >>>>> 535 add_to_contentions(1); > >>>> This part of the new comment: > >>>> > >>>> ?532???? // ...? The async deflater thread will > >>>> ?533???? // decrement contentions after it recognizes that > >>>> the async > >>>> ?534???? // deflation was cancelled. > >>>> > >>>> makes it clear that the async deflater thread does the > >>>> corresponding decrement > >>>> to the increment done by the T-enter thread so that covers > >>>> this part of your > >>>> comment above: > >>>> > >>>> ??? the _contention field can only be set to 0 by a > >>>> corresponding decrement > >>>> ??? in the async deflater thread > >>>> > >>>> This part of the new comment: > >>>> > >>>> ?529???? // ...? We bump contentions an > >>>> ?530???? // extra time to prevent the async deflater thread > >>>> from temporarily > >>>> ?531???? // changing it to -max_jint and back to zero (no > >>>> flicker to confuse > >>>> ?532???? 
// is_being_async_deflated()). > >>>> > >>>> makes it clear that we're keeping the make-contentions-negative > >>>> part of the > >>>> async deflation protocol from happening so that covers this > >>>> part of your > >>>> comment above: > >>>> > >>>> ensuring that the Atomic::cmpxchg(&mid->_contentions, > >>>> (jint)0, -max_jint) > >>>> on line 2166 fails. > >>>> > >>>> This part of your comment above makes it clear where the > >>>> confusion arises: > >>>> > >>>> confused me as after the deflater thread sets > >>>> _contentions to -max_jint, > >>>> the deflater thread has won the race and the object > >>>> monitor is about to > >>>> be deflated. > >>>> > >>>> Your original algorithm is a three-part async deflation protocol: > >>>> > >>>> Part 1 - set owner field to DEFLATER_MARKER > >>>> Part 2 - make a zero contentions field -max_jint > >>>> Part 3 - check to see if the owner field is still DEFLATER_MARKER > >>>> > >>>> If part 3 fails, then the contentions field that is currently > >>>> negative > >>>> has max_jint added to it to complete the bail out process. > >>>> It's that > >>>> third part that makes the contentions field flicker from: > >>>> > >>>> 0 -> -max_jint -> 0 > >>>> > >>>> And the extra contentions increment in the new two part > >>>> protocol solves > >>>> that flicker and allows us to treat (contentions < 0) as a > >>>> linearization > >>>> point. > >>>> > >>>> Please let me know if this clarifies your concern. > >>>> > >>>> > >>>> I am no longer confused, but the cause of my confusion is still > >>>> present in the comment. > >>>> > >>>> This group knows about the three part algorithm, but when the > >>>> code is pushed there is no representation of the three part > >>>> algorithm in the code or repository. > >>> That's a really good point and a side effect of my living with this > >>> code for a very long time...
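The 0 -> -max_jint -> 0 flicker of the three-part protocol can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: ToyMonitor, kDeflaterMarker, kEnterThread and replay_three_part_cancellation() are hypothetical names and not HotSpot code, and the interleaving used (T-enter swaps the owner and finishes its regular increment and decrement between part 1 and part 2) is just one ordering that appears consistent with the three parts listed above.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <vector>

// Toy stand-in for an ObjectMonitor; hypothetical, not HotSpot code.
struct ToyMonitor {
  std::atomic<int> owner{0};        // 0 means "unowned"
  std::atomic<int> contentions{0};
};

constexpr int kDeflaterMarker = -1;  // stand-in for DEFLATER_MARKER
constexpr int kEnterThread = 1;      // stand-in for Self
constexpr int kMaxJint = INT_MAX;    // stand-in for max_jint

// Replays one fixed interleaving of the old three-part protocol and returns
// the initial contentions value followed by every value stored afterwards.
std::vector<int> replay_three_part_cancellation() {
  ToyMonitor m;
  std::vector<int> trace{m.contentions.load()};

  // T-deflate, part 1: claim the unowned monitor with the marker.
  int unowned = 0;
  const bool claimed = m.owner.compare_exchange_strong(unowned, kDeflaterMarker);
  assert(claimed);

  // T-enter: regular increment, swap the marker for itself, then the regular
  // decrement once it owns the monitor. There is no extra increment here.
  trace.push_back(++m.contentions);
  int marker = kDeflaterMarker;
  const bool cancelled = m.owner.compare_exchange_strong(marker, kEnterThread);
  assert(cancelled);
  trace.push_back(--m.contentions);

  // T-deflate, part 2: contentions is back at 0, so this CAS succeeds.
  int zero = 0;
  if (m.contentions.compare_exchange_strong(zero, -kMaxJint)) {
    trace.push_back(m.contentions.load());
    // T-deflate, part 3: the owner is no longer the marker, so bail out by
    // adding max_jint back.
    if (m.owner.load() != kDeflaterMarker) {
      trace.push_back(m.contentions.fetch_add(kMaxJint) + kMaxJint);
    }
  }
  return trace;
}
```

In this model the trace is 0, 1, 0, -max_jint, 0, and the last three values are the flicker. With the extra increment of the two-part protocol, contentions would still be 1 when part 2 runs, so that CAS would fail and the field would never go negative.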
> >>> > >>> > >>>> I forgot the details of the algorithm and read the latest version > >>>> of the code to figure out what the flickering was about. As you > >>>> would expect, I found that there is no way the code can cause the > >>>> flicker mentioned. That made me worried. I started to question > >>>> myself: What can cause the behavior that is described in the > >>>> comments? What am I missing? As a result, I think it is best if > >>>> we keep the flickering to ourselves and update the comment to > >>>> describe that because _owner was DEFLATER_MARKER the deflation > >>>> thread must be in the middle of the protocol for deflating the > >>>> object monitor, and in particular, incrementing _contentions > >>>> ensures the failure of the final CAS in the deflation protocol > >>>> (final in the protocol implemented in the code). > >>> The above is a clearer expression of your concerns and I agree. > >>> > >>> > >>>> To be clear: > >>>> > >>>> > 529 // Cancelled the in-progress async deflation. > >>>> > >>>> I would expand this comment by mentioning that the deflater > >>>> thread cannot win the last part of the 2-part deflation protocol > >>>> as 0 < _contentions (pre-condition to this method). > >>>> > >>>> > We bump contentions an > >>>> > 530 // extra time to prevent the async deflater thread from temporarily > >>>> > 531 // changing it to -max_jint and back to zero (no flicker to confuse > >>>> > 532 // is_being_async_deflated()). > >>>> > >>>> I would replace this part with something along the lines of: We > >>>> bump contentions an extra time to prevent the deflater thread > >>>> from winning the last part of the (2-part) deflation protocol > >>>> after this thread decrements _contentions as part of the release > >>>> of the object monitor. > >>>> > >>>> > The async deflater thread will > >>>> > 533 // decrement contentions after it recognizes that the async > >>>> > 534 // deflation was cancelled. > >>>> > >>>> I would keep this part.
> >>> So here's my rewrite of the code and comment block: > >>> > >>> if (AsyncDeflateIdleMonitors && > >>> try_set_owner_from(DEFLATER_MARKER, Self) == DEFLATER_MARKER) { > >>> // Cancelled the in-progress async deflation by changing owner from > >>> // DEFLATER_MARKER to Self. As part of the contended enter protocol, > >>> // contentions was incremented to a positive value before EnterI() > >>> // was called and that prevents the deflater thread from winning the > >>> // last part of the 2-part async deflation protocol. After EnterI() > >>> // returns to enter(), contentions is decremented because the caller > >>> // now owns the monitor. We bump contentions an extra time here to > >>> // prevent the deflater thread from winning the last part of the > >>> // 2-part async deflation protocol after the regular decrement > >>> // occurs in enter(). The deflater thread will decrement contentions > >>> // after it recognizes that the async deflation was cancelled. > >>> add_to_contentions(1); > >>> > >>> I've made this change to both places in EnterI() that had the original > >>> confusing comment. > >>> > >>> Please let me know if this rewrite works for everyone. > >>> > >>> Since I've already pushed 8153224, I'll file a new bug to push this > >>> clarification once we're all in agreement here. > >>> > >>> Dan > >>> > >>> > >>>> I hope this helps, > >>>> Carsten > >>>> > >>>>> Otherwise, the code looks great. I am looking forward to > >>>>> seeing it in the repo. > >>>> Thanks! The code should be there soon. > >>>> > >>>> Dan > >>>> > >>>> > >>>>> Carsten > >>>>> > >>>>> On Mon, Jun 1, 2020 at 8:32 PM Daniel D. Daugherty > >>>>> >>>>> > wrote: > >>>>> > >>>>> Hi David, > >>>>> > >>>>> On 6/1/20 7:58 PM, David Holmes wrote: > >>>>> > Hi Dan, > >>>>> > > >>>>> > Sorry for the delay. > >>>>> > >>>>> No worries.
It's always worth waiting for your code > >>>>> review in general > >>>>> and, with the complexity of this project, it's on my > >>>>> must-do list! > >>>>> > >>>>> > >>>>> > > >>>>> > On 28/05/2020 3:20 am, Daniel D. Daugherty wrote: > >>>>> >> Greetings, > >>>>> >> > >>>>> >> Erik O. had an idea for changing the three part async > >>>>> deflation protocol > >>>>> >> into a two part async deflation protocol where the > >>>>> second part (setting > >>>>> >> the contentions field to -max_jint) is a > >>>>> linearization point. I've taken > >>>>> >> Erik's proposal (which was relative to > >>>>> CR12/v2.12/15-for-jdk15), merged > >>>>> >> it with CR13/v2.13/16-for-jdk15, and made a few minor > >>>>> tweaks. > >>>>> >> > >>>>> >> I have attached the change list from CR13 to CR14 and > >>>>> I've also added a > >>>>> >> link to the CR13-to-CR14-changes file to the webrevs > >>>>> so it should be > >>>>> >> easy > >>>>> >> to find. > >>>>> >> > >>>>> >> Main bug URL: > >>>>> >> > >>>>> >> ???? JDK-8153224 Monitor deflation prolong safepoints > >>>>> >> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >> > >>>>> >> The project is currently baselined on jdk-15+24. > >>>>> >> > >>>>> >> Here's the full webrev URL for those folks that want > >>>>> to see all of the > >>>>> >> current Async Monitor Deflation code in one go (v2.14 > >>>>> full): > >>>>> >> > >>>>> >> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- > >> jdk15+24.v2.14.full/ > >>>>> >> > >>>>> >> > >>>>> >> Some folks might want to see just what has changed > >>>>> since the last review > >>>>> >> cycle so here's a webrev for that (v2.14 inc): > >>>>> >> > >>>>> >> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- > >> jdk15+24.v2.14.inc/ > >>>>> > > >>>>> > > >>>>> > src/hotspot/share/runtime/synchronizer.cpp > >>>>> > > >>>>> > I'm having a little trouble keeping the _contentions > >>>>> relationships in > >>>>> > my head. 
In particular with this change I can't quite > >>>>> grok the: > >>>>> > > >>>>> > // Deferred decrement for the JT EnterI() that > >>>>> cancelled the async > >>>>> > deflation. > >>>>> > mid->add_to_contentions(-1); > >>>>> > > >>>>> > change. I kind of get EnterI() does an extra increment > >>>>> and the > >>>>> > deflator thread does the above matching decrement. But > >>>>> given the two > >>>>> > changes can happen in any order I'm not sure what the > >>>>> possible visible > >>>>> > values for _contentions will be and how that might > >>>>> affect other code > >>>>> > inspecting it? > >>>>> > >>>>> I have a sub-section in the OpenJDK wiki dedicated to > >>>>> this particular race: > >>>>> > >>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-T-enterWinsByCancellationViaDEFLATER_MARKERSwap > >>>>> > >>>>> In order for this race condition to manifest, the > >>>>> T-enter thread has to > >>>>> successfully swap the owner field's DEFLATER_MARKER > >>>>> value for Self. That > >>>>> swap will eventually cause the T-deflate thread to > >>>>> realize that the async > >>>>> deflation that it started has been canceled. > >>>>> > >>>>> The diagram shows the progression of contentions values: > >>>>> > >>>>> - ObjectMonitor box 1 shows contentions == 1 because > >>>>> T-enter incremented > >>>>> the contentions field > >>>>> > >>>>> - ObjectMonitor box 2 shows contentions == 2 because > >>>>> EnterI() did the > >>>>> extra increment. > >>>>> > >>>>> - ObjectMonitor box 3 shows contentions == 1 because > >>>>> T-enter did the > >>>>> regular contentions decrement. > >>>>> > >>>>> - ObjectMonitor box 4 shows contentions == 0 because > >>>>> T-deflate did the > >>>>> extra contentions decrement. > >>>>> > >>>>> Now it is possible for T-deflate to do the extra > >>>>> decrement before T-enter > >>>>> does the extra increment.
If I were to add another > >>>>> diagram to show that > >>>>> variant of the race, that progression of contentions > >>>>> values would be: > >>>>> > >>>>> - ObjectMonitor box 1 shows contentions == 1 because > >>>>> T-enter incremented > >>>>> the contentions field > >>>>> > >>>>> - ObjectMonitor box 2 shows contentions == 0 because > >>>>> T-deflate did the > >>>>> extra contentions decrement. > >>>>> > >>>>> - ObjectMonitor box 3 shows contentions == 1 because > >>>>> EnterI() did the > >>>>> extra increment. > >>>>> > >>>>> - ObjectMonitor box 4 shows contentions == 0 because > >>>>> T-enter did the > >>>>> regular contentions decrement. > >>>>> > >>>>> Notice that in this second scenario the contentions > >>>>> field never goes > >>>>> negative so there's nothing to confuse a potential caller of > >>>>> is_being_async_deflated(): > >>>>> > >>>>> inline bool ObjectMonitor::is_being_async_deflated() { > >>>>> return AsyncDeflateIdleMonitors && contentions() < 0; > >>>>> } > >>>>> > >>>>> It is not possible for T-deflate's extra decrement of > >>>>> the contentions > >>>>> field to make the contentions field negative. That > >>>>> decrement only happens > >>>>> when T-deflate detects that the async deflation has been > >>>>> canceled and > >>>>> async deflation can only be canceled after T-enter has > >>>>> already made the > >>>>> contentions field > 0. > >>>>> > >>>>> Please let me know if this resolves your concern about: > >>>>> > >>>>> > // Deferred decrement for the JT EnterI() that > >>>>> cancelled the async > >>>>> > deflation. > >>>>> > mid->add_to_contentions(-1); > >>>>> > >>>>> I'm not planning to update the OpenJDK wiki to add a > >>>>> second variant of > >>>>> the cancellation race. Please let me know if that is okay. > >>>>> > >>>>> > > >>>>> > But otherwise the changes in this version seem good > >>>>> and overall the > >>>>> > protocol seems simpler.
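The two orderings of the extra increment and the deferred decrement can be checked with a small model of the contentions field. This is a sketch only: ToyContentions, Step, Replay and replay() are hypothetical names, not HotSpot code, and the AsyncDeflateIdleMonitors flag is left out so that the check reduces to contentions < 0.

```cpp
#include <atomic>
#include <vector>

// Hypothetical stand-in for the contentions field; not HotSpot code.
struct ToyContentions {
  std::atomic<int> value{0};
  // Mirrors the "contentions() < 0" test; the AsyncDeflateIdleMonitors flag
  // is omitted from this model.
  bool is_being_async_deflated() const { return value.load() < 0; }
};

enum class Step {
  EnterIncrement,    // T-enter: regular increment before EnterI()
  ExtraIncrement,    // T-enter: extra bump after cancelling the deflation
  EnterDecrement,    // T-enter: regular decrement back in enter()
  DeferredDecrement  // T-deflate: decrement after noticing the cancellation
};

struct Replay {
  std::vector<int> values;     // contentions after each step
  bool ever_negative = false;  // did is_being_async_deflated() ever hold?
};

// Applies the steps in the given order and records what an observer of the
// contentions field could have seen after each one.
Replay replay(const std::vector<Step>& steps) {
  ToyContentions c;
  Replay r;
  for (Step s : steps) {
    const bool up = (s == Step::EnterIncrement || s == Step::ExtraIncrement);
    c.value.fetch_add(up ? 1 : -1);
    r.values.push_back(c.value.load());
    r.ever_negative = r.ever_negative || c.is_being_async_deflated();
  }
  return r;
}
```

Replaying EnterIncrement, ExtraIncrement, EnterDecrement, DeferredDecrement gives 1, 2, 1, 0, and replaying EnterIncrement, DeferredDecrement, ExtraIncrement, EnterDecrement gives 1, 0, 1, 0. Neither ordering produces a negative value, which matches the two progressions listed above.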
> >>>>> > >>>>> This sounds like a thumbs up, but I'm looking for > >>>>> something more definitive. > >>>>> > >>>>> > >>>>> > I'm still going to spend some more time going over the > >>>>> complete webrev > >>>>> > to get a fuller sense of things. > >>>>> > >>>>> As always, if you find something after I've pushed, > >>>>> we'll deal with it. > >>>>> > >>>>> Thanks for your many re-reviews for this project!! > >>>>> > >>>>> Dan > >>>>> > >>>>> > >>>>> > > >>>>> > Thanks, > >>>>> > David > >>>>> > > >>>>> >> > >>>>> >> > >>>>> >> The OpenJDK wiki has been updated for v2.14. > >>>>> >> > >>>>> >> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >> > >>>>> >> The jdk-15+24 based v2.14 version of the patch has > >>>>> gone thru Mach5 > >>>>> >> Tier[1-5] > >>>>> >> testing with no related failures; Mach5 Tier[67] are > >>>>> running now and > >>>>> >> so far > >>>>> >> have no related failures. I'll kick off Mach5 Tier8 > >>>>> after the other > >>>>> >> tiers > >>>>> >> have finished since Mach5 is a bit busy right now. > >>>>> >> > >>>>> >> I'm also running my usual inflation stress testing on > >>>>> Linux-X64 and > >>>>> >> macOSX > >>>>> >> and so far there are no issues. > >>>>> >> > >>>>> >> Thanks, in advance, for any questions, comments or > >>>>> suggestions. > >>>>> >> > >>>>> >> Dan > >>>>> >> > >>>>> >> > >>>>> >> On 5/21/20 2:53 PM, Daniel D. Daugherty wrote: > >>>>> >>> Greetings, > >>>>> >>> > >>>>> >>> I have made changes to the Async Monitor Deflation > >>>>> code in response to > >>>>> >>> the CR12/v2.12/15-for-jdk15 code review cycle. > >>>>> Thanks to David H. and > >>>>> >>> Erik O. for their OpenJDK reviews in the v2.12 round! > >>>>> >>> > >>>>> >>> I have attached the change list from CR12 to CR13 > >>>>> and I've also added a > >>>>> >>> link to the CR12-to-CR13-changes file to the webrevs > >>>>> so it should be > >>>>> >>> easy > >>>>> >>> to find. 
> >>>>> >>> > >>>>> >>> Main bug URL: > >>>>> >>> > >>>>> >>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>>>> >>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>> > >>>>> >>> The project is currently baselined on jdk-15+24. > >>>>> >>> > >>>>> >>> Here's the full webrev URL for those folks that want > >>>>> to see all of the > >>>>> >>> current Async Monitor Deflation code in one go > >>>>> (v2.13 full): > >>>>> >>> > >>>>> >>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- > >> jdk15%2b24.v2.13.full/ > >>>>> >>> > >>>>> >>> > >>>>> >>> Some folks might want to see just what has changed > >>>>> since the last > >>>>> >>> review > >>>>> >>> cycle so here's a webrev for that (v2.13 inc): > >>>>> >>> > >>>>> >>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- > >> jdk15%2b24.v2.13.inc/ > >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> The OpenJDK wiki is currently at v2.13 and might > >>>>> require minor > >>>>> >>> tweaks for v2.12 > >>>>> >>> and v2.13. Yes, I need to make yet another crawl > >>>>> thru review of it... > >>>>> >>> > >>>>> >>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>> > >>>>> >>> The jdk-15+24 based v2.13 version of the patch is > >>>>> going thru the usual > >>>>> >>> Mach5 testing right now. It is also going thru my > >>>>> usual inflation > >>>>> >>> stress > >>>>> >>> testing on Linux-X64 and macOSX. > >>>>> >>> > >>>>> >>> Thanks, in advance, for any questions, comments or > >>>>> suggestions. > >>>>> >>> > >>>>> >>> Dan > >>>>> >>> > >>>>> >>> On 5/14/20 5:40 PM, Daniel D. Daugherty wrote: > >>>>> >>>> Greetings, > >>>>> >>>> > >>>>> >>>> I have made changes to the Async Monitor Deflation > >>>>> code in response to > >>>>> >>>> the CR11/v2.11/14-for-jdk15 code review cycle. > >>>>> Thanks to David H., > >>>>> >>>> Erik O., > >>>>> >>>> and Robbin for their OpenJDK reviews in the v2.11 > >>>>> round! 
> >>>>> >>>> > >>>>> >>>> I have attached the change list from CR11 to CR12 > >>>>> and I've also > >>>>> >>>> added a > >>>>> >>>> link to the CR11-to-CR12-changes file to the > >>>>> webrevs so it should > >>>>> >>>> be easy > >>>>> >>>> to find. > >>>>> >>>> > >>>>> >>>> Main bug URL: > >>>>> >>>> > >>>>> >>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>> > >>>>> >>>> The project is currently baselined on jdk-15+23. > >>>>> >>>> > >>>>> >>>> Here's the full webrev URL for those folks that > >>>>> want to see all of the > >>>>> >>>> current Async Monitor Deflation code in one go > >>>>> (v2.12 full): > >>>>> >>>> > >>>>> >>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- > >> jdk15%2b23.v2.12.full/ > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> Some folks might want to see just what has changed > >>>>> since the last > >>>>> >>>> review > >>>>> >>>> cycle so here's a webrev for that (v2.12 inc): > >>>>> >>>> > >>>>> >>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- > >> jdk15%2b23.v2.12.inc/ > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> The OpenJDK wiki is currently at v2.11 and might > >>>>> require minor > >>>>> >>>> tweaks for v2.12: > >>>>> >>>> > >>>>> >>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>> > >>>>> >>>> The jdk-15+23 based v2.12 version of the patch is > >>>>> going thru the usual > >>>>> >>>> Mach5 testing right now. > >>>>> >>>> > >>>>> >>>> Thanks, in advance, for any questions, comments or > >>>>> suggestions. > >>>>> >>>> > >>>>> >>>> Dan > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> On 5/7/20 1:08 PM, Daniel D. Daugherty wrote: > >>>>> >>>>> Greetings, > >>>>> >>>>> > >>>>> >>>>> I have made changes to the Async Monitor Deflation > >>>>> code in > >>>>> >>>>> response to > >>>>> >>>>> the CR10/v2.10/13-for-jdk15 code review cycle and > >>>>> DaCapo-h2 perf > >>>>> >>>>> testing. 
> >>>>> >>>>> Thanks to Erik O., Robbin and David H. for their > >>>>> OpenJDK reviews > >>>>> >>>>> in the > >>>>> >>>>> v2.10 round! Thanks to Eric C. for his help in > >>>>> isolating the > >>>>> >>>>> DaCapo-h2 > >>>>> >>>>> performance regression. > >>>>> >>>>> > >>>>> >>>>> With the removal of ref_counting and the > >>>>> ObjectMonitorHandle > >>>>> >>>>> class, the > >>>>> >>>>> Async Monitor Deflation project is now closer to > >>>>> Carsten's original > >>>>> >>>>> prototype. While ref_counting gave us > >>>>> ObjectMonitor* safety > >>>>> >>>>> enforced by > >>>>> >>>>> code, I saw a ~22.8% slow down with > >>>>> -XX:-AsyncDeflateIdleMonitors > >>>>> >>>>> ("off" > >>>>> >>>>> mode). The slow down with "on" mode > >>>>> -XX:+AsyncDeflateIdleMonitors > >>>>> >>>>> is ~17%. > >>>>> >>>>> > >>>>> >>>>> I have attached the change list from CR10 to CR11 > >>>>> instead of > >>>>> >>>>> putting it in > >>>>> >>>>> the body of this email. I've also added a link to the > >>>>> >>>>> CR10-to-CR11-changes > >>>>> >>>>> file to the webrevs so it should be easy to find. > >>>>> >>>>> > >>>>> >>>>> Main bug URL: > >>>>> >>>>> > >>>>> >>>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>> > >>>>> >>>>> The project is currently baselined on jdk-15+21. 
> >>>>> >>>>> > >>>>> >>>>> Here's the full webrev URL for those folks that > >>>>> want to see all of > >>>>> >>>>> the > >>>>> >>>>> current Async Monitor Deflation code in one go > >>>>> (v2.11 full): > >>>>> >>>>> > >>>>> >>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- > >> jdk15%2b21.v2.11.full/ > >>>>> >>>>> > >>>>> >>>>> > >>>>> >>>>> Some folks might want to see just what has changed > >>>>> since the last > >>>>> >>>>> review > >>>>> >>>>> cycle so here's a webrev for that (v2.11 inc): > >>>>> >>>>> > >>>>> >>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- > >> jdk15%2b21.v2.11.inc/ > >>>>> >>>>> > >>>>> >>>>> > >>>>> >>>>> Because of the removal of ref_counting and the > >>>>> ObjectMonitorHandle > >>>>> >>>>> class, the > >>>>> >>>>> incremental webrev is a bit noisier than I would > >>>>> have preferred. > >>>>> >>>>> > >>>>> >>>>> > >>>>> >>>>> The OpenJDK wiki has NOT YET been updated for this > >>>>> round of changes: > >>>>> >>>>> > >>>>> >>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>> > >>>>> >>>>> The jdk-15+21 based v2.11 version of the patch has > >>>>> been thru Mach5 > >>>>> >>>>> tier[1-6] > >>>>> >>>>> testing on Oracle's usual set of platforms. Mach5 > >>>>> tier[78] are > >>>>> >>>>> still running. > >>>>> >>>>> I'm running the v2.11 patch through my usual set > >>>>> of stress testing on > >>>>> >>>>> Linux-X64 and macOSX. > >>>>> >>>>> > >>>>> >>>>> I'm planning to do a SPECjbb2015, DaCapo-h2 and > >>>>> volano round on the > >>>>> >>>>> CR11/v2.11/14-for-jdk15 bits. > >>>>> >>>>> > >>>>> >>>>> Thanks, in advance, for any questions, comments or > >>>>> suggestions. > >>>>> >>>>> > >>>>> >>>>> Dan > >>>>> >>>>> > >>>>> >>>>> > >>>>> >>>>> On 2/26/20 5:22 PM, Daniel D. 
Daugherty wrote: > >>>>> >>>>>> Greetings, > >>>>> >>>>>> > >>>>> >>>>>> I have made changes to the Async Monitor > >>>>> Deflation code in > >>>>> >>>>>> response to > >>>>> >>>>>> the CR9/v2.09/12-for-jdk14 code review cycle. > >>>>> Thanks to Robbin > >>>>> >>>>>> and Erik O. > >>>>> >>>>>> for their comments in this round! > >>>>> >>>>>> > >>>>> >>>>>> With the extraction and push of > >>>>> {8235931,8236035,8235795} to > >>>>> >>>>>> JDK15, the > >>>>> >>>>>> Async Monitor Deflation code is back to "just" > >>>>> async deflation > >>>>> >>>>>> changes! > >>>>> >>>>>> > >>>>> >>>>>> I have attached the change list from CR9 to CR10 > >>>>> instead of > >>>>> >>>>>> putting it in > >>>>> >>>>>> the body of this email. I've also added a link to > >>>>> the > >>>>> >>>>>> CR9-to-CR10-changes > >>>>> >>>>>> file to the webrevs so it should be easy to find. > >>>>> >>>>>> > >>>>> >>>>>> Main bug URL: > >>>>> >>>>>> > >>>>> >>>>>> ??? JDK-8153224 Monitor deflation prolong safepoints > >>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>> > >>>>> >>>>>> The project is currently baselined on jdk-15+11. 
> >>>>> >>>>>>
> >>>>> >>>>>> Here's the full webrev URL for those folks that want to see all of the
> >>>>> >>>>>> current Async Monitor Deflation code in one go (v2.10 full):
> >>>>> >>>>>>
> >>>>> >>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.full/
> >>>>> >>>>>>
> >>>>> >>>>>> Some folks might want to see just what has changed since the last review
> >>>>> >>>>>> cycle so here's a webrev for that (v2.10 inc):
> >>>>> >>>>>>
> >>>>> >>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for-jdk15+11.v2.10.inc/
> >>>>> >>>>>>
> >>>>> >>>>>> Since we backed out the HandshakeAfterDeflateIdleMonitors option and the
> >>>>> >>>>>> C2 ref_count changes and updated the copyright years, the "inc" webrev has
> >>>>> >>>>>> a bit more noise in it than usual. Sorry about that!
> >>>>> >>>>>>
> >>>>> >>>>>> The OpenJDK wiki has been updated for this round of changes:
> >>>>> >>>>>>
> >>>>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
> >>>>> >>>>>>
> >>>>> >>>>>> The jdk-15+11 based v2.10 version of the patch has been thru Mach5 tier[1-7]
> >>>>> >>>>>> testing on Oracle's usual set of platforms. Mach5 tier8 is still running.
> >>>>> >>>>>> I'm running the v2.10 patch through my usual set of stress testing on
> >>>>> >>>>>> Linux-X64 and macOSX.
> >>>>> >>>>>>
> >>>>> >>>>>> I'm planning to do a SPECjbb2015 round on the
> >>>>> >>>>>> CR10/v2.10/13-for-jdk15 bits.
> >>>>> >>>>>>
> >>>>> >>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>>>> >>>>>>
> >>>>> >>>>>> Dan
> >>>>> >>>>>>
> >>>>> >>>>>>
> >>>>> >>>>>> On 2/4/20 9:41 AM, Daniel D. Daugherty wrote:
> >>>>> >>>>>>> Greetings,
> >>>>> >>>>>>>
> >>>>> >>>>>>> This project is no longer targeted to JDK14 so this is NOT an urgent code
> >>>>> >>>>>>> review request.
> >>>>> >>>>>>>
> >>>>> >>>>>>> I've extracted the following three fixes from the Async Monitor Deflation
> >>>>> >>>>>>> project code:
> >>>>> >>>>>>>
> >>>>> >>>>>>>     JDK-8235931 add OM_CACHE_LINE_SIZE and use smaller size on SPARCv9 and X64
> >>>>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8235931
> >>>>> >>>>>>>
> >>>>> >>>>>>>     JDK-8236035 refactor ObjectMonitor::set_owner() and _owner field setting
> >>>>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8236035
> >>>>> >>>>>>>
> >>>>> >>>>>>>     JDK-8235795 replace monitor list mux{Acquire,Release}(&gListLock) with spin locks
> >>>>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8235795
> >>>>> >>>>>>>
> >>>>> >>>>>>> Each of these has been reviewed separately and will be pushed to JDK15
> >>>>> >>>>>>> in the near future (possibly by the end of this week). Of course, there
> >>>>> >>>>>>> were improvements during these review cycles and the purpose of this
> >>>>> >>>>>>> e-mail is to provide updated webrevs for this fix (CR9/v2.09/12-for-jdk14)
> >>>>> >>>>>>> within the revised context provided by {8235931, 8236035, 8235795}.
> >>>>> >>>>>>>
> >>>>> >>>>>>> Main bug URL:
> >>>>> >>>>>>>
> >>>>> >>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
> >>>>> >>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
> >>>>> >>>>>>>
> >>>>> >>>>>>> The project is currently baselined on jdk-14+34.
> >>>>> >>>>>>> > >>>>> >>>>>>> Here's the full webrev URL for those folks that > >>>>> want to see all > >>>>> >>>>>>> of the > >>>>> >>>>>>> current Async Monitor Deflation code along with > >>>>> {8235931, > >>>>> >>>>>>> 8236035, 8235795} > >>>>> >>>>>>> in one go (v2.09b full): > >>>>> >>>>>>> > >>>>> >>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > >> jdk14.v2.09b.full/ > >>>>> >>>>>>> > >>>>> >>>>>>> > >>>>> >>>>>>> Compare the open.patch file in > >>>>> 12-for-jdk14.v2.09.full and > >>>>> >>>>>>> 12-for-jdk14.v2.09b.full > >>>>> >>>>>>> using your favorite file comparison/merge tool > >>>>> to see how Async > >>>>> >>>>>>> Monitor Deflation > >>>>> >>>>>>> evolved due to {8235931, 8236035, 8235795}. > >>>>> >>>>>>> > >>>>> >>>>>>> Some folks might want to see just the Async > >>>>> Monitor Deflation > >>>>> >>>>>>> code on top of > >>>>> >>>>>>> {8235931, 8236035, 8235795} so here's a webrev > >>>>> for that (v2.09b > >>>>> >>>>>>> inc): > >>>>> >>>>>>> > >>>>> >>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > >> jdk14.v2.09b.inc/ > >>>>> >>>>>>> > >>>>> >>>>>>> > >>>>> >>>>>>> These webrevs have gone thru several Mach5 > >>>>> Tier[1-8] runs along > >>>>> >>>>>>> with > >>>>> >>>>>>> my usual stress testing and SPECjbb2015 testing > >>>>> and there aren't > >>>>> >>>>>>> any > >>>>> >>>>>>> surprises relative to CR9/v2.09/12-for-jdk14. > >>>>> >>>>>>> > >>>>> >>>>>>> Thanks, in advance, for any questions, comments > >>>>> or suggestions. > >>>>> >>>>>>> > >>>>> >>>>>>> Dan > >>>>> >>>>>>> > >>>>> >>>>>>> > >>>>> >>>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote: > >>>>> >>>>>>>> Greetings, > >>>>> >>>>>>>> > >>>>> >>>>>>>> I have made changes to the Async Monitor > >>>>> Deflation code in > >>>>> >>>>>>>> response to > >>>>> >>>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. > >>>>> Thanks to David > >>>>> >>>>>>>> H., Robbin > >>>>> >>>>>>>> and Erik O. for their comments! 
> >>>>> >>>>>>>> > >>>>> >>>>>>>> This project is no longer targeted to JDK14 so > >>>>> this is NOT an > >>>>> >>>>>>>> urgent code > >>>>> >>>>>>>> review request. The primary purpose of this > >>>>> webrev is simply to > >>>>> >>>>>>>> close the > >>>>> >>>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to > >>>>> let folks see > >>>>> >>>>>>>> how I resolved > >>>>> >>>>>>>> the code review comments from that round. > >>>>> >>>>>>>> > >>>>> >>>>>>>> Most of the comments in the > >>>>> CR8/v2.08/11-for-jdk14 code review > >>>>> >>>>>>>> cycle were > >>>>> >>>>>>>> on the monitor list changes so I'm going to > >>>>> take a look at > >>>>> >>>>>>>> extracting those > >>>>> >>>>>>>> changes into a standalone patch. Switching from > >>>>> >>>>>>>> Thread::muxAcquire(&gListLock) > >>>>> >>>>>>>> and Thread::muxRelease(&gListLock) to finer > >>>>> grained internal > >>>>> >>>>>>>> spin locks needs > >>>>> >>>>>>>> to be thoroughly reviewed and the best way to > >>>>> do that is > >>>>> >>>>>>>> separately from the > >>>>> >>>>>>>> Async Monitor Deflation changes. Thanks to > >>>>> Coleen for > >>>>> >>>>>>>> suggesting doing this > >>>>> >>>>>>>> extraction earlier. > >>>>> >>>>>>>> > >>>>> >>>>>>>> I have attached the change list from CR8 to CR9 > >>>>> instead of > >>>>> >>>>>>>> putting it in > >>>>> >>>>>>>> the body of this email. I've also added a link > >>>>> to the > >>>>> >>>>>>>> CR8-to-CR9-changes > >>>>> >>>>>>>> file to the webrevs so it should be easy to find. > >>>>> >>>>>>>> > >>>>> >>>>>>>> Main bug URL: > >>>>> >>>>>>>> > >>>>> >>>>>>>> JDK-8153224 Monitor deflation prolong safepoints > >>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>> > >>>>> >>>>>>>> The project is currently baselined on jdk-14+26. 
> >>>>> >>>>>>>> > >>>>> >>>>>>>> Here's the full webrev URL for those folks that > >>>>> want to see all > >>>>> >>>>>>>> of the > >>>>> >>>>>>>> current Async Monitor Deflation code in one go > >>>>> (v2.09 full): > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > >> jdk14.v2.09.full/ > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> >>>>>>>> Some folks might want to see just what has > >>>>> changed since the > >>>>> >>>>>>>> last review > >>>>> >>>>>>>> cycle so here's a webrev for that (v2.09 inc): > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- > >> jdk14.v2.09.inc/ > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> >>>>>>>> The OpenJDK wiki has NOT yet been updated for > >>>>> this round of > >>>>> >>>>>>>> changes: > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>> > >>>>> >>>>>>>> > >>>>> >>>>>>>> The jdk-14+26 based v2.09 version of the patch > >>>>> has been thru > >>>>> >>>>>>>> Mach5 tier[1-7] > >>>>> >>>>>>>> testing on Oracle's usual set of platforms. > >>>>> Mach5 tier8 is > >>>>> >>>>>>>> still running. > >>>>> >>>>>>>> A slightly older version of the v2.09 patch has > >>>>> also been > >>>>> >>>>>>>> through my usual > >>>>> >>>>>>>> set of stress testing on Linux-X64 and macOSX > >>>>> with the addition > >>>>> >>>>>>>> of Robbin's > >>>>> >>>>>>>> "MoCrazy 1024" test running in parallel on > >>>>> Linux-X64 with the > >>>>> >>>>>>>> other tests in > >>>>> >>>>>>>> my lab. The "MoCrazy 1024" has been going for > > >>>>> 5 days and > >>>>> >>>>>>>> 6700+ iterations > >>>>> >>>>>>>> without any failures. > >>>>> >>>>>>>> > >>>>> >>>>>>>> I'm planning to do a SPECjbb2015 round on the > >>>>> >>>>>>>> CR9/v2.09/12-for-jdk14 bits. > >>>>> >>>>>>>> > >>>>> >>>>>>>> Thanks, in advance, for any questions, comments > >>>>> or suggestions. 
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> Dan
> >>>>> >>>>>>>>
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote:
> >>>>> >>>>>>>>> Greetings,
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> I have made changes to the Async Monitor Deflation code in response to
> >>>>> >>>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. Thanks to David H., Robbin
> >>>>> >>>>>>>>> and Erik O. for their comments!
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, 2019 and the Async Monitor
> >>>>> >>>>>>>>> Deflation project needs to push before Nov. 12, 2019 in order to allow
> >>>>> >>>>>>>>> for sufficient bake time for such a big change. Nov. 12 is _next_ Tuesday
> >>>>> >>>>>>>>> so we have 8 days from today to finish this code review cycle and push
> >>>>> >>>>>>>>> this code for JDK14.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Carsten and Roman! Time for you guys to chime in again on the code reviews.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> I have attached the change list from CR7 to CR8 instead of putting it in
> >>>>> >>>>>>>>> the body of this email. I've also added a link to the CR7-to-CR8-changes
> >>>>> >>>>>>>>> file to the webrevs so it should be easy to find.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Main bug URL:
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
> >>>>> >>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> The project is currently baselined on jdk-14+21.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Here's the full webrev URL for those folks that want to see all of the
> >>>>> >>>>>>>>> current Async Monitor Deflation code in one go (v2.08 full):
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.full
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Some folks might want to see just what has changed since the last review
> >>>>> >>>>>>>>> cycle so here's a webrev for that (v2.08 inc):
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for-jdk14.v2.08.inc/
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> The OpenJDK wiki did not need any changes for this round:
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> The jdk-14+21 based v2.08 version of the patch has been thru Mach5 tier[1-8]
> >>>>> >>>>>>>>> testing on Oracle's usual set of platforms. It has also been through my usual
> >>>>> >>>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
> >>>>> >>>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
> >>>>> >>>>>>>>> my lab. Some testing is still running, but so far there are no new regressions.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> I have not yet done a SPECjbb2015 round on the
> >>>>> >>>>>>>>> CR8/v2.08/11-for-jdk14 bits.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> Dan
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> On 10/17/19 5:50 PM, Daniel D. Daugherty wrote:
> >>>>> >>>>>>>>>> Greetings,
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> The Async Monitor Deflation project is reaching the end game. I have no
> >>>>> >>>>>>>>>> changes planned for the project at this time so all that is left is code
> >>>>> >>>>>>>>>> review and any changes that result from those reviews.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Carsten and Roman! Time for you guys to chime in again on the code reviews.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> I have attached the list of fixes from CR6 to CR7 instead of putting it
> >>>>> >>>>>>>>>> in the main body of this email.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Main bug URL:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
> >>>>> >>>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> The project is currently baselined on jdk-14+19.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Here's the full webrev URL for those folks that want to see all of the
> >>>>> >>>>>>>>>> current Async Monitor Deflation code in one go (v2.07 full):
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.full
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Some folks might want to see just what has changed since the last review
> >>>>> >>>>>>>>>> cycle so here's a webrev for that (v2.07 inc):
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for-jdk14.v2.07.inc/
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> The OpenJDK wiki has been updated to match the
> >>>>> >>>>>>>>>> CR7/v2.07/10-for-jdk14 changes:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> The jdk-14+18 based v2.07 version of the patch has been thru Mach5 tier[1-8]
> >>>>> >>>>>>>>>> testing on Oracle's usual set of platforms. It has also been through my usual
> >>>>> >>>>>>>>>> set of stress testing on Linux-X64, macOSX and Solaris-X64 with the addition
> >>>>> >>>>>>>>>> of Robbin's "MoCrazy 1024" test running in parallel with the other tests in
> >>>>> >>>>>>>>>> my lab.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> The jdk-14+19 based v2.07 version of the patch has been thru Mach5 tier[1-3]
> >>>>> >>>>>>>>>> test on Oracle's usual set of platforms. Mach5 tier[4-8] are in process.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora Performance lab
> >>>>> >>>>>>>>>> using their tuned SPECjbb2015 Linux-X64 G1 configs:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>     - "base" is jdk-14+18
> >>>>> >>>>>>>>>>     - "v2.07" is the latest version and includes C2 inc_om_ref_count() support
> >>>>> >>>>>>>>>>       on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors option
> >>>>> >>>>>>>>>>     - "off" is with -XX:-AsyncDeflateIdleMonitors specified
> >>>>> >>>>>>>>>>     - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors specified
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>          hbIR          hbIR
> >>>>> >>>>>>>>>>     (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
> >>>>> >>>>>>>>>>     ---------------  ---------  --------  -------------  -------
> >>>>> >>>>>>>>>>            34282.00   30635.90  28831.30       20969.20  3841.30  base
> >>>>> >>>>>>>>>>            34282.00   30973.00  29345.80       21025.20  3964.10  v2.07
> >>>>> >>>>>>>>>>            34282.00   31105.60  29174.30       21074.00  3931.30  v2.07_handshake
> >>>>> >>>>>>>>>>            34282.00   30789.70  27151.60       19839.10  3850.20  v2.07_off
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>     - The Aurora Perf comparison tool reports:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>         Comparison              max-jOPS              critical-jOPS
> >>>>> >>>>>>>>>>         ----------------------  --------------------  --------------------
> >>>>> >>>>>>>>>>         base vs 2.07            +1.78% (s, p=0.000)   +0.27% (ns, p=0.790)
> >>>>> >>>>>>>>>>         base vs 2.07_handshake  +1.19% (s, p=0.007)   +0.58% (ns, p=0.536)
> >>>>> >>>>>>>>>>         base vs 2.07_off        -5.83% (ns, p=0.394)  -5.39% (ns, p=0.347)
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>         (s) - significant  (ns) - not-significant
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>     - For historical comparison, the Aurora Perf comparison tool
> >>>>> >>>>>>>>>>       reported for v2.06 with a baseline of jdk-13+31:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>         Comparison              max-jOPS              critical-jOPS
> >>>>> >>>>>>>>>>         ----------------------  --------------------  --------------------
> >>>>> >>>>>>>>>>         base vs 2.06            -0.32% (ns, p=0.345)  +0.71% (ns, p=0.646)
> >>>>> >>>>>>>>>>         base vs 2.06_off        +0.49% (ns, p=0.292)  -1.21% (ns, p=0.481)
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>         (s) - significant  (ns) - not-significant
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Dan
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
> >>>>> >>>>>>>>>>> Greetings,
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> The Async Monitor Deflation project has rebased to JDK14 so it's time
> >>>>> >>>>>>>>>>> for our first code review in that new context!!
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> I've been focused on changing the monitor list management code to be
> >>>>> >>>>>>>>>>> lock-free in order to make SPECjbb2015 happier. Of course with a change
> >>>>> >>>>>>>>>>> like that, it takes a while to chase down all the new and wonderful
> >>>>> >>>>>>>>>>> races. At this point, I have the code back to the same stability that
> >>>>> >>>>>>>>>>> I had with CR5/v2.05/8-for-jdk13.
> >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> To lay the ground work for this round of > >>>>> review, I pushed > >>>>> >>>>>>>>>>> the following > >>>>> >>>>>>>>>>> two fixes to jdk/jdk earlier today: > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> ??? JDK-8230184 rename, whitespace, indent > >>>>> and comments > >>>>> >>>>>>>>>>> changes in preparation > >>>>> >>>>>>>>>>> ? ? ??????????? for lock free Monitor lists > >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > >> 8230184 > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> ??? JDK-8230317 > >>>>> serviceability/sa/ClhsdbPrintStatics.java > >>>>> >>>>>>>>>>> fails after 8230184 > >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > >> 8230317 > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> I have attached the list of fixes from CR5 > >>>>> to CR6 instead of > >>>>> >>>>>>>>>>> putting > >>>>> >>>>>>>>>>> in the main body of this email. > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- > >> 8153224 > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> The project is currently baselined on > >>>>> jdk-14+11 plus the > >>>>> >>>>>>>>>>> fixes for > >>>>> >>>>>>>>>>> JDK-8230184 and JDK-8230317. 
> >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Here's the full webrev URL for those folks > >>>>> that want to see > >>>>> >>>>>>>>>>> all of the > >>>>> >>>>>>>>>>> current Async Monitor Deflation code in one > >>>>> go (v2.06 full): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.06.full/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> The primary focus of this review cycle is on > >>>>> the lock-free > >>>>> >>>>>>>>>>> Monitor List > >>>>> >>>>>>>>>>> management changes so here's a webrev for > >>>>> just that patch > >>>>> >>>>>>>>>>> (v2.06c): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.06c.inc/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> The secondary focus of this review cycle is > >>>>> on the bug fixes > >>>>> >>>>>>>>>>> that have > >>>>> >>>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so > >>>>> here's a webrev for > >>>>> >>>>>>>>>>> just that > >>>>> >>>>>>>>>>> patch (v2.06b): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.06b.inc/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> The third and final bucket for this review > >>>>> cycle is the > >>>>> >>>>>>>>>>> rename, whitespace, > >>>>> >>>>>>>>>>> indent and comments changes made in > >>>>> preparation for lock > >>>>> >>>>>>>>>>> free Monitor list > >>>>> >>>>>>>>>>> management. Almost all of that was extracted > >>>>> into > >>>>> >>>>>>>>>>> JDK-8230184 for the > >>>>> >>>>>>>>>>> baseline so this bucket now has just a few > >>>>> comment changes > >>>>> >>>>>>>>>>> relative to > >>>>> >>>>>>>>>>> CR5/v2.05/8-for-jdk13. 
Here's a webrev for > >>>>> the remainder > >>>>> >>>>>>>>>>> (v2.06a): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.06a.inc/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Some folks might want to see just what has > >>>>> changed since the > >>>>> >>>>>>>>>>> last review > >>>>> >>>>>>>>>>> cycle so here's a webrev for that (v2.06 inc): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.06.inc/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Last, but not least, some folks might want > >>>>> to see the code > >>>>> >>>>>>>>>>> before the > >>>>> >>>>>>>>>>> addition of lock-free Monitor List > >>>>> management so here's a > >>>>> >>>>>>>>>>> webrev for > >>>>> >>>>>>>>>>> that (v2.00 -> v2.05): > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- > >> jdk14.v2.05.inc/ > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> The OpenJDK wiki will need minor updates to > >>>>> match the CR6 > >>>>> >>>>>>>>>>> changes: > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> but that should only be changes to describe > >>>>> per-thread list > >>>>> >>>>>>>>>>> async monitor > >>>>> >>>>>>>>>>> deflation being done by the ServiceThread. > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> (I did update the OpenJDK wiki for the CR5 > >>>>> changes back on > >>>>> >>>>>>>>>>> 2019.08.14) > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-8] > >>>>> >>>>>>>>>>> testing on > >>>>> >>>>>>>>>>> Oracle's usual set of platforms. 
> >>>>> >>>>>>>>>>> It has also been through my usual set of stress testing
> >>>>> >>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64.
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora
> >>>>> >>>>>>>>>>> Performance lab using their tuned SPECjbb2015 Linux-X64
> >>>>> >>>>>>>>>>> G1 configs. This was using this patch baselined on
> >>>>> >>>>>>>>>>> jdk-13+31 (for stability):
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>           hbIR           hbIR
> >>>>> >>>>>>>>>>>      (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
> >>>>> >>>>>>>>>>>      ---------------  ---------  --------  -------------  -------
> >>>>> >>>>>>>>>>>             34282.00   28837.20  27905.20       19817.40  3658.10  base
> >>>>> >>>>>>>>>>>             34965.70   29798.80  27814.90       19959.00  3514.60  v2.06d
> >>>>> >>>>>>>>>>>             34282.00   29100.70  28042.50       19577.00  3701.90  v2.06d_off
> >>>>> >>>>>>>>>>>             34282.00   29218.50  27562.80       19397.30  3657.60  v2.06d_ocache
> >>>>> >>>>>>>>>>>             34965.70   29838.30  26512.40       19170.60  3569.90  v2.05
> >>>>> >>>>>>>>>>>             34282.00   28926.10  27734.00       19835.10  3588.40  v2.05_off
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors
> >>>>> >>>>>>>>>>> specified and the "ocache" config is with 128 byte cache
> >>>>> >>>>>>>>>>> line sizes instead of 64 byte cache line sizes. "v2.06d"
> >>>>> >>>>>>>>>>> is the last set of changes that I made before those
> >>>>> >>>>>>>>>>> changes were distributed into the "v2.06a", "v2.06b" and
> >>>>> >>>>>>>>>>> "v2.06c" buckets for this review cycle.
> >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or suggestions. > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> Dan > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote: > >>>>> >>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> I've been focused on chasing down and > >>>>> fixing the rare test > >>>>> >>>>>>>>>>>> failures > >>>>> >>>>>>>>>>>> that only pop up rarely. So this round is > >>>>> primarily fixes > >>>>> >>>>>>>>>>>> for races > >>>>> >>>>>>>>>>>> with a few additional fixes that came from > >>>>> Karen's review > >>>>> >>>>>>>>>>>> of CR4. > >>>>> >>>>>>>>>>>> Thanks Karen! > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> I have attached the list of fixes from CR4 > >>>>> to CR5 instead > >>>>> >>>>>>>>>>>> of putting > >>>>> >>>>>>>>>>>> in the main body of this email. > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> The project is currently baselined on > >>>>> jdk-13+29. This will > >>>>> >>>>>>>>>>>> likely be > >>>>> >>>>>>>>>>>> the last JDK13 baseline for this project > >>>>> and I'll roll to > >>>>> >>>>>>>>>>>> the JDK14 > >>>>> >>>>>>>>>>>> (jdk/jdk) repo soon... 
> >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Here's the full webrev URL: > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- > >> jdk13.full/ > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Here's the incremental webrev URL: > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for- > >> jdk13.inc/ > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> I have not yet checked the OpenJDK wiki to > >>>>> see if it needs > >>>>> >>>>>>>>>>>> any updates > >>>>> >>>>>>>>>>>> to match the CR5 changes: > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 > >>>>> changes back on > >>>>> >>>>>>>>>>>> 2019.06.26) > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-3] > >>>>> >>>>>>>>>>>> testing on > >>>>> >>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>>>> tier[4-6] is running > >>>>> >>>>>>>>>>>> now and > >>>>> >>>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off > >>>>> the usual stress > >>>>> >>>>>>>>>>>> testing > >>>>> >>>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as > >>>>> those machines > >>>>> >>>>>>>>>>>> become available. > >>>>> >>>>>>>>>>>> Since I haven't made any performance > >>>>> changes in this round, > >>>>> >>>>>>>>>>>> I'll only > >>>>> >>>>>>>>>>>> be running SPECjbb2015 to gather the latest > >>>>> >>>>>>>>>>>> monitorinflation logs. > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Next up: > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> - We're still seeing 4-5% lower performance > >>>>> with > >>>>> >>>>>>>>>>>> SPECjbb2015 on > >>>>> >>>>>>>>>>>> ? Linux-X64 and we've determined that some > >>>>> of that comes from > >>>>> >>>>>>>>>>>> ? contention on the gListLock. 
So I'm going > >>>>> to investigate > >>>>> >>>>>>>>>>>> removing > >>>>> >>>>>>>>>>>> ? the gListLock. Yes, another lock free set > >>>>> of changes is > >>>>> >>>>>>>>>>>> coming! > >>>>> >>>>>>>>>>>> - Of course, going lock free often causes > >>>>> new races and new > >>>>> >>>>>>>>>>>> failures > >>>>> >>>>>>>>>>>> ? so that's a good reason for make those > >>>>> changes isolated > >>>>> >>>>>>>>>>>> in their > >>>>> >>>>>>>>>>>> ? own round (and not holding up > >>>>> CR5/v2.05/8-for-jdk13 > >>>>> >>>>>>>>>>>> anymore). > >>>>> >>>>>>>>>>>> - I finally have a potential fix for the > >>>>> Win* failure with > >>>>> >>>>>>>>>>>> > >>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java > >>>>> >>>>>>>>>>>> ? but I haven't run it through Mach5 yet so > >>>>> it'll be in the > >>>>> >>>>>>>>>>>> next round. > >>>>> >>>>>>>>>>>> - Some RTM tests were recently re-enabled > >>>>> in Mach5 and I'm > >>>>> >>>>>>>>>>>> seeing some > >>>>> >>>>>>>>>>>> ? monitor related failures there. I suspect > >>>>> that I need to > >>>>> >>>>>>>>>>>> go take a > >>>>> >>>>>>>>>>>> ? look at the C2 RTM macro assembler code > >>>>> and look for > >>>>> >>>>>>>>>>>> things that might > >>>>> >>>>>>>>>>>> ? conflict if Async Monitor Deflation. If > >>>>> you're interested > >>>>> >>>>>>>>>>>> in that kind > >>>>> >>>>>>>>>>>> ? of issue, then see the > >>>>> macroAssembler_x86.cpp sanity > >>>>> >>>>>>>>>>>> check that I > >>>>> >>>>>>>>>>>> ? added in this round! > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty > wrote: > >>>>> >>>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> I have a fix for an issue that came up > >>>>> during performance > >>>>> >>>>>>>>>>>>> testing. 
> >>>>> >>>>>>>>>>>>> Many thanks to Robbin for diagnosing the > >>>>> issue in his > >>>>> >>>>>>>>>>>>> SPECjbb2015 > >>>>> >>>>>>>>>>>>> experiments. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Here's the list of changes from CR3 to > >>>>> CR4. The list is a bit > >>>>> >>>>>>>>>>>>> verbose due to the complexity of the > >>>>> issue, but the changes > >>>>> >>>>>>>>>>>>> themselves are not that big. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Functional: > >>>>> >>>>>>>>>>>>> ? - Change > >>>>> SafepointSynchronize::is_cleanup_needed() from > >>>>> >>>>>>>>>>>>> calling > >>>>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() to > >>>>> calling > >>>>> >>>>>>>>>>>>> > >>>>> ObjectSynchronizer::is_safepoint_deflation_needed(): > >>>>> >>>>>>>>>>>>> ??? - is_safepoint_deflation_needed() > >>>>> returns the result of > >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() for > >>>>> safepoint based > >>>>> >>>>>>>>>>>>> ????? monitor deflation > >>>>> (!AsyncDeflateIdleMonitors). > >>>>> >>>>>>>>>>>>> ??? - For AsyncDeflateIdleMonitors, it > >>>>> only returns true if > >>>>> >>>>>>>>>>>>> ????? there is a special deflation > >>>>> request, e.g., System.gc() > >>>>> >>>>>>>>>>>>> ????? - This solves a bug where there are > >>>>> a bunch of Cleanup > >>>>> >>>>>>>>>>>>> ??????? safepoints that simply request > >>>>> async deflation which > >>>>> >>>>>>>>>>>>> ??????? keeps the async JavaThreads from > >>>>> making progress on > >>>>> >>>>>>>>>>>>> ??????? their async deflation work. > >>>>> >>>>>>>>>>>>> ? - Add AsyncDeflationInterval diagnostic > >>>>> option. > >>>>> >>>>>>>>>>>>> Description: > >>>>> >>>>>>>>>>>>> ????? Async deflate idle monitors every so > >>>>> many > >>>>> >>>>>>>>>>>>> milliseconds when > >>>>> >>>>>>>>>>>>> MonitorUsedDeflationThreshold is exceeded > >>>>> (0 is off). > >>>>> >>>>>>>>>>>>> ? 
- Replace > >>>>> >>>>>>>>>>>>> > >>>>> ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with > >>>>> >>>>>>>>>>>>> > >>>>> ObjectSynchronizer::is_async_deflation_needed(): > >>>>> >>>>>>>>>>>>> ??? - is_async_deflation_needed() returns > >>>>> true when > >>>>> >>>>>>>>>>>>> is_async_cleanup_requested() is true or > when > >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() is true > >>>>> (but no more > >>>>> >>>>>>>>>>>>> often than > >>>>> >>>>>>>>>>>>> AsyncDeflationInterval). > >>>>> >>>>>>>>>>>>> ??? - if AsyncDeflateIdleMonitors > >>>>> Service_lock->wait() now > >>>>> >>>>>>>>>>>>> waits for > >>>>> >>>>>>>>>>>>> ????? at most GuaranteedSafepointInterval > >>>>> millis: > >>>>> >>>>>>>>>>>>> ????? - This allows > >>>>> is_async_deflation_needed() to be > >>>>> >>>>>>>>>>>>> checked at > >>>>> >>>>>>>>>>>>> ??????? the same interval as > >>>>> GuaranteedSafepointInterval. > >>>>> >>>>>>>>>>>>> ??????? (default is 1000 millis/1 second) > >>>>> >>>>>>>>>>>>> ????? - Once is_async_deflation_needed() > >>>>> has returned > >>>>> >>>>>>>>>>>>> true, it > >>>>> >>>>>>>>>>>>> ??????? generally cannot return true for > >>>>> >>>>>>>>>>>>> AsyncDeflationInterval. > >>>>> >>>>>>>>>>>>> ??????? This is to prevent async deflation > >>>>> from swamping the > >>>>> >>>>>>>>>>>>> ServiceThread. > >>>>> >>>>>>>>>>>>> ? - The ServiceThread still handles async > >>>>> deflation of the > >>>>> >>>>>>>>>>>>> global > >>>>> >>>>>>>>>>>>> ??? in-use list and now it also marks > >>>>> JavaThreads for > >>>>> >>>>>>>>>>>>> async deflation > >>>>> >>>>>>>>>>>>> ??? of their in-use lists. > >>>>> >>>>>>>>>>>>> ??? - The ServiceThread will check for > >>>>> async deflation > >>>>> >>>>>>>>>>>>> work every > >>>>> >>>>>>>>>>>>> GuaranteedSafepointInterval. > >>>>> >>>>>>>>>>>>> ??? - A safepoint can still cause the > >>>>> ServiceThread to > >>>>> >>>>>>>>>>>>> check for > >>>>> >>>>>>>>>>>>> ????? async deflation work via > >>>>> is_async_deflation_requested. 
> >>>>> >>>>>>>>>>>>> ? - Refactor code from > >>>>> >>>>>>>>>>>>> ObjectSynchronizer::is_cleanup_needed() > into > >>>>> >>>>>>>>>>>>> monitors_used_above_threshold() and > remove > >>>>> >>>>>>>>>>>>> is_cleanup_needed(). > >>>>> >>>>>>>>>>>>> ? - In addition to System.gc(), the > >>>>> VM_Exit VM op and the > >>>>> >>>>>>>>>>>>> final > >>>>> >>>>>>>>>>>>> ??? VMThread safepoint now set the > >>>>> >>>>>>>>>>>>> is_special_deflation_requested > >>>>> >>>>>>>>>>>>> ??? flag to reduce the in-use monitor > >>>>> population that is > >>>>> >>>>>>>>>>>>> reported by > >>>>> >>>>>>>>>>>>> > >>>>> ObjectSynchronizer::log_in_use_monitor_details() at VM exit. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Test update: > >>>>> >>>>>>>>>>>>> ? - > >>>>> test/hotspot/gtest/oops/test_markOop.cpp is updated to > >>>>> >>>>>>>>>>>>> work with > >>>>> >>>>>>>>>>>>> AsyncDeflateIdleMonitors. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Collateral: > >>>>> >>>>>>>>>>>>> ? - Add/clarify/update some logging messages. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Cleanup: > >>>>> >>>>>>>>>>>>> ? - Updated comments based on Karen's code > >>>>> review. > >>>>> >>>>>>>>>>>>> ? - Change 'special cleanup' -> 'special > >>>>> deflation' and > >>>>> >>>>>>>>>>>>> ??? 'async cleanup' -> 'async deflation'. > >>>>> >>>>>>>>>>>>> ??? - comment and function name changes > >>>>> >>>>>>>>>>>>> ? - Clarify MonitorUsedDeflationThreshold > >>>>> description; > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> The project is currently baselined on > >>>>> jdk-13+22. 
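The CR4 rule quoted above for is_async_deflation_needed() (true on an explicit request, or when the in-use threshold is exceeded but no more often than AsyncDeflationInterval) can be illustrated with a small standalone sketch. This is not the HotSpot code: the DeflationState struct, its field names, the explicit now_ms parameter and the default values are assumptions made up for illustration, and the threshold test is a simplified guess at what monitors_used_above_threshold() computes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VM state named in the CR4 change list.
// Field names and default values are illustrative assumptions only.
struct DeflationState {
    bool    async_deflate_idle_monitors = true;   // models AsyncDeflateIdleMonitors
    bool    async_deflation_requested   = false;  // models an explicit request
    int64_t async_deflation_interval_ms = 250;    // models AsyncDeflationInterval (0 is off)
    int     monitors_in_use             = 0;
    int     monitor_population          = 0;
    int     used_threshold_percent      = 90;     // models MonitorUsedDeflationThreshold
    int64_t last_async_deflation_ms     = 0;      // time of the last threshold-triggered pass
};

// Simplified threshold test: true when the in-use percentage of the
// monitor population exceeds the configured threshold.
inline bool monitors_used_above_threshold(const DeflationState& s) {
    if (s.monitor_population == 0 || s.used_threshold_percent <= 0) {
        return false;
    }
    int used_percent = s.monitors_in_use * 100 / s.monitor_population;
    return used_percent > s.used_threshold_percent;
}

// Returns true when async deflation work should be started now.
inline bool is_async_deflation_needed(DeflationState& s, int64_t now_ms) {
    if (!s.async_deflate_idle_monitors) {
        return false;  // safepoint based deflation is in use instead
    }
    if (s.async_deflation_requested) {
        return true;   // an explicit request bypasses the interval limit
    }
    if (s.async_deflation_interval_ms > 0 &&
        now_ms - s.last_async_deflation_ms > s.async_deflation_interval_ms &&
        monitors_used_above_threshold(s)) {
        // Remember when we last said yes so that threshold-triggered
        // deflation cannot swamp the thread doing the work.
        s.last_async_deflation_ms = now_ms;
        return true;
    }
    return false;
}
```

In this sketch the periodic caller (the ServiceThread in the description above) would poll is_async_deflation_needed() each time its timed wait returns, so the interval limit is enforced by the function itself and not by the caller.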
> >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Here's the full webrev URL: > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- > >> jdk13.full/ > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Here's the incremental webrev URL: > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- > >> jdk13.inc/ > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> I have not updated the OpenJDK wiki to > >>>>> reflect the CR4 > >>>>> >>>>>>>>>>>>> changes: > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> The wiki doesn't say a whole lot about the > >>>>> async deflation > >>>>> >>>>>>>>>>>>> invocation > >>>>> >>>>>>>>>>>>> mechanism so I have to figure out how to > >>>>> add that content. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-8] > >>>>> >>>>>>>>>>>>> testing on > >>>>> >>>>>>>>>>>>> Oracle's usual set of platforms. My > >>>>> Solaris-X64 stress kit > >>>>> >>>>>>>>>>>>> run is > >>>>> >>>>>>>>>>>>> running now. Kitchensink8H on product, > >>>>> fastdebug, and > >>>>> >>>>>>>>>>>>> slowdebug bits > >>>>> >>>>>>>>>>>>> are running on Linux-X64, MacOSX and > >>>>> Solaris-X64. I still > >>>>> >>>>>>>>>>>>> have to run > >>>>> >>>>>>>>>>>>> my stress kit on Linux-X64. I still have > >>>>> to run the > >>>>> >>>>>>>>>>>>> SPECjbb2015 > >>>>> >>>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX > >>>>> and Solaris-X64. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. 
Daugherty > wrote: > >>>>> >>>>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> I had some discussions with Karen about a > >>>>> race that was > >>>>> >>>>>>>>>>>>>> in the > >>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() code in > >>>>> CR2/v2.02/5-for-jdk13. > >>>>> >>>>>>>>>>>>>> This race was > >>>>> >>>>>>>>>>>>>> theoretical and I had no test failures > >>>>> due to it. The fix > >>>>> >>>>>>>>>>>>>> is pretty > >>>>> >>>>>>>>>>>>>> simple: remove the special case code for > >>>>> async deflation > >>>>> >>>>>>>>>>>>>> in the > >>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() function and rely > >>>>> solely on the > >>>>> >>>>>>>>>>>>>> ref_count > >>>>> >>>>>>>>>>>>>> for ObjectMonitor::enter() protection. > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> During those discussions Karen also > >>>>> floated the idea of > >>>>> >>>>>>>>>>>>>> using the > >>>>> >>>>>>>>>>>>>> ref_count field instead of the > >>>>> contentions field for the > >>>>> >>>>>>>>>>>>>> Async > >>>>> >>>>>>>>>>>>>> Monitor Deflation protocol. I decided to > >>>>> go ahead and > >>>>> >>>>>>>>>>>>>> code up that > >>>>> >>>>>>>>>>>>>> change and I have run it through the > >>>>> usual stress and > >>>>> >>>>>>>>>>>>>> Mach5 testing > >>>>> >>>>>>>>>>>>>> with no issues. It's also known as v2.03 > >>>>> (for those for > >>>>> >>>>>>>>>>>>>> with the > >>>>> >>>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for > >>>>> those with webrev > >>>>> >>>>>>>>>>>>>> URLs). > >>>>> >>>>>>>>>>>>>> Sorry for all the names... > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> The project is currently baselined on > >>>>> jdk-13+18. 
> >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> Here's the full webrev URL: > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- > >> jdk13.full/ > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> Here's the incremental webrev URL: > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- > >> jdk13.inc/ > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> I have also updated the OpenJDK wiki to > >>>>> reflect the CR3 > >>>>> >>>>>>>>>>>>>> changes: > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-8] > >>>>> >>>>>>>>>>>>>> testing on > >>>>> >>>>>>>>>>>>>> Oracle's usual set of platforms. My > >>>>> Solaris-X64 stress > >>>>> >>>>>>>>>>>>>> kit run had > >>>>> >>>>>>>>>>>>>> no issues. Kitchensink8H on product, > >>>>> fastdebug, and > >>>>> >>>>>>>>>>>>>> slowdebug bits > >>>>> >>>>>>>>>>>>>> had no failures on Linux-X64; MacOSX > >>>>> fastdebug and > >>>>> >>>>>>>>>>>>>> slowdebug and > >>>>> >>>>>>>>>>>>>> Solaris-X64 release had the usual "Too > >>>>> large time diff" > >>>>> >>>>>>>>>>>>>> complaints. > >>>>> >>>>>>>>>>>>>> 12 hour Inflate2 runs on product, > >>>>> fastdebug and slowdebug > >>>>> >>>>>>>>>>>>>> bits on > >>>>> >>>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no > >>>>> failures. My > >>>>> >>>>>>>>>>>>>> Linux-X64 > >>>>> >>>>>>>>>>>>>> stress kit is running right now. > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> I've done the SPECjbb2015 baseline and > >>>>> CR3 runs. I need > >>>>> >>>>>>>>>>>>>> to gather > >>>>> >>>>>>>>>>>>>> the results and analyze them. 
> >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty > >>>>> wrote: > >>>>> >>>>>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> I have a small but important bug fix for > >>>>> the Async > >>>>> >>>>>>>>>>>>>>> Monitor Deflation > >>>>> >>>>>>>>>>>>>>> project ready to go. It's also known as > >>>>> v2.02 (for those > >>>>> >>>>>>>>>>>>>>> for with the > >>>>> >>>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for > >>>>> those with > >>>>> >>>>>>>>>>>>>>> webrev URLs). Sorry > >>>>> >>>>>>>>>>>>>>> for all the names... > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two > >>>>> days ago so that > >>>>> >>>>>>>>>>>>>>> baseline patch > >>>>> >>>>>>>>>>>>>>> is out of our hair. > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> The project is currently baselined on > >>>>> jdk-13+17. 
> >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> Here's the full webrev URL: > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- > >> jdk13.full/ > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> Here's the incremental webrev URL > >>>>> (JDK-8153224): > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- > >> jdk13.inc/ > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> I still have to update the OpenJDK wiki > >>>>> to reflect the > >>>>> >>>>>>>>>>>>>>> CR2 changes: > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-6] > >>>>> >>>>>>>>>>>>>>> testing on > >>>>> >>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>>>> tier[7-8] is > >>>>> >>>>>>>>>>>>>>> running now. > >>>>> >>>>>>>>>>>>>>> My stress kit is running on Solaris-X64 > >>>>> now. > >>>>> >>>>>>>>>>>>>>> Kitchensink8H is running > >>>>> >>>>>>>>>>>>>>> now on product, fastdebug, and > slowdebug > >>>>> bits on > >>>>> >>>>>>>>>>>>>>> Linux-X64, MacOSX > >>>>> >>>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs > >>>>> are running now > >>>>> >>>>>>>>>>>>>>> on product, > >>>>> >>>>>>>>>>>>>>> fastdebug and slowdebug bits on > >>>>> Linux-X64, MacOSX and > >>>>> >>>>>>>>>>>>>>> Solaris-X64. > >>>>> >>>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 > >>>>> sometime on > >>>>> >>>>>>>>>>>>>>> Sunday (after > >>>>> >>>>>>>>>>>>>>> my jdk-13+18 stress run is done). > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 > >>>>> runs after all the > >>>>> >>>>>>>>>>>>>>> stress > >>>>> >>>>>>>>>>>>>>> testing is done. 
> >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty > >>>>> wrote: > >>>>> >>>>>>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> I finally have CR1 for the Async > >>>>> Monitor Deflation > >>>>> >>>>>>>>>>>>>>>> project ready to > >>>>> >>>>>>>>>>>>>>>> go. It's also known as v2.01 (for those > >>>>> for with the > >>>>> >>>>>>>>>>>>>>>> patches) and as > >>>>> >>>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with > >>>>> webrev URLs). Sorry > >>>>> >>>>>>>>>>>>>>>> for all the > >>>>> >>>>>>>>>>>>>>>> names... > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Main bug URL: > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Baseline bug fixes URL: > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> JDK-8222295 more baseline cleanups from > >>>>> Async > >>>>> >>>>>>>>>>>>>>>> Monitor Deflation project > >>>>> >>>>>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8222295 > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> The project is currently baselined on > >>>>> jdk-13+15. 
> >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Here's the webrev for the latest > >>>>> baseline changes > >>>>> >>>>>>>>>>>>>>>> (JDK-8222295): > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > >> jdk13.8222295 > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 > >>>>> only): > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > >> jdk13.full/ > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Here's the incremental webrev URL > >>>>> (JDK-8153224): > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- > >> jdk13.inc/ > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> So I'm looking for reviews for both > >>>>> JDK-8222295 and the > >>>>> >>>>>>>>>>>>>>>> latest version > >>>>> >>>>>>>>>>>>>>>> of JDK-8153224... > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> I still have to update the OpenJDK wiki > >>>>> to reflect the > >>>>> >>>>>>>>>>>>>>>> CR changes: > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> This version of the patch has been thru > >>>>> Mach5 tier[1-3] > >>>>> >>>>>>>>>>>>>>>> testing on > >>>>> >>>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 > >>>>> tier[4-6] is > >>>>> >>>>>>>>>>>>>>>> running now and > >>>>> >>>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. > >>>>> My stress kit > >>>>> >>>>>>>>>>>>>>>> on Solaris-X64 > >>>>> >>>>>>>>>>>>>>>> is running now. Linux-X64 stress > >>>>> testing will start on > >>>>> >>>>>>>>>>>>>>>> Sunday. 
I'm > >>>>> >>>>>>>>>>>>>>>> planning to do Kitchensink runs, > >>>>> SPECjbb2015 runs and > >>>>> >>>>>>>>>>>>>>>> my monitor > >>>>> >>>>>>>>>>>>>>>> inflation stress tests on Linux-X64, > >>>>> MacOSX and > >>>>> >>>>>>>>>>>>>>>> Solaris-X64. > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty > >>>>> wrote: > >>>>> >>>>>>>>>>>>>>>>> Greetings, > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread > >>>>> for my port of > >>>>> >>>>>>>>>>>>>>>>> Carsten's work on: > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong > >>>>> safepoints > >>>>> >>>>>>>>>>>>>>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that > >>>>> describes my port: > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> > >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Here's the webrev URL: > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for- > >> jdk13/ > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Here's a link to Carsten's original > >>>>> webrev: > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> > http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > >>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Earlier versions of this patch have > >>>>> been through > >>>>> >>>>>>>>>>>>>>>>> several rounds of > >>>>> >>>>>>>>>>>>>>>>> preliminary review. 
Many thanks to > >>>>> Carsten, Coleen, > >>>>> >>>>>>>>>>>>>>>>> Robbin, and > >>>>> >>>>>>>>>>>>>>>>> Roman for their preliminary code > >>>>> review comments. A > >>>>> >>>>>>>>>>>>>>>>> very special > >>>>> >>>>>>>>>>>>>>>>> thanks to Robbin and Roman for > >>>>> building and testing > >>>>> >>>>>>>>>>>>>>>>> the patch in > >>>>> >>>>>>>>>>>>>>>>> their own environments (including > >>>>> specJBB2015). > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> This version of the patch has been > >>>>> thru Mach5 > >>>>> >>>>>>>>>>>>>>>>> tier[1-8] testing on > >>>>> >>>>>>>>>>>>>>>>> Oracle's usual set of platforms. > >>>>> Earlier versions have > >>>>> >>>>>>>>>>>>>>>>> been run > >>>>> >>>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 > >>>>> and Solaris-X64 > >>>>> >>>>>>>>>>>>>>>>> servers > >>>>> >>>>>>>>>>>>>>>>> (product, fastdebug, > >>>>> slowdebug).Earlier versions have > >>>>> >>>>>>>>>>>>>>>>> run Kitchensink > >>>>> >>>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and > >>>>> Solaris-X64 > >>>>> >>>>>>>>>>>>>>>>> (product, fastdebug > >>>>> >>>>>>>>>>>>>>>>> and slowdebug). Earlier versions have > >>>>> run my monitor > >>>>> >>>>>>>>>>>>>>>>> inflation stress > >>>>> >>>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, > >>>>> Linux-X64 and > >>>>> >>>>>>>>>>>>>>>>> Solaris-X64 (product, > >>>>> >>>>>>>>>>>>>>>>> fastdebug and slowdebug). > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> All of the testing done on earlier > >>>>> versions will be > >>>>> >>>>>>>>>>>>>>>>> redone on the > >>>>> >>>>>>>>>>>>>>>>> latest version of the patch. > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Thanks, in advance, for any questions, > >>>>> comments or > >>>>> >>>>>>>>>>>>>>>>> suggestions. > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> Dan > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>>> P.S. 
> >>>>> >>>>>>>>>>>>>>>>> One subtest in > >>>>> >>>>>>>>>>>>>>>>> > >>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java > >>>>> >>>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on > >>>>> Win* only. I've > >>>>> >>>>>>>>>>>>>>>>> been trying > >>>>> >>>>>>>>>>>>>>>>> to characterize/analyze this failure > >>>>> for more than a > >>>>> >>>>>>>>>>>>>>>>> week now. At > >>>>> >>>>>>>>>>>>>>>>> this point I'm convinced that Async > >>>>> Monitor Deflation > >>>>> >>>>>>>>>>>>>>>>> is aggravating > >>>>> >>>>>>>>>>>>>>>>> an existing bug. However, I plan to > >>>>> have a better > >>>>> >>>>>>>>>>>>>>>>> handle on that > >>>>> >>>>>>>>>>>>>>>>> failure before these bits are pushed > >>>>> to the jdk/jdk repo. > >>>>> >>>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>>> > >>>>> >>>>>>>>>>>> > >>>>> >>>>>>>>>>> > >>>>> >>>>>>>>>> > >>>>> >>>>>>>>> > >>>>> >>>>>>>> > >>>>> >>>>>>> > >>>>> >>>>>> > >>>>> >>>>> > >>>>> >>>> > >>>>> >>> > >>>>> >> > >>>>> From daniel.daugherty at oracle.com Tue Sep 15 16:42:57 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 15 Sep 2020 12:42:57 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15) In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <029b596d-46e5-9fa7-38fd-c34d3a32987b@oracle.com> <54fd4f9c-afef-0819-de2d-b81b25fa6c22@oracle.com> <79bfb73a-20d7-7b85-4a84-dd22b150ed0d@oracle.com> <9fcc131c-dfbd-b943-381b-2ea8d854fcd7@oracle.com> <3b76e50f-fd8f-d61b-272a-27338df99094@oracle.com> <1d6a6087-b82d-96cf-bfb4-87cb03869bd6@oracle.com> <158ce5cc-3869-737b-b644-a82c962949cc@oracle.com> Message-ID: <2c3c5603-e924-e1bc-a22d-a1149b7947cb@oracle.com> Thanks for filing the bug. Dan On 9/15/20 12:32 PM, Doerr, Martin wrote: > Thank you, Dan! 
>
> I've created https://bugs.openjdk.java.net/browse/JDK-8253183
> Feel free to modify/assign.
>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Tuesday, 15 September 2020 16:59
>> To: Doerr, Martin; Carsten Varming; Erik Österlund
>> Cc: Roman Kennke; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15)
>>
>> Hi Martin,
>>
>> I believe that the support_IRIW_for_not_multiple_copy_atomic_cpu stuff
>> came from Erik O. so I'm adding him to this email thread.
>>
>> Yes, please create an issue that describes the problem and we'll
>> figure out who should take the issue...
>>
>> Dan
>>
>>
>> On 9/15/20 10:52 AM, Doerr, Martin wrote:
>>> Hi Dan and Carsten,
>>>
>>> I just noticed that this change introduced 2 usages of "support_IRIW_for_not_multiple_copy_atomic_cpu".
>>> I think this is incorrect for arm32 which is not multi-copy-atomic, but uses support_IRIW_for_not_multiple_copy_atomic_cpu = false.
>>> You probably meant "#ifdef CPU_MULTI_COPY_ATOMIC"?
>>>
>>> I haven't studied the access patterns you were trying to fix, but this looks wrong.
>>> Should I create an issue? Would be great if I could assign it to somebody familiar with this new code.
>>> Best regards,
>>> Martin
>>>
>>>
>>>> -----Original Message-----
>>>> From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Daniel D. Daugherty
>>>> Sent: Tuesday, 2 June 2020 21:25
>>>> To: Carsten Varming
>>>> Cc: Roman Kennke; hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints (CR14/v2.14/17-for-jdk15)
>>>>
>>>> Hi Carsten,
>>>>
>>>> Thanks for the fast review of the updated comments.
>>>>
>>>> I filed the following new bug to track the change:
>>>>
>>>>     JDK-8246359 clarify confusing comment in ObjectMonitor::EnterI()'s
>>>>                 race with async deflation
>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> And I started a review thread for the fix under that new bug ID.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 6/2/20 2:13 PM, Carsten Varming wrote:
>>>>> Hi Dan,
>>>>>
>>>>> I like the new comment. Thank you for doing the update.
>>>>>
>>>>> Carsten
>>>>>
>>>>> On Tue, Jun 2, 2020 at 1:54 PM Daniel D. Daugherty wrote:
>>>>> Hi Carsten,
>>>>>
>>>>> See replies below...
>>>>>
>>>>> David, Erik and Robbin, if you folks could also check out the revised
>>>>> comment below that would be appreciated.
>>>>>
>>>>>
>>>>> On 6/2/20 9:39 AM, Carsten Varming wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> See inline.
>>>>>>
>>>>>> On Mon, Jun 1, 2020 at 11:32 PM Daniel D. Daugherty wrote:
>>>>>>
>>>>>> Hi Carsten,
>>>>>>
>>>>>> Thanks for chiming in on this review thread!!
>>>>>>
>>>>>>
>>>>>> It is my pleasure. You know the code is solid when the discussion
>>>>>> is focused on the comments.
>>>>> So true, so very true!
>>>>>
>>>>>
>>>>>> On 6/1/20 10:41 PM, Carsten Varming wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> I like the new protocol, but I had to think about how the
>>>>>>> extra increment to _contentions replaced the check on _owner
>>>>>>> that I originally added.
>>>>>> Right. The check on _owner was described in detail in the
>>>>>> OpenJDK wiki subsection that was called "T-enter Wins By A-B-A".
>>>>>> It can still be found by going thru the wiki's history links.
>>>>>>
>>>>>> That subsection was renamed and rewritten and can be found here:
>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#AsyncMonitorDeflation-T-enterWinsByCancellationViaDEFLATER_MARKERSwap
>>>>>>> I am thinking that the increased _contention value is a
>>>>>>> little mark left on the ObjectMonitor to signal to the
>>>>>>> deflater thread (which must be in the middle of trying to
>>>>>>> acquire the object monitor as _owner was set to
>>>>>>> DEFLATER_MARKER) that the deflater thread lost the race.
>>>>>> That is exactly what the extra increment is being used for.
>>>>>>
>>>>>> In my reply to David H. that you quoted below, I describe the
>>>>>> progression of contention values thru the two possible race
>>>>>> scenarios. The progression shows the T-enter thread winning the
>>>>>> race and marking the contention field with the extra increment
>>>>>> while the T-deflater thread recognizes that it has lost the race
>>>>>> and unmarks the contention field with an extra decrement.
>>>>>>
>>>>>>
>>>>>> I noticed that. Looks like David and I were racing and David won. :)
>>>>>>
>>>>>>> That little mark stays with the object monitor long after
>>>>>>> the thread is done with the monitor.
>>>>>> The "little mark" stays with the ObjectMonitor after T-enter is
>>>>>> done entering until the T-deflater thread recognizes that the
>>>>>> async deflation was canceled and does an extra decrement. I don't
>>>>>> think I would describe it as "long after".
>>>>>>
>>>>>>
>>>>>> Sorry about the use of "long after". When I think about the
>>>>>> correctness of protocols, like the deflation protocol, I end up
>>>>>> thinking about sequences of instructions and the relevant
>>>>>> interleavings. In that context I often end up using phrases like
>>>>>> "long after" and "after" to mean anything after a particular
>>>>>> instruction.
I did not mean to imply anything about the relative >>>>>> speed of the execution of the code. >>>>> It's okay. I do something similar in the transaction diagrams that >>>>> I use to work out timing issues: ... >>>>> >>>>> The only point that I was trying to make is that the T-deflate thread >>>>> is responsible for cleaning up the extra mark and it's committed to >>>>> the code path that will result in the cleanup. Yes, there may be a delay >>>>> between the time that T-deflate recognizes that async >>>>> deflation was canceled and when T-deflate does the extra >> decrement, >>>>> but I don't see any harm in it. >>>>> >>>>> >>>>>>> It might be worth adding a comment to the code explaining >>>>>>> that after the increment, the _contention field can only be >>>>>>> set to 0 by a corresponding decrement in the async deflater >>>>>>> thread, ensuring that the >>>>>>> Atomic::cmpxchg(&mid->_contentions, (jint)0, -max_jint) on >>>>>>> line 2166 fails. In particular, the comment: >>>>>>> + // .... We bump contentions an >>>>>>> + // extra time to prevent the async deflater thread from >>>>>>> temporarily >>>>>>> + // changing it to -max_jint and back to zero (no flicker >>>>>>> to confuse >>>>>>> + // is_being_async_deflated() >>>>>>> confused me as after the deflater thread sets _contentions >>>>>>> to -max_jint, the deflater thread has won the race and the >>>>>>> object monitor is about to be deflated. >>>>>> For context, here's the code and comment being discussed: >>>>>> >>>>>>> 527 if (AsyncDeflateIdleMonitors && >>>>>>> 528 try_set_owner_from(DEFLATER_MARKER, Self) == >>>> DEFLATER_MARKER) { >>>>>>> 529 // Cancelled the in-progress async deflation. We bump >>>>>>> contentions an >>>>>>> 530 // extra time to prevent the async deflater thread from >>>>>>> temporarily >>>>>>> 531 // changing it to -max_jint and back to zero (no flicker >>>>>>> to confuse >>>>>>> 532 // is_being_async_deflated()).
The async deflater thread >>>>>>> will >>>>>>> 533 // decrement contentions after it recognizes that the async >>>>>>> 534 // deflation was cancelled. >>>>>>> 535 add_to_contentions(1); >>>>>> This part of the new comment: >>>>>> >>>>>>  532     // ...  The async deflater thread will >>>>>>  533     // decrement contentions after it recognizes that >>>>>> the async >>>>>>  534     // deflation was cancelled. >>>>>> >>>>>> makes it clear that the async deflater thread does the >>>>>> corresponding decrement >>>>>> to the increment done by the T-enter thread so that covers >>>>>> this part of your >>>>>> comment above: >>>>>> >>>>>>     the _contention field can only be set to 0 by a >>>>>> corresponding decrement >>>>>>     in the async deflater thread >>>>>> >>>>>> This part of the new comment: >>>>>> >>>>>>  529     // ...  We bump contentions an >>>>>>  530     // extra time to prevent the async deflater thread >>>>>> from temporarily >>>>>>  531     // changing it to -max_jint and back to zero (no >>>>>> flicker to confuse >>>>>>  532     // is_being_async_deflated()). >>>>>> >>>>>> makes it clear that we're keeping make-contentions-negative >>>>>> part of the >>>>>> async deflation protocol from happening so that covers this >>>>>> part of your >>>>>> comment above: >>>>>> >>>>>>     ensuring that the Atomic::cmpxchg(&mid->_contentions, >>>>>> (jint)0, -max_jint) >>>>>>     on line 2166 fails. >>>>>> >>>>>> This part of your comment above makes it clear where the >>>>>> confusion arises: >>>>>> >>>>>>     confused me as after the deflater thread sets >>>>>> _contentions to -max_jint, >>>>>>     the deflater thread has won the race and the object >>>>>> monitor is about to >>>>>>     be deflated.
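For readers who do not have the webrev open, the cancellation path quoted above can be modeled with plain std::atomic fields. This is only a rough sketch under stated assumptions, not the HotSpot code: the Monitor struct, the integer owner sentinels and the helper names deflater_claim, deflater_commit and enter_cancel_deflation are hypothetical stand-ins for the ObjectMonitor fields and functions named in this thread, and the deflater side is reduced to the two parts discussed here (set owner to the marker, then CAS contentions from 0 to -max_jint).

```cpp
#include <atomic>
#include <cassert>
#include <limits>

// Simplified model, NOT HotSpot code. All names below are hypothetical
// stand-ins for the ObjectMonitor fields and helpers discussed in the thread.
namespace sketch {

constexpr int kMaxJint = std::numeric_limits<int>::max();

// Owner sentinels (assumed encoding, for illustration only).
constexpr int kNoOwner = 0;         // models a NULL owner
constexpr int kDeflaterMarker = -1; // models DEFLATER_MARKER
constexpr int kSelf = 1;            // models the entering thread (Self)

struct Monitor {
  std::atomic<int> owner{kNoOwner};
  std::atomic<int> contentions{0};

  // Models is_being_async_deflated(): contentions < 0 is the linearization point.
  bool is_being_async_deflated() const { return contentions.load() < 0; }
};

// CAS on owner; returns true when owner was `from` and is now `to`.
inline bool try_set_owner_from(Monitor& m, int from, int to) {
  return m.owner.compare_exchange_strong(from, to);
}

// Deflater, part 1: claim an unowned monitor by installing the marker.
inline bool deflater_claim(Monitor& m) {
  return try_set_owner_from(m, kNoOwner, kDeflaterMarker);
}

// Deflater, part 2: try to turn a zero contentions field into -max_jint.
// Returns true when deflation is committed. When it loses, it restores the
// owner field; if an entering thread already replaced the marker, it performs
// the deferred decrement that pairs with that thread's extra increment.
inline bool deflater_commit(Monitor& m) {
  int expected = 0;
  if (m.contentions.compare_exchange_strong(expected, -kMaxJint)) {
    return true;
  }
  if (!try_set_owner_from(m, kDeflaterMarker, kNoOwner)) {
    m.contentions.fetch_sub(1); // deferred decrement for the cancelling enter
  }
  return false;
}

// Entering thread, contended path. Precondition (as in the thread): the
// caller has already incremented contentions, so it is positive here.
inline bool enter_cancel_deflation(Monitor& m) {
  assert(m.contentions.load() > 0);
  if (try_set_owner_from(m, kDeflaterMarker, kSelf)) {
    m.contentions.fetch_add(1); // extra increment; undone by the deflater
    return true;
  }
  return false;
}

} // namespace sketch
```

Stepping through it single-threaded (deflater claims, the entering thread increments and cancels, the deflater fails to commit and does the deferred decrement, the entering thread does its regular decrement) leaves contentions at 0 without it ever having been negative, which is the property the comment is trying to describe.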
>>>>>> >>>>>> Your original algorithm is a three-part async deflation protocol: >>>>>> >>>>>> Part 1 - set owner field to DEFLATER_MARKER >>>>>> Part 2 - make a zero contentions field -max_jint >>>>>> Part 3 - check to see if the owner field is still DEFLATER_MARKER >>>>>> >>>>>> If part 3 fails, then the contentions field that is currently >>>>>> negative >>>>>> has max_jint added to it to complete the bail out process. >>>>>> It's that >>>>>> third part that makes the contentions field flicker from: >>>>>> >>>>>>     0 -> -max_jint -> 0 >>>>>> >>>>>> And the extra contentions increment in the new two part >>>>>> protocol solves >>>>>> that flicker and allows us to treat (contentions < 0) as a >>>>>> linearization >>>>>> point. >>>>>> >>>>>> Please let me know if this clarifies your concern. >>>>>> >>>>>> >>>>>> I am no longer confused, but the cause of my confusion is still >>>>>> present in the comment. >>>>>> >>>>>> This group knows about the three part algorithm, but when the >>>>>> code is pushed there is no representation of the three part >>>>>> algorithm in the code or repository. >>>>> That's a really good point and a side effect of my living with this >>>>> code for a very long time... >>>>> >>>>> >>>>>> I forgot the details of the algorithm and read the latest version >>>>>> of the code to figure out what the flickering was about. As you >>>>>> would expect, I found that there is no way the code can cause the >>>>>> flicker mentioned. That made me worried. I started to question >>>>>> myself: What can cause the behavior that is described in the >>>>>> comments? What am I missing?
As a result, I think it is best if >>>>>> we keep the flickering to ourselves and update the comment to >>>>>> describe that because _owner was DEFLATER_MARKER the deflation >>>>>> thread must be in the middle of the protocol for deflating the >>>>>> object monitor, and in particular, incrementing _contentions >>>>>> ensures the failure of the final CAS in the deflation protocol >>>>>> (final in the protocol implemented in the code). >>>>> The above is a more clear expression of your concerns and I agree. >>>>> >>>>> >>>>>> To be clear: >>>>>> >>>>>> > 529 // Cancelled the in-progress async deflation. >>>>>> >>>>>> I would expand this comment by mentioning that the deflator >>>>>> thread cannot win the last part of the 2-part deflation protocol >>>>>> as 0 < _contentions (pre-condition to this method). >>>>>> >>>>>> > We bump contentions an >>>>>> > 530 // extra time to prevent the async deflater thread from >>>>>> temporarily >>>>>> > 531 // changing it to -max_jint and back to zero (no flicker to >>>>>> confuse >>>>>> > 532 // is_being_async_deflated()). >>>>>> >>>>>> I would replace this part with something along the lines of: We >>>>>> bump contentions an extra time to prevent the deflator thread >>>>>> from winning the last part of the (2-part) deflation protocol >>>>>> after this thread decrements _contentions as part of the release >>>>>> of the object monitor. >>>>>> >>>>>> > The async deflater thread will >>>>>> > 533 // decrement contentions after it recognizes that the async >>>>>> > 534 // deflation was cancelled. >>>>>> >>>>>> I would keep this part. >>>>> So here's my rewrite of the code and comment block: >>>>> >>>>>   if (AsyncDeflateIdleMonitors && >>>>>       try_set_owner_from(DEFLATER_MARKER, Self) == >>>> DEFLATER_MARKER) { >>>>>     // Cancelled the in-progress async deflation by changing owner >>>>> from >>>>>     // DEFLATER_MARKER to Self. As part of the contended enter >>>>> protocol, >>>>>
// contentions was incremented to a positive value before EnterI() >>>>>     // was called and that prevents the deflater thread from >>>>> winning the >>>>>     // last part of the 2-part async deflation protocol. After >>>>> EnterI() >>>>>     // returns to enter(), contentions is decremented because the >>>>> caller >>>>>     // now owns the monitor. We bump contentions an extra time here >> to >>>>>     // prevent the deflater thread from winning the last part of the >>>>>     // 2-part async deflation protocol after the regular decrement >>>>>     // occurs in enter(). The deflater thread will decrement >>>>> contentions >>>>>     // after it recognizes that the async deflation was cancelled. >>>>>     add_to_contentions(1); >>>>> >>>>> I've made this change to both places in EnterI() that had the original >>>>> confusing comment. >>>>> >>>>> Please let me know if this rewrite works for everyone. >>>>> >>>>> Since I've already pushed 8153224, I'll file a new bug to push this >>>>> clarification once we're all in agreement here. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> I hope this helps, >>>>>> Carsten >>>>>> >>>>>>> Otherwise, the code looks great. I am looking forward to >>>>>>> seeing it in the repo. >>>>>> Thanks! The code should be there soon. >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> Carsten >>>>>>> >>>>>>> On Mon, Jun 1, 2020 at 8:32 PM Daniel D. Daugherty >>>>>>> >>>>>> > wrote: >>>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>> On 6/1/20 7:58 PM, David Holmes wrote: >>>>>>> > Hi Dan, >>>>>>> > >>>>>>> > Sorry for the delay. >>>>>>> >>>>>>> No worries. It's always worth waiting for your code >>>>>>> review in general >>>>>>> and, with the complexity of this project, it's on my >>>>>>> must-do list! >>>>>>> >>>>>>> >>>>>>> > >>>>>>> > On 28/05/2020 3:20 am, Daniel D. Daugherty wrote: >>>>>>> >> Greetings, >>>>>>> >> >>>>>>> >> Erik O.
had an idea for changing the three part async >>>>>>> deflation protocol >>>>>>> >> into a two part async deflation protocol where the >>>>>>> second part (setting >>>>>>> >> the contentions field to -max_jint) is a >>>>>>> linearization point. I've taken >>>>>>> >> Erik's proposal (which was relative to >>>>>>> CR12/v2.12/15-for-jdk15), merged >>>>>>> >> it with CR13/v2.13/16-for-jdk15, and made a few minor >>>>>>> tweaks. >>>>>>> >> >>>>>>> >> I have attached the change list from CR13 to CR14 and >>>>>>> I've also added a >>>>>>> >> link to the CR13-to-CR14-changes file to the webrevs >>>>>>> so it should be >>>>>>> >> easy >>>>>>> >> to find. >>>>>>> >> >>>>>>> >> Main bug URL: >>>>>>> >> >>>>>>> >> ???? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >> >>>>>>> >> The project is currently baselined on jdk-15+24. >>>>>>> >> >>>>>>> >> Here's the full webrev URL for those folks that want >>>>>>> to see all of the >>>>>>> >> current Async Monitor Deflation code in one go (v2.14 >>>>>>> full): >>>>>>> >> >>>>>>> >> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- >>>> jdk15+24.v2.14.full/ >>>>>>> >> >>>>>>> >> >>>>>>> >> Some folks might want to see just what has changed >>>>>>> since the last review >>>>>>> >> cycle so here's a webrev for that (v2.14 inc): >>>>>>> >> >>>>>>> >> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/17-for- >>>> jdk15+24.v2.14.inc/ >>>>>>> > >>>>>>> > >>>>>>> > src/hotspot/share/runtime/synchronizer.cpp >>>>>>> > >>>>>>> > I'm having a little trouble keeping the _contentions >>>>>>> relationships in >>>>>>> > my head. In particular with this change I can't quite >>>>>>> grok the: >>>>>>> > >>>>>>> > // Deferred decrement for the JT EnterI() that >>>>>>> cancelled the async >>>>>>> > deflation. >>>>>>> > mid->add_to_contentions(-1); >>>>>>> > >>>>>>> > change. 
I kind of get EnterI() does an extra increment >>>>>>> and the >>>>>>> > deflator thread does the above matching decrement. But >>>>>>> given the two >>>>>>> > changes can happen in any order I'm not sure what the >>>>>>> possible visible >>>>>>> > values for _contentions will be and how that might >>>>>>> affect other code >>>>>>> > inspecting it? >>>>>>> >>>>>>> I have a sub-section in the OpenJDK wiki dedicated to >>>>>>> this particular race: >>>>>>> >>>>>>> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation#A >>>> syncMonitorDeflation-T- >>>> enterWinsByCancellationViaDEFLATER_MARKERSwap >>>>>>> In order for this race condition to manifest, the >>>>>>> T-enter thread has to >>>>>>> successfully swap the owner field's DEFLATER_MARKER >>>>>>> value for Self. That >>>>>>> swap will eventually cause the T-deflate thread to >>>>>>> realize that the async >>>>>>> deflation that it started has been canceled. >>>>>>> >>>>>>> The diagram shows the progression of contentions values: >>>>>>> >>>>>>> - ObjectMonitor box 1 shows contentions == 1 because >>>>>>> T-enter incremented >>>>>>> ?? the contentions field >>>>>>> >>>>>>> - ObjectMonitor box 2 shows contentions == 2 because >>>>>>> EnterI() did the >>>>>>> ?? extra increment. >>>>>>> >>>>>>> - ObjectMonitor box 3 shows contentions == 1 because >>>>>>> T-enter did the >>>>>>> ?? regular contentions decrement. >>>>>>> >>>>>>> - ObjectMonitor box 4 shows contentions == 0 because >>>>>>> T-deflate did the >>>>>>> ?? extra contentions decrement. >>>>>>> >>>>>>> Now it is possible for T-deflate to do the extra >>>>>>> decrement before T-enter >>>>>>> does the extra increment. If I were to add another >>>>>>> diagram to show that >>>>>>> variant of the race, that progression of contentions >>>>>>> values would be: >>>>>>> >>>>>>> - ObjectMonitor box 1 shows contentions == 1 because >>>>>>> T-enter incremented >>>>>>> ?? 
the contentions field >>>>>>> >>>>>>> - ObjectMonitor box 2 shows contentions == 0 because >>>>>>> T-deflate did the >>>>>>> ?? extra contentions decrement. >>>>>>> >>>>>>> - ObjectMonitor box 3 shows contentions == 1 because >>>>>>> EnterI() did the >>>>>>> ?? extra increment. >>>>>>> >>>>>>> - ObjectMonitor box 4 shows contentions == 0 because >>>>>>> T-enter did the >>>>>>> ?? regular contentions decrement. >>>>>>> >>>>>>> Notice that in this second scenario the contentions >>>>>>> field never goes >>>>>>> negative so there's nothing to confuse a potential caller of >>>>>>> is_being_async_deflated(): >>>>>>> >>>>>>> inline bool ObjectMonitor::is_being_async_deflated() { >>>>>>> ?? return AsyncDeflateIdleMonitors && contentions() < 0; >>>>>>> } >>>>>>> >>>>>>> It is not possible for T-deflate's extra decrement of >>>>>>> the contentions >>>>>>> field to make the contentions field negative. That >>>>>>> decrement only happens >>>>>>> when T-deflate detects that the async deflation has been >>>>>>> canceled and >>>>>>> async deflation can only be canceled after T-enter has >>>>>>> already made the >>>>>>> contentions field > 0. >>>>>>> >>>>>>> Please let me know if this resolves your concern about: >>>>>>> >>>>>>> > // Deferred decrement for the JT EnterI() that >>>>>>> cancelled the async >>>>>>> > deflation. >>>>>>> > mid->add_to_contentions(-1); >>>>>>> >>>>>>> I'm not planning to update the OpenJDK wiki to add a >>>>>>> second variant of >>>>>>> the cancellation race. Please let me know if that is okay. >>>>>>> >>>>>>> > >>>>>>> > But otherwise the changes in this version seem good >>>>>>> and overall the >>>>>>> > protocol seems simpler. >>>>>>> >>>>>>> This sounds like a thumbs up, but I'm looking for >>>>>>> something more definitive. >>>>>>> >>>>>>> >>>>>>> > I'm still going to spend some more time going over the >>>>>>> complete webrev >>>>>>> > to get a fuller sense of things. 
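The two orderings of the cancellation race described earlier in this message (extra increment first, or T-deflate's extra decrement first) can be replayed as plain arithmetic on a counter. The sketch below is illustrative only: replay_contentions and ever_negative are hypothetical helper names, and each delta simply stands for one of the four updates to the contentions field listed in the box-by-box progressions.

```cpp
#include <cassert>
#include <vector>

// Illustrative helpers only; these names are not from the HotSpot sources.

// Applies signed updates to a contentions counter that starts at 0 and
// returns the value observed after each update.
std::vector<int> replay_contentions(const std::vector<int>& deltas) {
  std::vector<int> values;
  int contentions = 0;
  for (int delta : deltas) {
    contentions += delta;
    values.push_back(contentions);
  }
  return values;
}

// True if any observed value would satisfy (contentions < 0), i.e. would
// make a check like is_being_async_deflated() report true.
bool ever_negative(const std::vector<int>& values) {
  for (int v : values) {
    if (v < 0) {
      return true;
    }
  }
  return false;
}
```

Feeding it {+1, +1, -1, -1} (T-enter increment, EnterI() extra increment, T-enter regular decrement, T-deflate extra decrement) gives the first progression, and {+1, -1, +1, -1} gives the variant where T-deflate decrements first. Neither sequence dips below zero.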
>>>>>>> >>>>>>> As always, if you find something after I've pushed, >>>>>>> we'll deal with it. >>>>>>> >>>>>>> Thanks for your many re-reviews for this project!! >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>> > >>>>>>> > Thanks, >>>>>>> > David >>>>>>> > >>>>>>> >> >>>>>>> >> >>>>>>> >> The OpenJDK wiki has been updated for v2.14. >>>>>>> >> >>>>>>> >> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >> >>>>>>> >> The jdk-15+24 based v2.14 version of the patch has >>>>>>> gone thru Mach5 >>>>>>> >> Tier[1-5] >>>>>>> >> testing with no related failures; Mach5 Tier[67] are >>>>>>> running now and >>>>>>> >> so far >>>>>>> >> have no related failures. I'll kick off Mach5 Tier8 >>>>>>> after the other >>>>>>> >> tiers >>>>>>> >> have finished since Mach5 is a bit busy right now. >>>>>>> >> >>>>>>> >> I'm also running my usual inflation stress testing on >>>>>>> Linux-X64 and >>>>>>> >> macOSX >>>>>>> >> and so far there are no issues. >>>>>>> >> >>>>>>> >> Thanks, in advance, for any questions, comments or >>>>>>> suggestions. >>>>>>> >> >>>>>>> >> Dan >>>>>>> >> >>>>>>> >> >>>>>>> >> On 5/21/20 2:53 PM, Daniel D. Daugherty wrote: >>>>>>> >>> Greetings, >>>>>>> >>> >>>>>>> >>> I have made changes to the Async Monitor Deflation >>>>>>> code in response to >>>>>>> >>> the CR12/v2.12/15-for-jdk15 code review cycle. >>>>>>> Thanks to David H. and >>>>>>> >>> Erik O. for their OpenJDK reviews in the v2.12 round! >>>>>>> >>> >>>>>>> >>> I have attached the change list from CR12 to CR13 >>>>>>> and I've also added a >>>>>>> >>> link to the CR12-to-CR13-changes file to the webrevs >>>>>>> so it should be >>>>>>> >>> easy >>>>>>> >>> to find. >>>>>>> >>> >>>>>>> >>> Main bug URL: >>>>>>> >>> >>>>>>> >>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>> >>>>>>> >>> The project is currently baselined on jdk-15+24. 
>>>>>>> >>> >>>>>>> >>> Here's the full webrev URL for those folks that want >>>>>>> to see all of the >>>>>>> >>> current Async Monitor Deflation code in one go >>>>>>> (v2.13 full): >>>>>>> >>> >>>>>>> >>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- >>>> jdk15%2b24.v2.13.full/ >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> Some folks might want to see just what has changed >>>>>>> since the last >>>>>>> >>> review >>>>>>> >>> cycle so here's a webrev for that (v2.13 inc): >>>>>>> >>> >>>>>>> >>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/16-for- >>>> jdk15%2b24.v2.13.inc/ >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> >>> The OpenJDK wiki is currently at v2.13 and might >>>>>>> require minor >>>>>>> >>> tweaks for v2.12 >>>>>>> >>> and v2.13. Yes, I need to make yet another crawl >>>>>>> thru review of it... >>>>>>> >>> >>>>>>> >>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>> >>>>>>> >>> The jdk-15+24 based v2.13 version of the patch is >>>>>>> going thru the usual >>>>>>> >>> Mach5 testing right now. It is also going thru my >>>>>>> usual inflation >>>>>>> >>> stress >>>>>>> >>> testing on Linux-X64 and macOSX. >>>>>>> >>> >>>>>>> >>> Thanks, in advance, for any questions, comments or >>>>>>> suggestions. >>>>>>> >>> >>>>>>> >>> Dan >>>>>>> >>> >>>>>>> >>> On 5/14/20 5:40 PM, Daniel D. Daugherty wrote: >>>>>>> >>>> Greetings, >>>>>>> >>>> >>>>>>> >>>> I have made changes to the Async Monitor Deflation >>>>>>> code in response to >>>>>>> >>>> the CR11/v2.11/14-for-jdk15 code review cycle. >>>>>>> Thanks to David H., >>>>>>> >>>> Erik O., >>>>>>> >>>> and Robbin for their OpenJDK reviews in the v2.11 >>>>>>> round! >>>>>>> >>>> >>>>>>> >>>> I have attached the change list from CR11 to CR12 >>>>>>> and I've also >>>>>>> >>>> added a >>>>>>> >>>> link to the CR11-to-CR12-changes file to the >>>>>>> webrevs so it should >>>>>>> >>>> be easy >>>>>>> >>>> to find. 
>>>>>>> >>>> >>>>>>> >>>> Main bug URL: >>>>>>> >>>> >>>>>>> >>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>> >>>>>>> >>>> The project is currently baselined on jdk-15+23. >>>>>>> >>>> >>>>>>> >>>> Here's the full webrev URL for those folks that >>>>>>> want to see all of the >>>>>>> >>>> current Async Monitor Deflation code in one go >>>>>>> (v2.12 full): >>>>>>> >>>> >>>>>>> >>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- >>>> jdk15%2b23.v2.12.full/ >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> Some folks might want to see just what has changed >>>>>>> since the last >>>>>>> >>>> review >>>>>>> >>>> cycle so here's a webrev for that (v2.12 inc): >>>>>>> >>>> >>>>>>> >>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/15-for- >>>> jdk15%2b23.v2.12.inc/ >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> The OpenJDK wiki is currently at v2.11 and might >>>>>>> require minor >>>>>>> >>>> tweaks for v2.12: >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>> >>>>>>> >>>> The jdk-15+23 based v2.12 version of the patch is >>>>>>> going thru the usual >>>>>>> >>>> Mach5 testing right now. >>>>>>> >>>> >>>>>>> >>>> Thanks, in advance, for any questions, comments or >>>>>>> suggestions. >>>>>>> >>>> >>>>>>> >>>> Dan >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> On 5/7/20 1:08 PM, Daniel D. Daugherty wrote: >>>>>>> >>>>> Greetings, >>>>>>> >>>>> >>>>>>> >>>>> I have made changes to the Async Monitor Deflation >>>>>>> code in >>>>>>> >>>>> response to >>>>>>> >>>>> the CR10/v2.10/13-for-jdk15 code review cycle and >>>>>>> DaCapo-h2 perf >>>>>>> >>>>> testing. >>>>>>> >>>>> Thanks to Erik O., Robbin and David H. for their >>>>>>> OpenJDK reviews >>>>>>> >>>>> in the >>>>>>> >>>>> v2.10 round! Thanks to Eric C. 
for his help in >>>>>>> isolating the >>>>>>> >>>>> DaCapo-h2 >>>>>>> >>>>> performance regression. >>>>>>> >>>>> >>>>>>> >>>>> With the removal of ref_counting and the >>>>>>> ObjectMonitorHandle >>>>>>> >>>>> class, the >>>>>>> >>>>> Async Monitor Deflation project is now closer to >>>>>>> Carsten's original >>>>>>> >>>>> prototype. While ref_counting gave us >>>>>>> ObjectMonitor* safety >>>>>>> >>>>> enforced by >>>>>>> >>>>> code, I saw a ~22.8% slow down with >>>>>>> -XX:-AsyncDeflateIdleMonitors >>>>>>> >>>>> ("off" >>>>>>> >>>>> mode). The slow down with "on" mode >>>>>>> -XX:+AsyncDeflateIdleMonitors >>>>>>> >>>>> is ~17%. >>>>>>> >>>>> >>>>>>> >>>>> I have attached the change list from CR10 to CR11 >>>>>>> instead of >>>>>>> >>>>> putting it in >>>>>>> >>>>> the body of this email. I've also added a link to the >>>>>>> >>>>> CR10-to-CR11-changes >>>>>>> >>>>> file to the webrevs so it should be easy to find. >>>>>>> >>>>> >>>>>>> >>>>> Main bug URL: >>>>>>> >>>>> >>>>>>> >>>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>> >>>>>>> >>>>> The project is currently baselined on jdk-15+21. 
>>>>>>> >>>>> >>>>>>> >>>>> Here's the full webrev URL for those folks that >>>>>>> want to see all of >>>>>>> >>>>> the >>>>>>> >>>>> current Async Monitor Deflation code in one go >>>>>>> (v2.11 full): >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- >>>> jdk15%2b21.v2.11.full/ >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> Some folks might want to see just what has changed >>>>>>> since the last >>>>>>> >>>>> review >>>>>>> >>>>> cycle so here's a webrev for that (v2.11 inc): >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/14-for- >>>> jdk15%2b21.v2.11.inc/ >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> Because of the removal of ref_counting and the >>>>>>> ObjectMonitorHandle >>>>>>> >>>>> class, the >>>>>>> >>>>> incremental webrev is a bit noisier than I would >>>>>>> have preferred. >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> The OpenJDK wiki has NOT YET been updated for this >>>>>>> round of changes: >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>> >>>>>>> >>>>> The jdk-15+21 based v2.11 version of the patch has >>>>>>> been thru Mach5 >>>>>>> >>>>> tier[1-6] >>>>>>> >>>>> testing on Oracle's usual set of platforms. Mach5 >>>>>>> tier[78] are >>>>>>> >>>>> still running. >>>>>>> >>>>> I'm running the v2.11 patch through my usual set >>>>>>> of stress testing on >>>>>>> >>>>> Linux-X64 and macOSX. >>>>>>> >>>>> >>>>>>> >>>>> I'm planning to do a SPECjbb2015, DaCapo-h2 and >>>>>>> volano round on the >>>>>>> >>>>> CR11/v2.11/14-for-jdk15 bits. >>>>>>> >>>>> >>>>>>> >>>>> Thanks, in advance, for any questions, comments or >>>>>>> suggestions. >>>>>>> >>>>> >>>>>>> >>>>> Dan >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> On 2/26/20 5:22 PM, Daniel D. 
Daugherty wrote: >>>>>>> >>>>>> Greetings, >>>>>>> >>>>>> >>>>>>> >>>>>> I have made changes to the Async Monitor >>>>>>> Deflation code in >>>>>>> >>>>>> response to >>>>>>> >>>>>> the CR9/v2.09/12-for-jdk14 code review cycle. >>>>>>> Thanks to Robbin >>>>>>> >>>>>> and Erik O. >>>>>>> >>>>>> for their comments in this round! >>>>>>> >>>>>> >>>>>>> >>>>>> With the extraction and push of >>>>>>> {8235931,8236035,8235795} to >>>>>>> >>>>>> JDK15, the >>>>>>> >>>>>> Async Monitor Deflation code is back to "just" >>>>>>> async deflation >>>>>>> >>>>>> changes! >>>>>>> >>>>>> >>>>>>> >>>>>> I have attached the change list from CR9 to CR10 >>>>>>> instead of >>>>>>> >>>>>> putting it in >>>>>>> >>>>>> the body of this email. I've also added a link to >>>>>>> the >>>>>>> >>>>>> CR9-to-CR10-changes >>>>>>> >>>>>> file to the webrevs so it should be easy to find. >>>>>>> >>>>>> >>>>>>> >>>>>> Main bug URL: >>>>>>> >>>>>> >>>>>>> >>>>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>> >>>>>>> >>>>>> The project is currently baselined on jdk-15+11. 
>>>>>>> >>>>>> >>>>>>> >>>>>> Here's the full webrev URL for those folks that >>>>>>> want to see all >>>>>>> >>>>>> of the >>>>>>> >>>>>> current Async Monitor Deflation code in one go >>>>>>> (v2.10 full): >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- >>>> jdk15+11.v2.10.full/ >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> >>>>>> Some folks might want to see just what has >>>>>>> changed since the last >>>>>>> >>>>>> review >>>>>>> >>>>>> cycle so here's a webrev for that (v2.10 inc): >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/13-for- >>>> jdk15+11.v2.10.inc/ >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> >>>>>> Since we backed out the >>>>>>> HandshakeAfterDeflateIdleMonitors option >>>>>>> >>>>>> and the >>>>>>> >>>>>> C2 ref_count changes and updated the copyright >>>>>>> years, the "inc" >>>>>>> >>>>>> webrev has >>>>>>> >>>>>> a bit more noise in it than usual. Sorry about that! >>>>>>> >>>>>> >>>>>>> >>>>>> The OpenJDK wiki has been updated for this round >>>>>>> of changes: >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> >>>>>> The jdk-15+11 based v2.10 version of the patch >>>>>>> has been thru >>>>>>> >>>>>> Mach5 tier[1-7] >>>>>>> >>>>>> testing on Oracle's usual set of platforms. Mach5 >>>>>>> tier8 is still >>>>>>> >>>>>> running. >>>>>>> >>>>>> I'm running the v2.10 patch through my usual set >>>>>>> of stress >>>>>>> >>>>>> testing on >>>>>>> >>>>>> Linux-X64 and macOSX. >>>>>>> >>>>>> >>>>>>> >>>>>> I'm planning to do a SPECjbb2015 round on the >>>>>>> >>>>>> CR10/v2.20/13-for-jdk15 bits. >>>>>>> >>>>>> >>>>>>> >>>>>> Thanks, in advance, for any questions, comments >>>>>>> or suggestions. >>>>>>> >>>>>> >>>>>>> >>>>>> Dan >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> >>>>>> On 2/4/20 9:41 AM, Daniel D. 
Daugherty wrote: >>>>>>> >>>>>>> Greetings, >>>>>>> >>>>>>> >>>>>>> >>>>>>> This project is no longer targeted to JDK14 so >>>>>>> this is NOT an >>>>>>> >>>>>>> urgent code >>>>>>> >>>>>>> review request. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I've extracted the following three fixes from >>>>>>> the Async Monitor >>>>>>> >>>>>>> Deflation >>>>>>> >>>>>>> project code: >>>>>>> >>>>>>> >>>>>>> >>>>>>> ? ? JDK-8235931 add OM_CACHE_LINE_SIZE and use >>>>>>> smaller size on >>>>>>> >>>>>>> SPARCv9 and X64 >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235931 >>>>>>> >>>>>>> >>>>>>> >>>>>>> ? ? JDK-8236035 refactor >>>>>>> ObjectMonitor::set_owner() and _owner >>>>>>> >>>>>>> field setting >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8236035 >>>>>>> >>>>>>> >>>>>>> >>>>>>> ? ? JDK-8235795 replace monitor list >>>>>>> >>>>>>> mux{Acquire,Release}(&gListLock) with spin locks >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8235795 >>>>>>> >>>>>>> >>>>>>> >>>>>>> Each of these has been reviewed separately and >>>>>>> will be pushed to >>>>>>> >>>>>>> JDK15 >>>>>>> >>>>>>> in the near future (possibly by the end of this >>>>>>> week). Of >>>>>>> >>>>>>> course, there >>>>>>> >>>>>>> were improvements during these review cycles and >>>>>>> the purpose of >>>>>>> >>>>>>> this >>>>>>> >>>>>>> e-mail is to provided updated webrevs for this fix >>>>>>> >>>>>>> (CR9/v2.09/12-for-jdk14) >>>>>>> >>>>>>> within the revised context provided by {8235931, >>>>>>> 8236035, 8235795}. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Main bug URL: >>>>>>> >>>>>>> >>>>>>> >>>>>>> ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>> >>>>>>> >>>>>>> The project is currently baselined on jdk-14+34. 
>>>>>>> >>>>>>> >>>>>>> >>>>>>> Here's the full webrev URL for those folks that >>>>>>> want to see all >>>>>>> >>>>>>> of the >>>>>>> >>>>>>> current Async Monitor Deflation code along with >>>>>>> {8235931, >>>>>>> >>>>>>> 8236035, 8235795} >>>>>>> >>>>>>> in one go (v2.09b full): >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >>>> jdk14.v2.09b.full/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Compare the open.patch file in >>>>>>> 12-for-jdk14.v2.09.full and >>>>>>> >>>>>>> 12-for-jdk14.v2.09b.full >>>>>>> >>>>>>> using your favorite file comparison/merge tool >>>>>>> to see how Async >>>>>>> >>>>>>> Monitor Deflation >>>>>>> >>>>>>> evolved due to {8235931, 8236035, 8235795}. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Some folks might want to see just the Async >>>>>>> Monitor Deflation >>>>>>> >>>>>>> code on top of >>>>>>> >>>>>>> {8235931, 8236035, 8235795} so here's a webrev >>>>>>> for that (v2.09b >>>>>>> >>>>>>> inc): >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >>>> jdk14.v2.09b.inc/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> These webrevs have gone thru several Mach5 >>>>>>> Tier[1-8] runs along >>>>>>> >>>>>>> with >>>>>>> >>>>>>> my usual stress testing and SPECjbb2015 testing >>>>>>> and there aren't >>>>>>> >>>>>>> any >>>>>>> >>>>>>> surprises relative to CR9/v2.09/12-for-jdk14. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, in advance, for any questions, comments >>>>>>> or suggestions. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 12/11/19 3:41 PM, Daniel D. Daugherty wrote: >>>>>>> >>>>>>>> Greetings, >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> I have made changes to the Async Monitor >>>>>>> Deflation code in >>>>>>> >>>>>>>> response to >>>>>>> >>>>>>>> the CR8/v2.08/11-for-jdk14 code review cycle. >>>>>>> Thanks to David >>>>>>> >>>>>>>> H., Robbin >>>>>>> >>>>>>>> and Erik O. for their comments! 
>>>>>>> >>>>>>>> >>>>>>> >>>>>>>> This project is no longer targeted to JDK14 so >>>>>>> this is NOT an >>>>>>> >>>>>>>> urgent code >>>>>>> >>>>>>>> review request. The primary purpose of this >>>>>>> webrev is simply to >>>>>>> >>>>>>>> close the >>>>>>> >>>>>>>> CR8/v2.08/11-for-jdk14 code review loop and to >>>>>>> let folks see >>>>>>> >>>>>>>> how I resolved >>>>>>> >>>>>>>> the code review comments from that round. >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Most of the comments in the >>>>>>> CR8/v2.08/11-for-jdk14 code review >>>>>>> >>>>>>>> cycle were >>>>>>> >>>>>>>> on the monitor list changes so I'm going to >>>>>>> take a look at >>>>>>> >>>>>>>> extracting those >>>>>>> >>>>>>>> changes into a standalone patch. Switching from >>>>>>> >>>>>>>> Thread::muxAcquire(&gListLock) >>>>>>> >>>>>>>> and Thread::muxRelease(&gListLock) to finer >>>>>>> grained internal >>>>>>> >>>>>>>> spin locks needs >>>>>>> >>>>>>>> to be thoroughly reviewed and the best way to >>>>>>> do that is >>>>>>> >>>>>>>> separately from the >>>>>>> >>>>>>>> Async Monitor Deflation changes. Thanks to >>>>>>> Coleen for >>>>>>> >>>>>>>> suggesting doing this >>>>>>> >>>>>>>> extraction earlier. >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> I have attached the change list from CR8 to CR9 >>>>>>> instead of >>>>>>> >>>>>>>> putting it in >>>>>>> >>>>>>>> the body of this email. I've also added a link >>>>>>> to the >>>>>>> >>>>>>>> CR8-to-CR9-changes >>>>>>> >>>>>>>> file to the webrevs so it should be easy to find. >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Main bug URL: >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> The project is currently baselined on jdk-14+26. 
>>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Here's the full webrev URL for those folks that >>>>>>> want to see all >>>>>>> >>>>>>>> of the >>>>>>> >>>>>>>> current Async Monitor Deflation code in one go >>>>>>> (v2.09 full): >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >>>> jdk14.v2.09.full/ >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Some folks might want to see just what has >>>>>>> changed since the >>>>>>> >>>>>>>> last review >>>>>>> >>>>>>>> cycle so here's a webrev for that (v2.09 inc): >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/12-for- >>>> jdk14.v2.09.inc/ >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> The OpenJDK wiki has NOT yet been updated for >>>>>>> this round of >>>>>>> >>>>>>>> changes: >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> The jdk-14+26 based v2.09 version of the patch >>>>>>> has been thru >>>>>>> >>>>>>>> Mach5 tier[1-7] >>>>>>> >>>>>>>> testing on Oracle's usual set of platforms. >>>>>>> Mach5 tier8 is >>>>>>> >>>>>>>> still running. >>>>>>> >>>>>>>> A slightly older version of the v2.09 patch has >>>>>>> also been >>>>>>> >>>>>>>> through my usual >>>>>>> >>>>>>>> set of stress testing on Linux-X64 and macOSX >>>>>>> with the addition >>>>>>> >>>>>>>> of Robbin's >>>>>>> >>>>>>>> "MoCrazy 1024" test running in parallel on >>>>>>> Linux-X64 with the >>>>>>> >>>>>>>> other tests in >>>>>>> >>>>>>>> my lab. The "MoCrazy 1024" has been going for > >>>>>>> 5 days and >>>>>>> >>>>>>>> 6700+ iterations >>>>>>> >>>>>>>> without any failures. >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> I'm planning to do a SPECjbb2015 round on the >>>>>>> >>>>>>>> CR9/v2.09/12-for-jdk14 bits. >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Thanks, in advance, for any questions, comments >>>>>>> or suggestions. 
>>>>>>> >>>>>>>> >>>>>>> >>>>>>>> Dan >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>>>>>> On 11/4/19 4:03 PM, Daniel D. Daugherty wrote: >>>>>>> >>>>>>>>> Greetings, >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> I have made changes to the Async Monitor >>>>>>> Deflation code in >>>>>>> >>>>>>>>> response to >>>>>>> >>>>>>>>> the CR7/v2.07/10-for-jdk14 code review cycle. >>>>>>> Thanks to David >>>>>>> >>>>>>>>> H., Robbin >>>>>>> >>>>>>>>> and Erik O. for their comments! >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> JDK14 Rampdown phase one is coming on Dec. 12, >>>>>>> 2019 and the >>>>>>> >>>>>>>>> Async Monitor >>>>>>> >>>>>>>>> Deflation project needs to push before Nov. >>>>>>> 12, 2019 in order >>>>>>> >>>>>>>>> to allow >>>>>>> >>>>>>>>> for sufficient bake time for such a big >>>>>>> change. Nov. 12 is >>>>>>> >>>>>>>>> _next_ Tuesday >>>>>>> >>>>>>>>> so we have 8 days from today to finish this >>>>>>> code review cycle >>>>>>> >>>>>>>>> and push >>>>>>> >>>>>>>>> this code for JDK14. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Carsten and Roman! Time for you guys to chime >>>>>>> in again on the >>>>>>> >>>>>>>>> code reviews. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> I have attached the change list from CR7 to >>>>>>> CR8 instead of >>>>>>> >>>>>>>>> putting it in >>>>>>> >>>>>>>>> the body of this email. I've also added a link >>>>>>> to the >>>>>>> >>>>>>>>> CR7-to-CR8-changes >>>>>>> >>>>>>>>> file to the webrevs so it should be easy to find. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8153224 >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> The project is currently baselined on jdk-14+21. 
>>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Here's the full webrev URL for those folks >>>>>>> that want to see >>>>>>> >>>>>>>>> all of the >>>>>>> >>>>>>>>> current Async Monitor Deflation code in one go >>>>>>> (v2.08 full): >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- >>>> jdk14.v2.08.full >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Some folks might want to see just what has >>>>>>> changed since the >>>>>>> >>>>>>>>> last review >>>>>>> >>>>>>>>> cycle so here's a webrev for that (v2.08 inc): >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/11-for- >>>> jdk14.v2.08.inc/ >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> The OpenJDK wiki did not need any changes for >>>>>>> this round: >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> The jdk-14+21 based v2.08 version of the patch >>>>>>> has been thru >>>>>>> >>>>>>>>> Mach5 tier[1-8] >>>>>>> >>>>>>>>> testing on Oracle's usual set of platforms. It >>>>>>> has also been >>>>>>> >>>>>>>>> through my usual >>>>>>> >>>>>>>>> set of stress testing on Linux-X64, macOSX and >>>>>>> Solaris-X64 >>>>>>> >>>>>>>>> with the addition >>>>>>> >>>>>>>>> of Robbin's "MoCrazy 1024" test running in >>>>>>> parallel with the >>>>>>> >>>>>>>>> other tests in >>>>>>> >>>>>>>>> my lab. Some testing is still running, but so >>>>>>> far there are no >>>>>>> >>>>>>>>> new regressions. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> I have not yet done a SPECjbb2015 round on the >>>>>>> >>>>>>>>> CR8/v2.08/11-for-jdk14 bits. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or suggestions. >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> Dan >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> On 10/17/19 5:50 PM, Daniel D. 
Daugherty wrote: >>>>>>> >>>>>>>>>> Greetings, >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> The Async Monitor Deflation project is >>>>>>> reaching the end game. >>>>>>> >>>>>>>>>> I have no >>>>>>> >>>>>>>>>> changes planned for the project at this time >>>>>>> so all that is >>>>>>> >>>>>>>>>> left is code >>>>>>> >>>>>>>>>> review and any changes that results from >>>>>>> those reviews. >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> Carsten and Roman! Time for you guys to chime >>>>>>> in again on the >>>>>>> >>>>>>>>>> code reviews. >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> I have attached the list of fixes from CR6 to >>>>>>> CR7 instead of >>>>>>> >>>>>>>>>> putting it >>>>>>> >>>>>>>>>> in the main body of this email. >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >> 8153224 >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> The project is currently baselined on jdk-14+19. 
>>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> Here's the full webrev URL for those folks >>>>>>> that want to see >>>>>>> >>>>>>>>>> all of the >>>>>>> >>>>>>>>>> current Async Monitor Deflation code in one >>>>>>> go (v2.07 full): >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- >>>> jdk14.v2.07.full >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> Some folks might want to see just what has >>>>>>> changed since the >>>>>>> >>>>>>>>>> last review >>>>>>> >>>>>>>>>> cycle so here's a webrev for that (v2.07 inc): >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/10-for- >>>> jdk14.v2.07.inc/ >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> The OpenJDK wiki has been updated to match the >>>>>>> >>>>>>>>>> CR7/v2.07/10-for-jdk14 changes: >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> The jdk-14+18 based v2.07 version of the >>>>>>> patch has been thru >>>>>>> >>>>>>>>>> Mach5 tier[1-8] >>>>>>> >>>>>>>>>> testing on Oracle's usual set of platforms. >>>>>>> It has also been >>>>>>> >>>>>>>>>> through my usual >>>>>>> >>>>>>>>>> set of stress testing on Linux-X64, macOSX >>>>>>> and Solaris-X64 >>>>>>> >>>>>>>>>> with the addition >>>>>>> >>>>>>>>>> of Robbin's "MoCrazy 1024" test running in >>>>>>> parallel with the >>>>>>> >>>>>>>>>> other tests in >>>>>>> >>>>>>>>>> my lab. >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> The jdk-14+19 based v2.07 version of the >>>>>>> patch has been thru >>>>>>> >>>>>>>>>> Mach5 tier[1-3] >>>>>>> >>>>>>>>>> test on Oracle's usual set of platforms. >>>>>>> Mach5 tier[4-8] are >>>>>>> >>>>>>>>>> in process. 
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> I did another round of SPECjbb2015 testing in Oracle's Aurora
>>>>>>> >>>>>>>>>> Performance lab using their tuned SPECjbb2015 Linux-X64 G1 configs:
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> - "base" is jdk-14+18
>>>>>>> >>>>>>>>>> - "v2.07" is the latest version and includes C2 inc_om_ref_count()
>>>>>>> >>>>>>>>>>   support on LP64 X64 and the new HandshakeAfterDeflateIdleMonitors
>>>>>>> >>>>>>>>>>   option
>>>>>>> >>>>>>>>>> - "off" is with -XX:-AsyncDeflateIdleMonitors specified
>>>>>>> >>>>>>>>>> - "handshake" is with -XX:+HandshakeAfterDeflateIdleMonitors specified
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>      hbIR          hbIR
>>>>>>> >>>>>>>>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>>>>>> >>>>>>>>>> ---------------  ---------  --------  -------------  -------
>>>>>>> >>>>>>>>>>        34282.00   30635.90  28831.30       20969.20  3841.30  base
>>>>>>> >>>>>>>>>>        34282.00   30973.00  29345.80       21025.20  3964.10  v2.07
>>>>>>> >>>>>>>>>>        34282.00   31105.60  29174.30       21074.00  3931.30  v2.07_handshake
>>>>>>> >>>>>>>>>>        34282.00   30789.70  27151.60       19839.10  3850.20  v2.07_off
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> - The Aurora Perf comparison tool reports:
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>     Comparison              max-jOPS              critical-jOPS
>>>>>>> >>>>>>>>>>     ----------------------  --------------------  --------------------
>>>>>>> >>>>>>>>>>     base vs 2.07            +1.78% (s, p=0.000)   +0.27% (ns, p=0.790)
>>>>>>> >>>>>>>>>>     base vs 2.07_handshake  +1.19% (s, p=0.007)   +0.58% (ns, p=0.536)
>>>>>>> >>>>>>>>>>     base vs 2.07_off        -5.83% (ns, p=0.394)  -5.39% (ns, p=0.347)
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>     (s) - significant  (ns) - not-significant
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> - For historical comparison, the Aurora Perf comparison tool
>>>>>>> >>>>>>>>>>   reported for v2.06 with a baseline of jdk-13+31:
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>     Comparison              max-jOPS              critical-jOPS
>>>>>>> >>>>>>>>>>     ----------------------  --------------------  --------------------
>>>>>>> >>>>>>>>>>     base vs 2.06            -0.32% (ns, p=0.345)  +0.71% (ns, p=0.646)
>>>>>>> >>>>>>>>>>     base vs 2.06_off        +0.49% (ns, p=0.292)  -1.21% (ns, p=0.481)
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>     (s) - significant  (ns) - not-significant
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> Dan
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> On 8/28/19 5:02 PM, Daniel D. Daugherty wrote:
>>>>>>> >>>>>>>>>>> Greetings,
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> The Async Monitor Deflation project has rebased to JDK14 so it's time
>>>>>>> >>>>>>>>>>> for our first code review in that new context!!
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> I've been focused on changing the monitor list management code to be
>>>>>>> >>>>>>>>>>> lock-free in order to make SPECjbb2015 happier. Of course with a change
>>>>>>> >>>>>>>>>>> like that, it takes a while to chase down all the new and wonderful
>>>>>>> >>>>>>>>>>> races. At this point, I have the code back to the same stability that
>>>>>>> >>>>>>>>>>> I had with CR5/v2.05/8-for-jdk13.
>>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> To lay the ground work for this round of >>>>>>> review, I pushed >>>>>>> >>>>>>>>>>> the following >>>>>>> >>>>>>>>>>> two fixes to jdk/jdk earlier today: >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> ??? JDK-8230184 rename, whitespace, indent >>>>>>> and comments >>>>>>> >>>>>>>>>>> changes in preparation >>>>>>> >>>>>>>>>>> ? ? ??????????? for lock free Monitor lists >>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >>>> 8230184 >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> ??? JDK-8230317 >>>>>>> serviceability/sa/ClhsdbPrintStatics.java >>>>>>> >>>>>>>>>>> fails after 8230184 >>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >>>> 8230317 >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> I have attached the list of fixes from CR5 >>>>>>> to CR6 instead of >>>>>>> >>>>>>>>>>> putting >>>>>>> >>>>>>>>>>> in the main body of this email. >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong >>>>>>> safepoints >>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK- >>>> 8153224 >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> The project is currently baselined on >>>>>>> jdk-14+11 plus the >>>>>>> >>>>>>>>>>> fixes for >>>>>>> >>>>>>>>>>> JDK-8230184 and JDK-8230317. 
>>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> Here's the full webrev URL for those folks >>>>>>> that want to see >>>>>>> >>>>>>>>>>> all of the >>>>>>> >>>>>>>>>>> current Async Monitor Deflation code in one >>>>>>> go (v2.06 full): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.06.full/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> The primary focus of this review cycle is on >>>>>>> the lock-free >>>>>>> >>>>>>>>>>> Monitor List >>>>>>> >>>>>>>>>>> management changes so here's a webrev for >>>>>>> just that patch >>>>>>> >>>>>>>>>>> (v2.06c): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.06c.inc/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> The secondary focus of this review cycle is >>>>>>> on the bug fixes >>>>>>> >>>>>>>>>>> that have >>>>>>> >>>>>>>>>>> been made since CR5/v2.05/8-for-jdk13 so >>>>>>> here's a webrev for >>>>>>> >>>>>>>>>>> just that >>>>>>> >>>>>>>>>>> patch (v2.06b): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.06b.inc/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> The third and final bucket for this review >>>>>>> cycle is the >>>>>>> >>>>>>>>>>> rename, whitespace, >>>>>>> >>>>>>>>>>> indent and comments changes made in >>>>>>> preparation for lock >>>>>>> >>>>>>>>>>> free Monitor list >>>>>>> >>>>>>>>>>> management. Almost all of that was extracted >>>>>>> into >>>>>>> >>>>>>>>>>> JDK-8230184 for the >>>>>>> >>>>>>>>>>> baseline so this bucket now has just a few >>>>>>> comment changes >>>>>>> >>>>>>>>>>> relative to >>>>>>> >>>>>>>>>>> CR5/v2.05/8-for-jdk13. 
Here's a webrev for >>>>>>> the remainder >>>>>>> >>>>>>>>>>> (v2.06a): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.06a.inc/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> Some folks might want to see just what has >>>>>>> changed since the >>>>>>> >>>>>>>>>>> last review >>>>>>> >>>>>>>>>>> cycle so here's a webrev for that (v2.06 inc): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.06.inc/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> Last, but not least, some folks might want >>>>>>> to see the code >>>>>>> >>>>>>>>>>> before the >>>>>>> >>>>>>>>>>> addition of lock-free Monitor List >>>>>>> management so here's a >>>>>>> >>>>>>>>>>> webrev for >>>>>>> >>>>>>>>>>> that (v2.00 -> v2.05): >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/9-for- >>>> jdk14.v2.05.inc/ >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> The OpenJDK wiki will need minor updates to >>>>>>> match the CR6 >>>>>>> >>>>>>>>>>> changes: >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> but that should only be changes to describe >>>>>>> per-thread list >>>>>>> >>>>>>>>>>> async monitor >>>>>>> >>>>>>>>>>> deflation being done by the ServiceThread. >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> (I did update the OpenJDK wiki for the CR5 >>>>>>> changes back on >>>>>>> >>>>>>>>>>> 2019.08.14) >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>>> This version of the patch has been thru >>>>>>> Mach5 tier[1-8] >>>>>>> >>>>>>>>>>> testing on >>>>>>> >>>>>>>>>>> Oracle's usual set of platforms. 
>>>>>>> >>>>>>>>>>> It has also been through my usual set
>>>>>>> >>>>>>>>>>> of stress testing on Linux-X64, macOSX and Solaris-X64.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> I did a bunch of SPECjbb2015 testing in Oracle's Aurora Performance lab
>>>>>>> >>>>>>>>>>> using their tuned SPECjbb2015 Linux-X64 G1 configs. This was using
>>>>>>> >>>>>>>>>>> this patch baselined on jdk-13+31 (for stability):
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>          hbIR          hbIR
>>>>>>> >>>>>>>>>>>     (max attempted)  (settled)  max-jOPS  critical-jOPS  runtime
>>>>>>> >>>>>>>>>>>     ---------------  ---------  --------  -------------  -------
>>>>>>> >>>>>>>>>>>            34282.00   28837.20  27905.20       19817.40  3658.10  base
>>>>>>> >>>>>>>>>>>            34965.70   29798.80  27814.90       19959.00  3514.60  v2.06d
>>>>>>> >>>>>>>>>>>            34282.00   29100.70  28042.50       19577.00  3701.90  v2.06d_off
>>>>>>> >>>>>>>>>>>            34282.00   29218.50  27562.80       19397.30  3657.60  v2.06d_ocache
>>>>>>> >>>>>>>>>>>            34965.70   29838.30  26512.40       19170.60  3569.90  v2.05
>>>>>>> >>>>>>>>>>>            34282.00   28926.10  27734.00       19835.10  3588.40  v2.05_off
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> The "off" configs are with -XX:-AsyncDeflateIdleMonitors specified and
>>>>>>> >>>>>>>>>>> the "ocache" config is with 128 byte cache line sizes instead of 64 byte
>>>>>>> >>>>>>>>>>> cache line sizes. "v2.06d" is the last set of changes that I made before
>>>>>>> >>>>>>>>>>> those changes were distributed into the "v2.06a", "v2.06b" and "v2.06c"
>>>>>>> >>>>>>>>>>> buckets for this review cycle.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> Dan
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> On 7/11/19 3:49 PM, Daniel D. Daugherty wrote:
>>>>>>> >>>>>>>>>>>> Greetings,
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> I've been focused on chasing down and fixing the test failures
>>>>>>> >>>>>>>>>>>> that only pop up rarely. So this round is primarily fixes for races
>>>>>>> >>>>>>>>>>>> with a few additional fixes that came from Karen's review of CR4.
>>>>>>> >>>>>>>>>>>> Thanks Karen!
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> I have attached the list of fixes from CR4 to CR5 instead of putting
>>>>>>> >>>>>>>>>>>> it in the main body of this email.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Main bug URL:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> >>>>>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> The project is currently baselined on jdk-13+29. This will likely be
>>>>>>> >>>>>>>>>>>> the last JDK13 baseline for this project and I'll roll to the JDK14
>>>>>>> >>>>>>>>>>>> (jdk/jdk) repo soon...
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Here's the full webrev URL:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.full/
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Here's the incremental webrev URL:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/8-for-jdk13.inc/
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> I have not yet checked the OpenJDK wiki to see if it needs any updates
>>>>>>> >>>>>>>>>>>> to match the CR5 changes:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> (I did update the OpenJDK wiki for the CR4 changes back on 2019.06.26)
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> This version of the patch has been thru Mach5 tier[1-3] testing on
>>>>>>> >>>>>>>>>>>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>>>>>>> >>>>>>>>>>>> Mach5 tier[78] will follow. I'll kick off the usual stress testing
>>>>>>> >>>>>>>>>>>> on Linux-X64, macOSX and Solaris-X64 as those machines become available.
>>>>>>> >>>>>>>>>>>> Since I haven't made any performance changes in this round, I'll only
>>>>>>> >>>>>>>>>>>> be running SPECjbb2015 to gather the latest monitorinflation logs.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Next up:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> - We're still seeing 4-5% lower performance with SPECjbb2015 on
>>>>>>> >>>>>>>>>>>>   Linux-X64 and we've determined that some of that comes from
>>>>>>> >>>>>>>>>>>>   contention on the gListLock. So I'm going to investigate removing
>>>>>>> >>>>>>>>>>>>   the gListLock. Yes, another lock free set of changes is coming!
>>>>>>> >>>>>>>>>>>> - Of course, going lock free often causes new races and new failures
>>>>>>> >>>>>>>>>>>>   so that's a good reason to make those changes isolated in their
>>>>>>> >>>>>>>>>>>>   own round (and not holding up CR5/v2.05/8-for-jdk13 anymore).
>>>>>>> >>>>>>>>>>>> - I finally have a potential fix for the Win* failure with
>>>>>>> >>>>>>>>>>>>   gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>>>>> >>>>>>>>>>>>   but I haven't run it through Mach5 yet so it'll be in the next round.
>>>>>>> >>>>>>>>>>>> - Some RTM tests were recently re-enabled in Mach5 and I'm seeing some
>>>>>>> >>>>>>>>>>>>   monitor related failures there. I suspect that I need to go take a
>>>>>>> >>>>>>>>>>>>   look at the C2 RTM macro assembler code and look for things that might
>>>>>>> >>>>>>>>>>>>   conflict with Async Monitor Deflation. If you're interested in that kind
>>>>>>> >>>>>>>>>>>>   of issue, then see the macroAssembler_x86.cpp sanity check that I
>>>>>>> >>>>>>>>>>>>   added in this round!
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Dan
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> On 5/26/19 8:30 PM, Daniel D. Daugherty wrote:
>>>>>>> >>>>>>>>>>>>> Greetings,
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> I have a fix for an issue that came up during performance testing.
>>>>>>> >>>>>>>>>>>>> Many thanks to Robbin for diagnosing the issue in his SPECjbb2015
>>>>>>> >>>>>>>>>>>>> experiments.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Here's the list of changes from CR3 to CR4. The list is a bit
>>>>>>> >>>>>>>>>>>>> verbose due to the complexity of the issue, but the changes
>>>>>>> >>>>>>>>>>>>> themselves are not that big.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Functional:
>>>>>>> >>>>>>>>>>>>>   - Change SafepointSynchronize::is_cleanup_needed() from calling
>>>>>>> >>>>>>>>>>>>>     ObjectSynchronizer::is_cleanup_needed() to calling
>>>>>>> >>>>>>>>>>>>>     ObjectSynchronizer::is_safepoint_deflation_needed():
>>>>>>> >>>>>>>>>>>>>     - is_safepoint_deflation_needed() returns the result of
>>>>>>> >>>>>>>>>>>>>       monitors_used_above_threshold() for safepoint based
>>>>>>> >>>>>>>>>>>>>       monitor deflation (!AsyncDeflateIdleMonitors).
>>>>>>> >>>>>>>>>>>>>     - For AsyncDeflateIdleMonitors, it only returns true if
>>>>>>> >>>>>>>>>>>>>       there is a special deflation request, e.g., System.gc()
>>>>>>> >>>>>>>>>>>>>       - This solves a bug where there are a bunch of Cleanup
>>>>>>> >>>>>>>>>>>>>         safepoints that simply request async deflation which
>>>>>>> >>>>>>>>>>>>>         keeps the async JavaThreads from making progress on
>>>>>>> >>>>>>>>>>>>>         their async deflation work.
>>>>>>> >>>>>>>>>>>>>   - Add AsyncDeflationInterval diagnostic option. Description:
>>>>>>> >>>>>>>>>>>>>       Async deflate idle monitors every so many milliseconds when
>>>>>>> >>>>>>>>>>>>>       MonitorUsedDeflationThreshold is exceeded (0 is off).
>>>>>>> >>>>>>>>>>>>>   - Replace ObjectSynchronizer::gOmShouldDeflateIdleMonitors() with
>>>>>>> >>>>>>>>>>>>>     ObjectSynchronizer::is_async_deflation_needed():
>>>>>>> >>>>>>>>>>>>>     - is_async_deflation_needed() returns true when
>>>>>>> >>>>>>>>>>>>>       is_async_cleanup_requested() is true or when
>>>>>>> >>>>>>>>>>>>>       monitors_used_above_threshold() is true (but no more
>>>>>>> >>>>>>>>>>>>>       often than AsyncDeflationInterval).
>>>>>>> >>>>>>>>>>>>>     - if AsyncDeflateIdleMonitors Service_lock->wait() now waits for
>>>>>>> >>>>>>>>>>>>>       at most GuaranteedSafepointInterval millis:
>>>>>>> >>>>>>>>>>>>>       - This allows is_async_deflation_needed() to be checked at
>>>>>>> >>>>>>>>>>>>>         the same interval as GuaranteedSafepointInterval.
>>>>>>> >>>>>>>>>>>>>         (default is 1000 millis/1 second)
>>>>>>> >>>>>>>>>>>>>       - Once is_async_deflation_needed() has returned true, it
>>>>>>> >>>>>>>>>>>>>         generally cannot return true for AsyncDeflationInterval.
>>>>>>> >>>>>>>>>>>>>         This is to prevent async deflation from swamping the
>>>>>>> >>>>>>>>>>>>>         ServiceThread.
>>>>>>> >>>>>>>>>>>>>   - The ServiceThread still handles async deflation of the global
>>>>>>> >>>>>>>>>>>>>     in-use list and now it also marks JavaThreads for async deflation
>>>>>>> >>>>>>>>>>>>>     of their in-use lists.
>>>>>>> >>>>>>>>>>>>>     - The ServiceThread will check for async deflation work every
>>>>>>> >>>>>>>>>>>>>       GuaranteedSafepointInterval.
>>>>>>> >>>>>>>>>>>>>     - A safepoint can still cause the ServiceThread to check for
>>>>>>> >>>>>>>>>>>>>       async deflation work via is_async_deflation_requested.
>>>>>>> >>>>>>>>>>>>>   - Refactor code from ObjectSynchronizer::is_cleanup_needed() into
>>>>>>> >>>>>>>>>>>>>     monitors_used_above_threshold() and remove is_cleanup_needed().
>>>>>>> >>>>>>>>>>>>>   - In addition to System.gc(), the VM_Exit VM op and the final
>>>>>>> >>>>>>>>>>>>>     VMThread safepoint now set the is_special_deflation_requested
>>>>>>> >>>>>>>>>>>>>     flag to reduce the in-use monitor population that is reported by
>>>>>>> >>>>>>>>>>>>>     ObjectSynchronizer::log_in_use_monitor_details() at VM exit.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Test update:
>>>>>>> >>>>>>>>>>>>>   - test/hotspot/gtest/oops/test_markOop.cpp is updated to work with
>>>>>>> >>>>>>>>>>>>>     AsyncDeflateIdleMonitors.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Collateral:
>>>>>>> >>>>>>>>>>>>>   - Add/clarify/update some logging messages.
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Cleanup:
>>>>>>> >>>>>>>>>>>>>   - Updated comments based on Karen's code review.
>>>>>>> >>>>>>>>>>>>>   - Change 'special cleanup' -> 'special deflation' and
>>>>>>> >>>>>>>>>>>>>     'async cleanup' -> 'async deflation'.
>>>>>>> >>>>>>>>>>>>>     - comment and function name changes
>>>>>>> >>>>>>>>>>>>>   - Clarify MonitorUsedDeflationThreshold description;
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Main bug URL:
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>>>> >>>>>>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> The project is currently baselined on jdk-13+22.
>>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> Here's the full webrev URL: >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- >>>> jdk13.full/ >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> Here's the incremental webrev URL: >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/7-for- >>>> jdk13.inc/ >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> I have not updated the OpenJDK wiki to >>>>>>> reflect the CR4 >>>>>>> >>>>>>>>>>>>> changes: >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> The wiki doesn't say a whole lot about the >>>>>>> async deflation >>>>>>> >>>>>>>>>>>>> invocation >>>>>>> >>>>>>>>>>>>> mechanism so I have to figure out how to >>>>>>> add that content. >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> This version of the patch has been thru >>>>>>> Mach5 tier[1-8] >>>>>>> >>>>>>>>>>>>> testing on >>>>>>> >>>>>>>>>>>>> Oracle's usual set of platforms. My >>>>>>> Solaris-X64 stress kit >>>>>>> >>>>>>>>>>>>> run is >>>>>>> >>>>>>>>>>>>> running now. Kitchensink8H on product, >>>>>>> fastdebug, and >>>>>>> >>>>>>>>>>>>> slowdebug bits >>>>>>> >>>>>>>>>>>>> are running on Linux-X64, MacOSX and >>>>>>> Solaris-X64. I still >>>>>>> >>>>>>>>>>>>> have to run >>>>>>> >>>>>>>>>>>>> my stress kit on Linux-X64. I still have >>>>>>> to run the >>>>>>> >>>>>>>>>>>>> SPECjbb2015 >>>>>>> >>>>>>>>>>>>> baseline and CR4 runs on Linux-X64, MacOSX >>>>>>> and Solaris-X64. >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or >>>>>>> >>>>>>>>>>>>> suggestions. >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> Dan >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> On 5/6/19 11:52 AM, Daniel D. 
Daugherty >> wrote: >>>>>>> >>>>>>>>>>>>>> Greetings, >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> I had some discussions with Karen about a >>>>>>> race that was >>>>>>> >>>>>>>>>>>>>> in the >>>>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() code in >>>>>>> CR2/v2.02/5-for-jdk13. >>>>>>> >>>>>>>>>>>>>> This race was >>>>>>> >>>>>>>>>>>>>> theoretical and I had no test failures >>>>>>> due to it. The fix >>>>>>> >>>>>>>>>>>>>> is pretty >>>>>>> >>>>>>>>>>>>>> simple: remove the special case code for >>>>>>> async deflation >>>>>>> >>>>>>>>>>>>>> in the >>>>>>> >>>>>>>>>>>>>> ObjectMonitor::enter() function and rely >>>>>>> solely on the >>>>>>> >>>>>>>>>>>>>> ref_count >>>>>>> >>>>>>>>>>>>>> for ObjectMonitor::enter() protection. >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> During those discussions Karen also >>>>>>> floated the idea of >>>>>>> >>>>>>>>>>>>>> using the >>>>>>> >>>>>>>>>>>>>> ref_count field instead of the >>>>>>> contentions field for the >>>>>>> >>>>>>>>>>>>>> Async >>>>>>> >>>>>>>>>>>>>> Monitor Deflation protocol. I decided to >>>>>>> go ahead and >>>>>>> >>>>>>>>>>>>>> code up that >>>>>>> >>>>>>>>>>>>>> change and I have run it through the >>>>>>> usual stress and >>>>>>> >>>>>>>>>>>>>> Mach5 testing >>>>>>> >>>>>>>>>>>>>> with no issues. It's also known as v2.03 >>>>>>> (for those for >>>>>>> >>>>>>>>>>>>>> with the >>>>>>> >>>>>>>>>>>>>> patches) and as webrev/6-for-jdk13 (for >>>>>>> those with webrev >>>>>>> >>>>>>>>>>>>>> URLs). >>>>>>> >>>>>>>>>>>>>> Sorry for all the names... >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> ??? JDK-8153224 Monitor deflation prolong >>>>>>> safepoints >>>>>>> >>>>>>>>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> The project is currently baselined on >>>>>>> jdk-13+18. 
>>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> Here's the full webrev URL: >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- >>>> jdk13.full/ >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> Here's the incremental webrev URL: >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/6-for- >>>> jdk13.inc/ >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> I have also updated the OpenJDK wiki to >>>>>>> reflect the CR3 >>>>>>> >>>>>>>>>>>>>> changes: >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> This version of the patch has been thru >>>>>>> Mach5 tier[1-8] >>>>>>> >>>>>>>>>>>>>> testing on >>>>>>> >>>>>>>>>>>>>> Oracle's usual set of platforms. My >>>>>>> Solaris-X64 stress >>>>>>> >>>>>>>>>>>>>> kit run had >>>>>>> >>>>>>>>>>>>>> no issues. Kitchensink8H on product, >>>>>>> fastdebug, and >>>>>>> >>>>>>>>>>>>>> slowdebug bits >>>>>>> >>>>>>>>>>>>>> had no failures on Linux-X64; MacOSX >>>>>>> fastdebug and >>>>>>> >>>>>>>>>>>>>> slowdebug and >>>>>>> >>>>>>>>>>>>>> Solaris-X64 release had the usual "Too >>>>>>> large time diff" >>>>>>> >>>>>>>>>>>>>> complaints. >>>>>>> >>>>>>>>>>>>>> 12 hour Inflate2 runs on product, >>>>>>> fastdebug and slowdebug >>>>>>> >>>>>>>>>>>>>> bits on >>>>>>> >>>>>>>>>>>>>> Linux-X64, MacOSX and Solaris-X64 had no >>>>>>> failures. My >>>>>>> >>>>>>>>>>>>>> Linux-X64 >>>>>>> >>>>>>>>>>>>>> stress kit is running right now. >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> I've done the SPECjbb2015 baseline and >>>>>>> CR3 runs. I need >>>>>>> >>>>>>>>>>>>>> to gather >>>>>>> >>>>>>>>>>>>>> the results and analyze them. 
>>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or >>>>>>> >>>>>>>>>>>>>> suggestions. >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> Dan >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> On 4/25/19 12:38 PM, Daniel D. Daugherty >>>>>>> wrote: >>>>>>> >>>>>>>>>>>>>>> Greetings, >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> I have a small but important bug fix for >>>>>>> the Async >>>>>>> >>>>>>>>>>>>>>> Monitor Deflation >>>>>>> >>>>>>>>>>>>>>> project ready to go. It's also known as >>>>>>> v2.02 (for those >>>>>>> >>>>>>>>>>>>>>> for with the >>>>>>> >>>>>>>>>>>>>>> patches) and as webrev/5-for-jdk13 (for >>>>>>> those with >>>>>>> >>>>>>>>>>>>>>> webrev URLs). Sorry >>>>>>> >>>>>>>>>>>>>>> for all the names... >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> JDK-8222295 was pushed to jdk/jdk two >>>>>>> days ago so that >>>>>>> >>>>>>>>>>>>>>> baseline patch >>>>>>> >>>>>>>>>>>>>>> is out of our hair. >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>>>> safepoints >>>>>>> >>>>>>>>>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> The project is currently baselined on >>>>>>> jdk-13+17. 
>>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> Here's the full webrev URL: >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- >>>> jdk13.full/ >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> Here's the incremental webrev URL >>>>>>> (JDK-8153224): >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for- >>>> jdk13.inc/ >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> I still have to update the OpenJDK wiki >>>>>>> to reflect the >>>>>>> >>>>>>>>>>>>>>> CR2 changes: >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> This version of the patch has been thru >>>>>>> Mach5 tier[1-6] >>>>>>> >>>>>>>>>>>>>>> testing on >>>>>>> >>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 >>>>>>> tier[7-8] is >>>>>>> >>>>>>>>>>>>>>> running now. >>>>>>> >>>>>>>>>>>>>>> My stress kit is running on Solaris-X64 >>>>>>> now. >>>>>>> >>>>>>>>>>>>>>> Kitchensink8H is running >>>>>>> >>>>>>>>>>>>>>> now on product, fastdebug, and >> slowdebug >>>>>>> bits on >>>>>>> >>>>>>>>>>>>>>> Linux-X64, MacOSX >>>>>>> >>>>>>>>>>>>>>> and Solaris-X64. 12 hour Inflate2 runs >>>>>>> are running now >>>>>>> >>>>>>>>>>>>>>> on product, >>>>>>> >>>>>>>>>>>>>>> fastdebug and slowdebug bits on >>>>>>> Linux-X64, MacOSX and >>>>>>> >>>>>>>>>>>>>>> Solaris-X64. >>>>>>> >>>>>>>>>>>>>>> I'll start my my stress kit on Linux-X64 >>>>>>> sometime on >>>>>>> >>>>>>>>>>>>>>> Sunday (after >>>>>>> >>>>>>>>>>>>>>> my jdk-13+18 stress run is done). >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> I'll do SPECjbb2015 baseline and CR2 >>>>>>> runs after all the >>>>>>> >>>>>>>>>>>>>>> stress >>>>>>> >>>>>>>>>>>>>>> testing is done. 
>>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or >>>>>>> >>>>>>>>>>>>>>> suggestions. >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> Dan >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> On 4/19/19 11:58 AM, Daniel D. Daugherty >>>>>>> wrote: >>>>>>> >>>>>>>>>>>>>>>> Greetings, >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> I finally have CR1 for the Async >>>>>>> Monitor Deflation >>>>>>> >>>>>>>>>>>>>>>> project ready to >>>>>>> >>>>>>>>>>>>>>>> go. It's also known as v2.01 (for those >>>>>>> for with the >>>>>>> >>>>>>>>>>>>>>>> patches) and as >>>>>>> >>>>>>>>>>>>>>>> webrev/4-for-jdk13 (for those with >>>>>>> webrev URLs). Sorry >>>>>>> >>>>>>>>>>>>>>>> for all the >>>>>>> >>>>>>>>>>>>>>>> names... >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Main bug URL: >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>>>> safepoints >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Baseline bug fixes URL: >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> JDK-8222295 more baseline cleanups from >>>>>>> Async >>>>>>> >>>>>>>>>>>>>>>> Monitor Deflation project >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222295 >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> The project is currently baselined on >>>>>>> jdk-13+15. 
>>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Here's the webrev for the latest >>>>>>> baseline changes >>>>>>> >>>>>>>>>>>>>>>> (JDK-8222295): >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >>>> jdk13.8222295 >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Here's the full webrev URL (JDK-8153224 >>>>>>> only): >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >>>> jdk13.full/ >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Here's the incremental webrev URL >>>>>>> (JDK-8153224): >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for- >>>> jdk13.inc/ >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> So I'm looking for reviews for both >>>>>>> JDK-8222295 and the >>>>>>> >>>>>>>>>>>>>>>> latest version >>>>>>> >>>>>>>>>>>>>>>> of JDK-8153224... >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> I still have to update the OpenJDK wiki >>>>>>> to reflect the >>>>>>> >>>>>>>>>>>>>>>> CR changes: >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> This version of the patch has been thru >>>>>>> Mach5 tier[1-3] >>>>>>> >>>>>>>>>>>>>>>> testing on >>>>>>> >>>>>>>>>>>>>>>> Oracle's usual set of platforms. Mach5 >>>>>>> tier[4-6] is >>>>>>> >>>>>>>>>>>>>>>> running now and >>>>>>> >>>>>>>>>>>>>>>> Mach5 tier[78] will be run later today. >>>>>>> My stress kit >>>>>>> >>>>>>>>>>>>>>>> on Solaris-X64 >>>>>>> >>>>>>>>>>>>>>>> is running now. Linux-X64 stress >>>>>>> testing will start on >>>>>>> >>>>>>>>>>>>>>>> Sunday. 
I'm >>>>>>> >>>>>>>>>>>>>>>> planning to do Kitchensink runs, >>>>>>> SPECjbb2015 runs and >>>>>>> >>>>>>>>>>>>>>>> my monitor >>>>>>> >>>>>>>>>>>>>>>> inflation stress tests on Linux-X64, >>>>>>> MacOSX and >>>>>>> >>>>>>>>>>>>>>>> Solaris-X64. >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or >>>>>>> >>>>>>>>>>>>>>>> suggestions. >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> Dan >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> On 3/24/19 9:57 AM, Daniel D. Daugherty >>>>>>> wrote: >>>>>>> >>>>>>>>>>>>>>>>> Greetings, >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Welcome to the OpenJDK review thread >>>>>>> for my port of >>>>>>> >>>>>>>>>>>>>>>>> Carsten's work on: >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> JDK-8153224 Monitor deflation prolong >>>>>>> safepoints >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Here's a link to the OpenJDK wiki that >>>>>>> describes my port: >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Here's the webrev URL: >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for- >>>> jdk13/ >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Here's a link to Carsten's original >>>>>>> webrev: >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Earlier versions of this patch have >>>>>>> been through >>>>>>> >>>>>>>>>>>>>>>>> several rounds of >>>>>>> >>>>>>>>>>>>>>>>> preliminary review. 
Many thanks to >>>>>>> Carsten, Coleen, >>>>>>> >>>>>>>>>>>>>>>>> Robbin, and >>>>>>> >>>>>>>>>>>>>>>>> Roman for their preliminary code >>>>>>> review comments. A >>>>>>> >>>>>>>>>>>>>>>>> very special >>>>>>> >>>>>>>>>>>>>>>>> thanks to Robbin and Roman for >>>>>>> building and testing >>>>>>> >>>>>>>>>>>>>>>>> the patch in >>>>>>> >>>>>>>>>>>>>>>>> their own environments (including >>>>>>> specJBB2015). >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> This version of the patch has been >>>>>>> thru Mach5 >>>>>>> >>>>>>>>>>>>>>>>> tier[1-8] testing on >>>>>>> >>>>>>>>>>>>>>>>> Oracle's usual set of platforms. >>>>>>> Earlier versions have >>>>>>> >>>>>>>>>>>>>>>>> been run >>>>>>> >>>>>>>>>>>>>>>>> through my stress kit on my Linux-X64 >>>>>>> and Solaris-X64 >>>>>>> >>>>>>>>>>>>>>>>> servers >>>>>>> >>>>>>>>>>>>>>>>> (product, fastdebug, >>>>>>> slowdebug).Earlier versions have >>>>>>> >>>>>>>>>>>>>>>>> run Kitchensink >>>>>>> >>>>>>>>>>>>>>>>> for 12 hours on MacOSX, Linux-X64 and >>>>>>> Solaris-X64 >>>>>>> >>>>>>>>>>>>>>>>> (product, fastdebug >>>>>>> >>>>>>>>>>>>>>>>> and slowdebug). Earlier versions have >>>>>>> run my monitor >>>>>>> >>>>>>>>>>>>>>>>> inflation stress >>>>>>> >>>>>>>>>>>>>>>>> tests for 12 hours on MacOSX, >>>>>>> Linux-X64 and >>>>>>> >>>>>>>>>>>>>>>>> Solaris-X64 (product, >>>>>>> >>>>>>>>>>>>>>>>> fastdebug and slowdebug). >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> All of the testing done on earlier >>>>>>> versions will be >>>>>>> >>>>>>>>>>>>>>>>> redone on the >>>>>>> >>>>>>>>>>>>>>>>> latest version of the patch. >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Thanks, in advance, for any questions, >>>>>>> comments or >>>>>>> >>>>>>>>>>>>>>>>> suggestions. >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> Dan >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>> P.S. 
>>>>>>> >>>>>>>>>>>>>>>>> One subtest in >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>>>> >>>>>>>>>>>>>>>>> is currently failing in -Xcomp mode on >>>>>>> Win* only. I've >>>>>>> >>>>>>>>>>>>>>>>> been trying >>>>>>> >>>>>>>>>>>>>>>>> to characterize/analyze this failure >>>>>>> for more than a >>>>>>> >>>>>>>>>>>>>>>>> week now. At >>>>>>> >>>>>>>>>>>>>>>>> this point I'm convinced that Async >>>>>>> Monitor Deflation >>>>>>> >>>>>>>>>>>>>>>>> is aggravating >>>>>>> >>>>>>>>>>>>>>>>> an existing bug. However, I plan to >>>>>>> have a better >>>>>>> >>>>>>>>>>>>>>>>> handle on that >>>>>>> >>>>>>>>>>>>>>>>> failure before these bits are pushed >>>>>>> to the jdk/jdk repo. >>>>>>> >>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>>> >>>>> >>>>>>> >>>> >>>>>>> >>> >>>>>>> >> >>>>>>> From minqi at openjdk.java.net Tue Sep 15 17:12:40 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 15 Sep 2020 17:12:40 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: <0TAHuutonD-rpLrHLWemVe9q2FIbuClG5aSXxDnEY4U=.2a2a2172-9bf8-4742-b7e7-f3254d01a3b1@github.com> On Tue, 15 Sep 2020 14:37:44 GMT, Zhengyu Gu wrote: > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g. 
ConcurrentGCThread, thread may exit while its "Thread" object continues
> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
> post_run() method.

Looks good.

-------------

Marked as reviewed by minqi (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/185

From ccheung at openjdk.java.net Tue Sep 15 19:34:12 2020
From: ccheung at openjdk.java.net (Calvin Cheung)
Date: Tue, 15 Sep 2020 19:34:12 GMT
Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive
In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
Message-ID:

On Tue, 15 Sep 2020 14:37:44 GMT, Zhengyu Gu wrote:

> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor
> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues
> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
> post_run() method.

Thanks for the fix. I've done some testing on the patch. It passed tier1 and also passed running the
appcds/dynamicArchive/methodHandles/MethodHandlesAsCollectorTest.java test 30 times on Windows and Linux.

One question: why the JavaThread::post_run doesn't need to set the following?

    set_stack_base(NULL);
    set_stack_size(0);

-------------

Marked as reviewed by ccheung (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/185

From zgu at openjdk.java.net Tue Sep 15 20:18:59 2020
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Tue, 15 Sep 2020 20:18:59 GMT
Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive
In-Reply-To:
References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
Message-ID:

On Tue, 15 Sep 2020 19:31:35 GMT, Calvin Cheung wrote:

> Thanks for the fix. I've done some testing on the patch. It passed tier1 and also passed running the
> appcds/dynamicArchive/methodHandles/MethodHandlesAsCollectorTest.java test 30 times on Windows and Linux.
>
> One question: why the JavaThread::post_run doesn't need to set the following?
>
>     set_stack_base(NULL);
>     set_stack_size(0);

Thanks for reviewing, Calvin.

The last statement of JavaThread::post_run() deletes the 'thread' object, so there is no point in resetting its state here. A NonJavaThread object, however, lives past the lifespan of the actual thread, so its state does matter, I believe.

-------------

PR: https://git.openjdk.java.net/jdk/pull/185

From serguei.spitsyn at oracle.com Tue Sep 15 20:28:50 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 15 Sep 2020 13:28:50 -0700
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <682ee88d-097a-df57-7374-b3413b7964fd@oracle.com> <3ae58a8e-405a-d98c-79c5-c6a0bdf5cc27@oracle.com> <96ad21a3-cae4-2218-b047-6912e6a07b21@oracle.com>
Message-ID: <0686c4e5-ee04-9da7-e88e-a6730d69c6a9@oracle.com>

Hi Richard,

This is on my review list. I'll try to get it reviewed by the end of this week.

Thanks,
Serguei

On 9/8/20 10:02, Reingruber, Richard wrote:
> Hello Marty,
>
> Sure. I'd be happy if Serguei could review the change.
>
> Thanks, Richard.
>
> -----Original Message-----
> From: Marty Thompson
> Sent: Dienstag, 8.
September 2020 18:55 > To: Reingruber, Richard ; Daniel Daugherty ; serviceability-dev ; hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hello Richard, > > It would be good if Serguei Spitsyn could review before this is pushed. Serguei is out this week. Can you wait until Serguei is back in the office the week of Sept 14? > > Regards, > > Marty > >> -----Original Message----- >> From: Reingruber, Richard >> Sent: Tuesday, September 8, 2020 9:45 AM >> To: Daniel Daugherty ; serviceability-dev >> ; hotspot-compiler- >> dev at openjdk.java.net; Hotspot dev runtime > dev at openjdk.java.net> >> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >> >> Hi Dan, >> >> I'd be very happy about a review from somebody on the Serviceability team. >> I have asked for reviews many times (kindly I hope). And the change is for >> review for more than a year now. >> >> According to [1] I'd think all requirements to push are met already. But >> maybe I missed something? >> >> After renaming of methods in SafepointMechanism the change needs to be >> rebased (already done). I'll publish a pull request as soon as possible. >> >> Thanks, Richard. >> >> [1] >> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change >> >> -----Original Message----- >> From: Daniel D. Daugherty >> Sent: Dienstag, 8. September 2020 18:16 >> To: Reingruber, Richard ; serviceability-dev >> ; hotspot-compiler- >> dev at openjdk.java.net; Hotspot dev runtime > dev at openjdk.java.net> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance >> in the Presence of JVMTI Agents >> >> Hi Richard, >> >> I haven't seen a review from anyone on the Serviceability team and I think >> you should get a review from them since JVM/TI is involved. >> Perhaps I missed it... 
>> >> Dan >> >> >> On 9/7/20 10:09 AM, Reingruber, Richard wrote: >>> Hi, >>> >>> I would like to close the review of this change. >>> >>> It has received a lot of helpful feedback during the process and 2 >>> full Reviews. Thanks everybody! >>> >>> I'm planning to push it this week on Thursday as solution for JBS items: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>> https://bugs.openjdk.java.net/browse/JDK-8233915 >>> >>> Version to be pushed: >>> >>> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/ >>> >>> Hope to get my GIT/Skara setup going until then... :) >>> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: hotspot-compiler-dev >>> On Behalf Of Reingruber, >>> Richard >>> Sent: Mittwoch, 2. September 2020 23:27 >>> To: Robbin Ehn ; serviceability-dev >>> ; >>> hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime >>> >>> Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for >>> Better Performance in the Presence of JVMTI Agents >>> >>> Hi Robin, >>> >>>> On 2020-09-02 15:48, Reingruber, Richard wrote: >>>>> Hi Robbin, >>>>> >>>>> // taking the discussion back to the mailing lists >>>>> >>>>> > I still don't understand why you don't deoptimize the objects inside >> the >>>>> > handshake/safepoint instead? >>>> So for handshakes using asynch handshake and allowing blocking inside >>>> would fix that. (future fix, I'm working on that now) >>> Just to make it clear: I'm not fond of the extra suspension mechanism >>> currently used for JDK-8227745 either. I want to get rid of it and I >>> will work on it. Asynch handshakes (JDK-8238761) could be a >>> replacement for it. At least I think they can be used to suspend the target >> thread. >>>> For safepoint, since we have suspended all threads, ~'safepointed them' >>>> with a JavaThread, you _could_ just execute the action directly (e.g. 
>>>> skipping VM_HeapWalkOperation safepoint) since they are suppose to be >>>> safely suspended until the destructor of EB, no? >>> Yes, this should be possible. This would be an advanced change though. >>> I would like EscapeBarriers to be a no-op and fall back to current >>> implementation, if C2-EscapeAnalysis/Graal are disabled. >>> >>>> So I suggest future work to instead just execute the safepoint with >>>> the requesting JT instead of having a this special safepoiting mechanism. >>>> Since you are missing above functionality I see why you went this way. >>>> If you need to push it, it's fine by me. >>> We will work on further improvements. Top of the list would be >>> eliminating the extra suspend mechanism. >>> >>> The implementation has matured for more than 12 months now [1]. It's >>> been tested extensively at SAP over that time and passed also extended >>> testing at Oracle kindly conducted by Vladimir Kozlov. We've got two >>> full Reviews and incorporated extensive feedback from a number of >>> OpenJDK Reviewers (including you, thanks!). Based on that I reckon >>> we're good to push the change as enhancement >>> (JDK-8227745) and bug fix (JDK-8233915). >>> >>>> Thanks for explaining once again :) >>> Pleasure :) >>> >>> Thanks, Richard. >>> >>> [1] >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/02 >>> 8729.html >>> >>> -----Original Message----- >>> From: Robbin Ehn >>> Sent: Mittwoch, 2. September 2020 16:54 >>> To: Reingruber, Richard ; >>> serviceability-dev ; >>> hotspot-compiler-dev at openjdk.java.net; Hotspot dev runtime >>> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> On 2020-09-02 15:48, Reingruber, Richard wrote: >>>> Hi Robbin, >>>> >>>> // taking the discussion back to the mailing lists >>>> >>>> > I still don't understand why you don't deoptimize the objects inside >> the >>>> > handshake/safepoint instead? 
>>> So for handshakes using asynch handshake and allowing blocking inside
>>> would fix that. (future fix, I'm working on that now)
>>>
>>> For safepoint, since we have suspended all threads, ~'safepointed them'
>>> with a JavaThread, you _could_ just execute the action directly (e.g.
>>> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
>>> safely suspended until the destructor of EB, no?
>>>
>>> So I suggest future work to instead just execute the safepoint with
>>> the requesting JT instead of having a this special safepoiting mechanism.
>>>
>>> Since you are missing above functionality I see why you went this way.
>>> If you need to push it, it's fine by me.
>>>
>>> Thanks for explaining once again :)
>>>
>>> /Robbin
>>>
>>>> This is unfortunately not possible. Deoptimizing objects includes
>>>> reallocating scalar replaced objects, i.e. calling
>>>> Deoptimization::realloc_objects(). This cannot be done at a safepoint or
>>>> handshake.
>>>> 1. The vm thread is not allowed to allocate on the java heap
>>>>    See for instance assertions in ParallelScavengeHeap::mem_allocate()
>>>>
>>>>    https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258
>>>>
>>>>    This is not easy to change, I suppose, because it will be difficult to gc if
>>>>    necessary.
>>>>
>>>> 2. Using a direct handshake would not work either. The problem there is again
>>>>    gc. Let J be the JavaThread that is executing the direct handshake. The vm
>>>>    would deadlock if the vm thread waits for J to execute the closure of a
>>>>    handshake-all and J waits for the vm thread to execute a gc vm operation.
>>>>    Patricio Chilano made me aware of this:
>>>>    https://bugs.openjdk.java.net/browse/JDK-8230594
>>>>
>>>> Cheers, Richard.
>>>>
>>>> -----Original Message-----
>>>> From: Robbin Ehn
>>>> Sent: Mittwoch, 2. September 2020 13:56
>>>> To: Reingruber, Richard
>>>> Cc: Lindenmaier, Goetz; Vladimir Kozlov; David Holmes
>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>> Performance in the Presence of JVMTI Agents
>>>>
>>>> Hi,
>>>>
>>>> I still don't understand why you don't deoptimize the objects inside
>>>> the handshake/safepoint instead?
>>>>
>>>> E.g.
>>>>
>>>> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code from:
>>>> eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over
>>>> the stack, so:
>>>>
>>>> void
>>>> GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
>>>>   assert(target->is_Java_thread(), "just checking");
>>>>   JavaThread *jt = (JavaThread *)target;
>>>>
>>>>   if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
>>>> +   if (EscapeBarrier::deoptimize_objects(jt,
>>>> +       MaxJavaStackTraceDepth)) {
>>>>       _result =
>>>>         ((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt,
>>>>         _owned_monitors_list);
>>>>     } else {
>>>>       _result = JVMTI_ERROR_OUT_OF_MEMORY;
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> Why try 'suspend' the thread first?
>>>>
>>>>
>>>> When we de-optimize all threads why not just in the following safepoint?
>>>> E.g.
>>>> VM_HeapWalkOperation::doit() {
>>>> + EscapeBarrier::deoptimize_objects_all_threads();
>>>>   ...
>>>> }
>>>>
>>>> Thanks, Robbin
>>>>
>>>>

From yumin.qi at oracle.com Tue Sep 15 20:45:24 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Tue, 15 Sep 2020 13:45:24 -0700
Subject: RFR: 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive
Message-ID: <503de1cc-5513-68ad-80a9-cc8e12d59abb@oracle.com>

Hi all,

  Please review changes for 8247536: Support for pre-generated java.lang.invoke classes in CDS static archive.

  What happened:

  I pushed with commit comment:

    8247536: Support for pre-generated java.lang.invoke classes in CDS static archive

  When I created the pull request, the title was "jdk 8247536". I did not pay attention to it and assumed it would be filled in correctly, but it wasn't. jcheck failed when checking the title. After I changed the title to the correct form, no emails were sent out, so I am sending the link here. Please review via this link:

  https://github.com/openjdk/jdk/pull/193/files

  Thanks
  Yumin

From sspitsyn at openjdk.java.net Wed Sep 16 00:09:42 2020
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Wed, 16 Sep 2020 00:09:42 GMT
Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5]
In-Reply-To: <1rj4zO-L65NEG1ZyUdi3YyJR3A6AOTeb5cBsmVOiJ4E=.40e51e90-4008-46ea-a451-526144312035@github.com>
References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> <1rj4zO-L65NEG1ZyUdi3YyJR3A6AOTeb5cBsmVOiJ4E=.40e51e90-4008-46ea-a451-526144312035@github.com>
Message-ID:

On Mon, 14 Sep 2020 17:10:39 GMT, Daniel D. Daugherty wrote:

>> I don't see anything in the HPROF format description that claims this is a strong root. At a minimum this seems to be a
>> behavioural change that would warrant a CSR request. This also seems to be something that the serviceability folk
>> should be made aware of and have a chance to comment on.
>
> I've taken a first pass at creating a CSR:
> JDK-8253121 migrate ObjectMonitor::_object to OopStorage
> https://bugs.openjdk.java.net/browse/JDK-8253121

I've looked at the CSR and added myself as a reviewer.

We had a Slack chat with Dan, and I agree that with a weak handle it would be racy/unsafe for the JVMTI_HEAP_REFERENCE_MONITOR callbacks to be called. The ObjectMonitors do not pin objects anymore (there are no strong refs from them), so it has to be okay to skip reporting the JVMTI_HEAP_REFERENCE_MONITOR and JVMTI_HEAP_ROOT_MONITOR (old Heap Walking API) reference types. The JVMTI does not need an update as other VM implementations can still report these reference types.
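
As a rough illustration of the weak-handle point above, here is a minimal C++ sketch. It is only an analogy built on std::weak_ptr and is not HotSpot or OopStorage code: the names Object, Monitor and report_live_monitor_objects are hypothetical and exist only for this example. It shows that a holder with a weak reference does not keep its object alive, so a walker has to check the reference and skip cleared entries instead of reporting them as roots.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not HotSpot types.
struct Object {
    int id;
};

struct Monitor {
    // Weak reference: the monitor does not keep its object alive.
    std::weak_ptr<Object> object;
};

// Collect the ids of objects that are still alive. A monitor whose weak
// reference has been cleared is skipped rather than reported.
std::vector<int> report_live_monitor_objects(const std::vector<Monitor>& monitors) {
    std::vector<int> ids;
    for (const Monitor& m : monitors) {
        if (std::shared_ptr<Object> obj = m.object.lock()) {
            ids.push_back(obj->id);
        }
    }
    return ids;
}
```

In this sketch, dropping the last std::shared_ptr to an object clears the monitor's reference, and the walker then simply omits that entry. That mirrors, loosely, why skipping the monitor reference kinds is considered acceptable once the monitors no longer hold strong references.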
Alan added a comment to the CSR saying that memory profiling tools that use the JVMTI functions (FollowReferences with jvmtiHeapReferenceCallback or IterateOverReachableObjects with jvmtiHeapRootCallback) to iterate over the heap should not have a compatibility impact as these reference types are just informational but adding a release note can be useful. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From sspitsyn at openjdk.java.net Wed Sep 16 00:23:48 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 16 Sep 2020 00:23:48 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Mon, 14 Sep 2020 17:10:26 GMT, Daniel D. Daugherty wrote: >> From the spec I'm not clear on exactly what JVMTI_HEAP_REFERENCE_MONITOR is intended to be. Serviceability folk should >> be giving some input here though. > > I've taken a first pass at creating a CSR: > JDK-8253121 migrate ObjectMonitor::_object to OopStorage > https://bugs.openjdk.java.net/browse/JDK-8253121 Just a minor comment. 
Line 1754 can be deleted, as the JVMTI_HEAP_REFERENCE_MONITOR reference type will never be encountered:

1750 static jvmtiHeapRootKind toJvmtiHeapRootKind(jvmtiHeapReferenceKind kind) {
1751   switch (kind) {
1752     case JVMTI_HEAP_REFERENCE_JNI_GLOBAL:   return JVMTI_HEAP_ROOT_JNI_GLOBAL;
1753     case JVMTI_HEAP_REFERENCE_SYSTEM_CLASS: return JVMTI_HEAP_ROOT_SYSTEM_CLASS;
1754     case JVMTI_HEAP_REFERENCE_MONITOR:      return JVMTI_HEAP_ROOT_MONITOR;
1755     case JVMTI_HEAP_REFERENCE_STACK_LOCAL:  return JVMTI_HEAP_ROOT_STACK_LOCAL;
1756     case JVMTI_HEAP_REFERENCE_JNI_LOCAL:    return JVMTI_HEAP_ROOT_JNI_LOCAL;
1757     case JVMTI_HEAP_REFERENCE_THREAD:       return JVMTI_HEAP_ROOT_THREAD;
1758     case JVMTI_HEAP_REFERENCE_OTHER:        return JVMTI_HEAP_ROOT_OTHER;
1759     default: ShouldNotReachHere();          return JVMTI_HEAP_ROOT_OTHER;
1760   }
1761 }

Other than that the update in this file looks okay to me.

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From sspitsyn at openjdk.java.net Wed Sep 16 00:23:47 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 16 Sep 2020 00:23:47 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID:

On Mon, 14 Sep 2020 21:15:00 GMT, Daniel D. Daugherty wrote:

>> This RFE is to migrate the following field to OopStorage:
>>
>> class ObjectMonitor {
>>
>> void* volatile _object; // backward object pointer - strong root
>>
>> Unlike the previous patches in this series, there are a lot of collateral
>> changes so this is not a trivial review. Sorry for the tedious parts of
>> the review. Since Erik and I are both contributors to this patch, we
>> would like at least 1 GC team reviewer and 1 Runtime team reviewer.
>>
>> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing
>> along with JDK-8252980 and JDK-8252981. I also ran it through my
>> inflation stress kit for 48 hours on my Linux-X64 machine.
>
> Daniel D.
Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dholmes at openjdk.java.net Wed Sep 16 02:57:18 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 16 Sep 2020 02:57:18 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Tue, 15 Sep 2020 14:37:44 GMT, Zhengyu Gu wrote: > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues > alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > post_run() method. src/hotspot/share/runtime/thread.hpp line 762: > 760: public: > 761: // Stack overflow support > 762: address stack_base() const { return _stack_base; } Why did you remove the assertion? We want the assertion in general to ensure there are no improper uses of stack_base(). ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From dholmes at openjdk.java.net Wed Sep 16 07:00:55 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 16 Sep 2020 07:00:55 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. 
[v2] In-Reply-To: <_2RfxBOE39VhwtDZe2F2qLb52IfF_JiCWwE2cJsEuiM=.01bb1177-808a-45ea-a8bf-3dccfab6ea38@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> <_2RfxBOE39VhwtDZe2F2qLb52IfF_JiCWwE2cJsEuiM=.01bb1177-808a-45ea-a8bf-3dccfab6ea38@github.com> Message-ID: <8uyHRtbTr67w0rqGE-VS-SCGrD0uVnBNV7rURU6WZII=.424ce1f3-98fa-4bbe-b971-9df6fdef239b@github.com>

On Tue, 15 Sep 2020 10:59:06 GMT, Jamsheed Mohammed C M wrote:

>> Hi
>>
>> Moving the review that is based on mercurial repo to github.
>> The history of conversation is
>> [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html)
>> Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451)
>>
>> @dholmes-ora could you please have a look.
>
> Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision:
>
> removing unused definition load_class_by_index

Looks good to me! No further comments.

src/hotspot/share/runtime/thread.cpp line 2392:

> 2390: if (check_unsafe_error &&
> 2391: condition == _async_unsafe_access_error && !has_pending_exception()) {
> 2392: // May be we are at method entry and requires to save do not unlock flag.

Suggest:
// We may be at method entry which requires we save the do-not-unlock flag.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/169

From patricio.chilano.mateo at oracle.com Wed Sep 16 08:20:08 2020 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 16 Sep 2020 05:20:08 -0300 Subject: RFR: 8238761: Asynchronous handshakes In-Reply-To: References: Message-ID:

Hi Robbin,

Changes look good to me! Some minor comments:

src/hotspot/share/prims/jvmtiThreadState.cpp
222:   assert(current_thread == get_thread() ||
223:          SafepointSynchronize::is_at_safepoint() ||
224:          get_thread()->is_handshake_safe_for(current_thread),
225:          "call by myself / at safepoint / at handshake");
The extra current_thread == get_thread() check is already handled by is_handshake_safe_for().

src/hotspot/share/prims/jvmtiEnvBase.cpp
Same as above.

src/hotspot/share/runtime/handshake.cpp
198:     log_info(handshake)("Handshake \"%s\", Targeted threads: %d, Executed by targeted threads: %d, Total completion time: " JLONG_FORMAT " ns%s%s",
199:                         name, targets,
200:                         targets - vmt_executed,
In the calls to log_handshake_info() in VM_HandshakeAllThreads and Handshake::execute() we are passing as vmt_executed the number of handshakes that the driver executed, which could be more than the targets parameter. So the operation "targets - vmt_executed" to calculate the handshakes executed by the targets would no longer be valid. Personally I would just leave ProcessResult as an enum and log as before. We still have a log_trace() in try_process(), so that already keeps track of extra handshakes executed by the handshaker.

src/hotspot/share/runtime/handshake.cpp
387:     NoSafepointVerifier nsv;
388:     process_self_inner();
389:   }
Shouldn't NoSafepointVerifier be placed before the execution of the handshake closure so that we also cover the case when the handshake is executed by the handshaker? As in:
// Only actually execute the operation for non terminated threads.
if (!thread->is_terminated()) {
    NoSafepointVerifier nsv;
    _handshake_cl->do_thread(thread);
}

src/hotspot/share/runtime/interfaceSupport.inline.hpp
156:     // Threads shouldn't block if they are in the middle of printing, but...
157:     ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id());
What's the issue with having NoSafepointVerifier inside the handshake?

Thanks!
Patricio

On 9/15/20 4:39 AM, Robbin Ehn wrote:
> This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes
> are executed only by the target, which means they may never be executed.
(The target may block on a socket for the rest of the VM lifetime.)
> Since we have several use-cases for them, we can have many handshakes pending. (This should be very rare.) To be able to handle an
> arbitrary number of handshakes, this patch adds a per-JavaThread queue and heap-allocated HandshakeOperations. It's a
> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on the first
> pointer, no lock needed. Pops are done while holding the per-handshake-state lock, and when working on the first
> pointer also use CAS.
>
> The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake operations matching
> the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything except asynchronous
> handshakes. In this initial change-set there is no need to do any other filtering. If needed, filtering can easily be
> exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake operations to be done
> out of order. Since the filter determines who executes the operation and not the invoked method, there is now only one
> method to call when handshaking one thread.
>
> Some comments about the changes:
> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all JavaThreads do and Handshake
> all threads. With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much
> worse than this.
>
> - I added is_handshake_safe_for for checking whether the current thread is operating on itself or is the handshaker of that
> thread.
>
> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a
> handshaker on a JavaThread, e.g. the VM Thread can execute the handshake operation.
>
> - Added WB testing method.
>
> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did.
>
> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking.
>
> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a
> safepoint and half of them after, in many handshake all operations.
>
> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe.
>
> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the
> moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows.
>
> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to the NoSafepointVerifier.
>
> - Added filtered queue and gtest for it.
>
> Passes multiple t1-8 runs.
> Been through some pre-reviewing.
>
> -------------
>
> Commit messages:
> - Rebase version 1.0
>
> Changes: https://git.openjdk.java.net/jdk/pull/151/files
> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=00
> Issue: https://bugs.openjdk.java.net/browse/JDK-8238761
> Stats: 1047 lines in 24 files changed: 693 ins; 150 del; 204 mod
> Patch: https://git.openjdk.java.net/jdk/pull/151.diff
> Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151
>
> PR: https://git.openjdk.java.net/jdk/pull/151

From eosterlund at openjdk.java.net Wed Sep 16 08:26:17 2020 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Wed, 16 Sep 2020 08:26:17 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID:

On Wed, 16 Sep 2020 00:21:02 GMT, Serguei Spitsyn wrote:

>> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision:
>>
>> rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords.
>
> Marked as reviewed by sspitsyn (Reviewer).
I added a release note (https://bugs.openjdk.java.net/browse/JDK-8253225) describing that these roots are now weak, and hence won't be reported. Please have a look at that, to make sure what I am describing makes sense. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From jcm at openjdk.java.net Wed Sep 16 08:56:01 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Wed, 16 Sep 2020 08:56:01 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v2] In-Reply-To: <8uyHRtbTr67w0rqGE-VS-SCGrD0uVnBNV7rURU6WZII=.424ce1f3-98fa-4bbe-b971-9df6fdef239b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> <_2RfxBOE39VhwtDZe2F2qLb52IfF_JiCWwE2cJsEuiM=.01bb1177-808a-45ea-a8bf-3dccfab6ea38@github.com> <8uyHRtbTr67w0rqGE-VS-SCGrD0uVnBNV7rURU6WZII=.424ce1f3-98fa-4bbe-b971-9df6fdef239b@github.com> Message-ID: On Wed, 16 Sep 2020 06:56:31 GMT, David Holmes wrote: >> Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: >> >> removing unused definition load_class_by_index > > src/hotspot/share/runtime/thread.cpp line 2392: > >> 2390: if (check_unsafe_error && >> 2391: condition == _async_unsafe_access_error && !has_pending_exception()) { >> 2392: // May be we are at method entry and requires to save do not unlock flag. > > Suggest: > // We may be at method entry which requires we save the do-not-unlock flag. Thank you @dholmes-ora , Done. ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From jcm at openjdk.java.net Wed Sep 16 09:09:45 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Wed, 16 Sep 2020 09:09:45 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. 
[v2] In-Reply-To: <8uyHRtbTr67w0rqGE-VS-SCGrD0uVnBNV7rURU6WZII=.424ce1f3-98fa-4bbe-b971-9df6fdef239b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> <_2RfxBOE39VhwtDZe2F2qLb52IfF_JiCWwE2cJsEuiM=.01bb1177-808a-45ea-a8bf-3dccfab6ea38@github.com> <8uyHRtbTr67w0rqGE-VS-SCGrD0uVnBNV7rURU6WZII=.424ce1f3-98fa-4bbe-b971-9df6fdef239b@github.com> Message-ID:

On Wed, 16 Sep 2020 06:58:18 GMT, David Holmes wrote:

>> Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision:
>>
>> removing unused definition load_class_by_index
>
> Looks good to me! No further comments.

Could I get a second review?

-------------

PR: https://git.openjdk.java.net/jdk/pull/169

From jcm at openjdk.java.net Wed Sep 16 09:09:43 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Wed, 16 Sep 2020 09:09:43 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v3] In-Reply-To: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID:

> Hi
>
> Moving the review that is based on mercurial repo to github.
> The history of conversation is
> [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html)
> Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451)
>
> @dholmes-ora could you please have a look.
Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: comment modified wrt review feedback ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/169/files - new: https://git.openjdk.java.net/jdk/pull/169/files/1c0786a5..506094bf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/169.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/169/head:pull/169 PR: https://git.openjdk.java.net/jdk/pull/169 From jcm at openjdk.java.net Wed Sep 16 09:36:28 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Wed, 16 Sep 2020 09:36:28 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v4] In-Reply-To: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: > Hi > > Moving the review that is based on mercurial repo to github. > The history of conversation is > [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) > Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) > > @dholmes-ora could you please have a look. Jamsheed Mohammed C M has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision:

comment modified wrt review feedback

-------------

Changes:
- all: https://git.openjdk.java.net/jdk/pull/169/files
- new: https://git.openjdk.java.net/jdk/pull/169/files/506094bf..9777e8c4

Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=03
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=169&range=02-03

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/169.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/169/head:pull/169

PR: https://git.openjdk.java.net/jdk/pull/169

From kim.barrett at oracle.com Wed Sep 16 10:49:51 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Sep 2020 06:49:51 -0400 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: Message-ID: <90E69F56-DAEC-46D2-8829-8342D1A5B864@oracle.com>

> On Sep 14, 2020, at 12:05 PM, Kim Barrett wrote:
> Note that it might be that such a limited warning disable is insufficient in
> the long run. This seems like a problem that might arise in other places as
> we make more use of constexpr. We might instead need to globally disable
> that warning for some limited range of VS versions.

Looks like I was right about that. I've privately received a report of a similar problem arising as a result of yesterday's push of 8238956. (I expect a bug report to be showing up shortly.)

This will be a constant threat when adding uses of constexpr. I think the only plausible solution is to disable that warning for VS versions prior to the one that fixes that bug (mentioned in the referenced bug report). I also think the change for 8253089 should be reverted.

I don't know why I never encountered this in my experiments with C++14 over the past more than a year. We (Oracle) only upgraded to VS2019 relatively recently (a few months). Though I think I wrote the unit test that failed to build (in that private report) fairly recently, probably after we switched over to VS2019.

From shade at openjdk.java.net Wed Sep 16 11:24:03 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 16 Sep 2020 11:24:03 GMT Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: <4UPtVQYiZjQfUiHe4BAClmg_IU5LqF36-ekIBxbUF-Y=.5eba4ba3-ae0e-4e8c-aebd-14f6bfc714a7@github.com> References: <4UPtVQYiZjQfUiHe4BAClmg_IU5LqF36-ekIBxbUF-Y=.5eba4ba3-ae0e-4e8c-aebd-14f6bfc714a7@github.com> Message-ID:

On Mon, 14 Sep 2020 22:25:06 GMT, Ioi Lam wrote:

>> It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at
>> fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for
>> the hash code in `JVMFlagLookup::_hashes`.
>
> Marked as reviewed by iklam (Reviewer).

I don't mind reverting this change, but it would break MSVC 2017 again, until we push the pragmas. Unfortunately, I lost the capability to build with MSVC 2017, so maybe we should ask SAP folks (who IIRC still have their pipelines with MSVC 2017) to verify this still works.

-------------

PR: https://git.openjdk.java.net/jdk/pull/150

From zgu at openjdk.java.net Wed Sep 16 11:28:24 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 16 Sep 2020 11:28:24 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID:

On Wed, 16 Sep 2020 02:49:45 GMT, David Holmes wrote:

>> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor
>> before thread exits. For NonJavaThread, e.g.
ConcurrentGCThread, thread may exit while its "Thread" object continues
>> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
>> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
>> post_run() method.
>
> src/hotspot/share/runtime/thread.hpp line 762:
>
>> 760: public:
>> 761: // Stack overflow support
>> 762: address stack_base() const { return _stack_base; }
>
> Why did you remove the assertion? We want the assertion in general to ensure there are no improper uses of stack_base().

We now reset NonJavaThread's stack base and size, so _stack_base == NULL is possible, e.g. when generating an hs_err report, as we should now see an empty stack for the "G1 Main Marker" thread in the original bug report.

You do have a valid point about the assertion. I think the following assertion is more precise:

assert(_stack_base != NULL || Thread::current() != NULL, "Sanity check");

What do you think?

Thanks

-------------

PR: https://git.openjdk.java.net/jdk/pull/185

From kim.barrett at oracle.com Wed Sep 16 11:42:01 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Sep 2020 07:42:01 -0400 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: References: <4UPtVQYiZjQfUiHe4BAClmg_IU5LqF36-ekIBxbUF-Y=.5eba4ba3-ae0e-4e8c-aebd-14f6bfc714a7@github.com> Message-ID: <1C87A706-CE71-474D-AB7C-F33BFA7C3C26@oracle.com>

> On Sep 16, 2020, at 7:24 AM, Aleksey Shipilev wrote:
>
> On Mon, 14 Sep 2020 22:25:06 GMT, Ioi Lam wrote:
>
>>> It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at
>>> fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for
>>> the hash code in `JVMFlagLookup::_hashes`.
>>
>> Marked as reviewed by iklam (Reviewer).
>
> I don't mind reverting this change, but it would break MSVC 2017 again, until we push the pragmas. Unfortunately, I
> lost the capability to build with MSVC 2017, so maybe we should ask SAP folks (who IIRC still have their pipelines with
> MSVC 2017) to verify this still works.

I'm guessing this was intended to be in response to my followup comment.

I agree that we can't revert that until we have something in place to deal with the VS2017 problem. It was the SAP folks who reported the new failure to me, so I'm assuming they will be looking at whatever is done for that problem.

>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/150

From shade at redhat.com Wed Sep 16 11:56:02 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 16 Sep 2020 13:56:02 +0200 Subject: RFR: 8253089: Windows (MSVC 2017) build fails after JDK-8243208 In-Reply-To: <1C87A706-CE71-474D-AB7C-F33BFA7C3C26@oracle.com> References: <4UPtVQYiZjQfUiHe4BAClmg_IU5LqF36-ekIBxbUF-Y=.5eba4ba3-ae0e-4e8c-aebd-14f6bfc714a7@github.com> <1C87A706-CE71-474D-AB7C-F33BFA7C3C26@oracle.com> Message-ID: <83fbe0a3-b443-fba5-189a-08bf5414ac90@redhat.com>

On 9/16/20 1:42 PM, Kim Barrett wrote:
>> On Sep 16, 2020, at 7:24 AM, Aleksey Shipilev wrote:
>>
>> On Mon, 14 Sep 2020 22:25:06 GMT, Ioi Lam wrote:
>>
>>>> It seems that MSVC 2017 is getting confused about the differences in `unsigned int` and `u2`. After a few attempts at
>>>> fixing this, I think we need to use `u2` consistently for hash code computations. `u2` is the final storage type for
>>>> the hash code in `JVMFlagLookup::_hashes`.
>>>
>>> Marked as reviewed by iklam (Reviewer).
>>
>> I don't mind reverting this change, but it would break MSVC 2017 again, until we push the pragmas. Unfortunately, I
>> lost the capability to build with MSVC 2017, so maybe we should ask SAP folks (who IIRC still have their pipelines with
>> MSVC 2017) to verify this still works.
>
> I'm guessing this was intended to be in response to my followup comment.

Right. ml-bot confusion here.

> I agree that we can't revert that until we have something in place to deal with the VS2017 problem. It was the SAP
> folks who reported the new failure to me, so I'm assuming they will be looking at whatever is done for that problem.

I submitted this one yesterday, btw:
https://bugs.openjdk.java.net/browse/JDK-8253154

--
Thanks,
-Aleksey

From zgu at openjdk.java.net Wed Sep 16 15:13:08 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 16 Sep 2020 15:13:08 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v2] In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID:

> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor
> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues
> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
> post_run() method.
Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Restore assertion ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/185/files - new: https://git.openjdk.java.net/jdk/pull/185/files/f4fbe38b..467db25a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=00-01 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/185.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185 PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Wed Sep 16 15:19:08 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 16 Sep 2020 15:19:08 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues > alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > post_run() method. 
Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Fix trailing whitespace ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/185/files - new: https://git.openjdk.java.net/jdk/pull/185/files/467db25a..a7113def Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/185.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185 PR: https://git.openjdk.java.net/jdk/pull/185 From dcubed at openjdk.java.net Wed Sep 16 15:42:53 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 15:42:53 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v4] In-Reply-To: References: Message-ID: On Tue, 15 Sep 2020 10:17:45 GMT, Kim Barrett wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> kimbarrett CR - made minor changes to address Kim's code review. > > src/hotspot/share/runtime/objectMonitor.cpp line 244: > >> 242: } >> 243: >> 244: #ifdef ASSERT > > There would be less `#ifdef ASSERT` clutter if just the body of check_object_context were conditionalized. Then the > calls wouldn't need to be. Your call... I'll make that change. 
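For readers following along, Kim's suggestion about conditionalizing only the body of the check can be sketched as below. This is an illustrative sketch under stated assumptions, not the code from the pull request: DemoMonitor, g_checks_run, and the locally defined ASSERT macro are hypothetical names introduced so the example is self-contained (HotSpot debug builds get ASSERT from the build system). It shows that when only the function body is conditional, call sites can invoke the check unconditionally.

```cpp
#include <cassert>

// Assumption: defined locally so this sketch exercises the checking path.
// In a real debug build the macro would come from the build system.
#define ASSERT 1

// Hypothetical counter, present only so the effect of the check is observable.
static int g_checks_run = 0;

// Hypothetical stand-in for a monitor class that refers back to an object.
class DemoMonitor {
 public:
  explicit DemoMonitor(const void* object) : _object(object) {}

  const void* object() const {
    check_object_context();  // unconditional call: no #ifdef at the call site
    return _object;
  }

 private:
  void check_object_context() const;
  const void* _object;
};

// Only the body is conditional. Without ASSERT this is an empty function,
// which an optimizing compiler can typically inline away.
void DemoMonitor::check_object_context() const {
#ifdef ASSERT
  ++g_checks_run;
  assert(_object != nullptr);
#endif
}
```

The trade-off is the one Kim describes as "your call": the declaration and an empty definition remain in product builds, in exchange for call sites that read the same in every configuration.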
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Wed Sep 16 15:45:32 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 15:45:32 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> Message-ID: On Wed, 16 Sep 2020 00:20:16 GMT, Serguei Spitsyn wrote: >> I've taken a first pass at creating a CSR: >> JDK-8253121 migrate ObjectMonitor::_object to OopStorage >> https://bugs.openjdk.java.net/browse/JDK-8253121 > > Just a minor comment. > The line 1754 can be deleted as the JVMTI_HEAP_REFERENCE_MONITOR reference type will be never encountered: > > 1750 static jvmtiHeapRootKind toJvmtiHeapRootKind(jvmtiHeapReferenceKind kind) { > 1751 switch (kind) { > 1752 case JVMTI_HEAP_REFERENCE_JNI_GLOBAL: return JVMTI_HEAP_ROOT_JNI_GLOBAL; > 1753 case JVMTI_HEAP_REFERENCE_SYSTEM_CLASS: return JVMTI_HEAP_ROOT_SYSTEM_CLASS; > 1754 case JVMTI_HEAP_REFERENCE_MONITOR: return JVMTI_HEAP_ROOT_MONITOR; > 1755 case JVMTI_HEAP_REFERENCE_STACK_LOCAL: return JVMTI_HEAP_ROOT_STACK_LOCAL; > 1756 case JVMTI_HEAP_REFERENCE_JNI_LOCAL: return JVMTI_HEAP_ROOT_JNI_LOCAL; > 1757 case JVMTI_HEAP_REFERENCE_THREAD: return JVMTI_HEAP_ROOT_THREAD; > 1758 case JVMTI_HEAP_REFERENCE_OTHER: return JVMTI_HEAP_ROOT_OTHER; > 1759 default: ShouldNotReachHere(); return JVMTI_HEAP_ROOT_OTHER; > 1760 } > 1761 } > > Other than that the update in this file looks okay to me. I cleaned that up. 
The only references to JVMTI_HEAP_REFERENCE_MONITOR and JVMTI_HEAP_ROOT_MONITOR that remain are in the spec: $ egrep -r 'JVMTI_HEAP_REFERENCE_MONITOR|JVMTI_HEAP_ROOT_MONITOR' src/hotspot src/hotspot/share/prims/jvmti.xml: src/hotspot/share/prims/jvmti.xml: src/hotspot/share/prims/jvmti.xml: JVMTI_HEAP_ROOT_MONITOR, ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Wed Sep 16 15:56:40 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 15:56:40 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: <7dfBMb2-EUEqKgml97ffFb50rxEO_djF85-X8AKLfUg=.9deac832-6d24-4277-8651-b9bfa7d5a397@github.com> <1rj4zO-L65NEG1ZyUdi3YyJR3A6AOTeb5cBsmVOiJ4E=.40e51e90-4008-46ea-a451-526144312035@github.com> Message-ID: On Wed, 16 Sep 2020 00:07:08 GMT, Serguei Spitsyn wrote: >> I've taken a first pass at creating a CSR: >> JDK-8253121 migrate ObjectMonitor::_object to OopStorage >> https://bugs.openjdk.java.net/browse/JDK-8253121 > > I've looked at the CSR and added myself as a reviewer. > We had a slack chat with Dan, and I agree that with a weak handle it would be racy/unsafe for > JVMTI_HEAP_REFERENCE_MONITOR calls back to be called. The ObjectMonitors do not pin objects anymore (there are no > strong refs from them), so it has to be okay to skip reporting the JVMTI_HEAP_REFERENCE_MONITOR and and > JVMTI_HEAP_ROOT_MONITOR (old Heap Walking API) reference types. The JVMTI does not need an update as other VM > implementations can still report these reference types. Alan added a comment to the CSR saying that memory profiling > tools that use the JVMTI functions (FollowReferences with jvmtiHeapReferenceCallback or IterateOverReachableObjects > with jvmtiHeapRootCallback) to iterate over the heap should not have a compatibility impact as these reference types > are just informational but adding a release note can be useful. Slight clarification. 
Serguei and I were discussing whether we could continue to make JVMTI_HEAP_REFERENCE_MONITOR callbacks or emit HPROF_GC_ROOT_MONITOR_USED entries in heap dump output as a way to ease the transition phase of getting used to these things going away. My answer was that we could do that but it would be racy and unsafe due to the ObjectMonitor's object being GC'ed. It's also incorrect to make JVMTI_HEAP_REFERENCE_MONITOR callbacks or emit HPROF_GC_ROOT_MONITOR_USED entries in a heap dump once the ObjectMonitor's object ref becomes a weak handle. That weak handle no longer prevents the associated object from being GC'ed so it is no longer a strong root. See Erik's comment above: https://github.com/openjdk/jdk/pull/135#discussion_r487291746 ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From gziemski at openjdk.java.net Wed Sep 16 15:59:39 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 16 Sep 2020 15:59:39 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms Message-ID: Hi all, Please review this change that refactors common POSIX code into a separate file. Currently there appears to be quite a bit of duplicated code among POSIX platforms, which makes it difficult to apply a single fix to the signal code. With this fix, we will only need to touch a single file for common POSIX code fixes from now on.
----------------------------------------------------------------------------
The APIs which moved from os/bsd/os_bsd.cpp to os/posix/PosixSignals.cpp:

////////////////////////////////////////////////////////////////////////////////
// signal support
void os::Bsd::signal_sets_init()
sigset_t* os::Bsd::unblocked_signals()
sigset_t* os::Bsd::vm_signals()
void os::Bsd::hotspot_sigmask(Thread* thread)

////////////////////////////////////////////////////////////////////////////////
// sun.misc.Signal support
static void UserHandler(int sig, void *siginfo, void *context)
void* os::user_handler()
void* os::signal(int signal_number, void* handler)
void os::signal_raise(int signal_number)
int os::sigexitnum_pd()
static void jdk_misc_signal_init()
void os::signal_notify(int sig)
static int check_pending_signals()
int os::signal_wait()

////////////////////////////////////////////////////////////////////////////////
// suspend/resume support
static void resume_clear_context(OSThread *osthread)
static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context)
static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context)
static int SR_initialize()
static int sr_notify(OSThread* osthread)
static bool do_suspend(OSThread* osthread)
static void do_resume(OSThread* osthread)

///////////////////////////////////////////////////////////////////////////////////
// signal handling (except suspend/resume)
static void signalHandler(int sig, siginfo_t* info, void* uc)
struct sigaction* os::Bsd::get_chained_signal_action(int sig)
static bool call_chained_handler(struct sigaction *actp, int sig, siginfo_t *siginfo, void *context)
bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context)
int os::Bsd::get_our_sigflags(int sig)
void os::Bsd::set_our_sigflags(int sig, int flags)
void os::Bsd::set_signal_handler(int sig, bool set_installed)
void os::Bsd::install_signal_handlers()
static const char* get_signal_handler_name(address handler, char* buf, int buflen)
static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen)
void os::run_periodic_checks()
void os::Bsd::check_signal_handler(int sig)

-----------------------------------------------------------------------------
The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp:

const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen)
int os::Posix::get_signal_number(const char* signal_name)
int os::get_signal_number(const char* signal_name)
bool os::Posix::is_valid_signal(int sig)
bool os::Posix::is_sig_ignored(int sig)
const char* os::exception_name(int sig, char* buf, size_t size)
const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size)
void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set)
const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size)
void os::Posix::print_sa_flags(outputStream* st, int flags)
static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out)
void os::print_siginfo(outputStream* os, const void* si0)
bool os::signal_thread(Thread* thread, int sig, const char* reason)
int os::Posix::unblock_thread_signal_mask(const sigset_t *set)
address os::Posix::ucontext_get_pc(const ucontext_t* ctx)
void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc)
struct sigaction* os::Posix::get_preinstalled_handler(int sig)
void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct)

--------------------------------------------------------
--------------------------------------------------------
DETAILS:

--------------------------------------------------------
Public APIs which are now internal static PosixSignals::

sigset_t* os::Bsd::vm_signals()
struct sigaction* os::Bsd::get_chained_signal_action(int sig)
int os::Bsd::get_our_sigflags(int sig)
void os::Bsd::set_our_sigflags(int sig, int flags)
void os::Bsd::set_signal_handler(int sig, bool set_installed)
void os::Bsd::check_signal_handler(int sig)
const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen)
bool os::Posix::is_valid_signal(int sig)
const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size)
void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set)
const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size)
void os::Posix::print_sa_flags(outputStream* st, int flags)
static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out)
void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct)

------------------------------------------------
Public APIs which moved to public PosixSignals::

void os::Bsd::signal_sets_init()
void os::Bsd::hotspot_sigmask(Thread* thread)
bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context)
void os::Bsd::install_signal_handlers()
bool os::Posix::is_sig_ignored(int sig)
int os::Posix::unblock_thread_signal_mask(const sigset_t *set)
address os::Posix::ucontext_get_pc(const ucontext_t* ctx)
void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc)

----------------------------------------------------
Internal APIs which are now public in PosixSignals::

static void jdk_misc_signal_init()
static int SR_initialize()
static bool do_suspend(OSThread* osthread)
static void do_resume(OSThread* osthread)
static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen)

--------------------------
New APIs in PosixSignals::

static bool are_signal_handlers_installed();

-------------

Commit messages:
 - removed white spaces
 - Refactored common POSIX signal code into separate file

Changes: https://git.openjdk.java.net/jdk/pull/157/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8252324
  Stats: 5247 lines in 21 files changed: 1740 ins; 3400 del; 107 mod
  Patch:
https://git.openjdk.java.net/jdk/pull/157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157 PR: https://git.openjdk.java.net/jdk/pull/157 From dcubed at openjdk.java.net Wed Sep 16 16:02:16 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 16:02:16 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 08:23:03 GMT, Erik ?sterlund wrote: >> Marked as reviewed by sspitsyn (Reviewer). > > I added a release note (https://bugs.openjdk.java.net/browse/JDK-8253225) describing that these roots are now weak, and > hence won't be reported. Please have a look at that, to make sure what I am describing makes sense. @kimbarrett - Thanks for the cleanup suggestion. @sspitsyn - Thanks for the review. I've gone ahead and "resolved" the comments that were addressed by previous commits. I've made changes for both Kim and Serguei's reviews. Building those bits now. ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Wed Sep 16 16:13:23 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 16:13:23 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v5] In-Reply-To: References: Message-ID: On Tue, 15 Sep 2020 10:17:59 GMT, Kim Barrett wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> rkennke, coleenp, fisk CR - delete random assert() that knows too much about markWords. > > Marked as reviewed by kbarrett (Reviewer). @kimbarrett and @sspitsyn - Your most recent comments should be resolved via https://github.com/openjdk/jdk/pull/135/commits/215084ac7ef481713560d14498ce420a40ca813a. 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From dcubed at openjdk.java.net Wed Sep 16 16:13:23 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 16:13:23 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v6] In-Reply-To: References: Message-ID: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: Minor changes/deletions to address kimbarrett and sspitsyn CR comments. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/135/files - new: https://git.openjdk.java.net/jdk/pull/135/files/eeb9d761..215084ac Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=135&range=04-05 Stats: 9 lines in 2 files changed: 2 ins; 7 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/135/head:pull/135 PR: https://git.openjdk.java.net/jdk/pull/135 From sspitsyn at openjdk.java.net Wed Sep 16 16:47:59 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 16 Sep 2020 16:47:59 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v6] In-Reply-To: References: Message-ID: <4iJsXcOdlCYUMtI9bwYTqG8YEgyV1CHa1VC0ZWkcQ9M=.470d0741-5aba-433b-8917-90a8cc019e0d@github.com> On Wed, 16 Sep 2020 16:13:23 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Minor changes/deletions to address kimbarrett and sspitsyn CR comments. Marked as reviewed by sspitsyn (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From eosterlund at openjdk.java.net Wed Sep 16 17:02:44 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 16 Sep 2020 17:02:44 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v6] In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 16:13:23 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Minor changes/deletions to address kimbarrett and sspitsyn CR comments. Marked as reviewed by eosterlund (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From hseigel at openjdk.java.net Wed Sep 16 18:25:47 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 16 Sep 2020 18:25:47 GMT Subject: RFR: 8253125 vmTestbase/nsk/stress/stack/stack017.java timed out Message-ID: Please review this small change that reduces the amount of thread stack space used by the test in order to reduce the time required to execute it. Before and after timings showed that the amount of time needed to execute the test dropped significantly with this change. The changes to this test are similar to the changes done to test stack018.java, and others. 
The modified test was run on Linux X64, MacOS, and Windows. ------------- Commit messages: - 8253125 vmTestbase/nsk/stress/stack/stack017.java timed out Changes: https://git.openjdk.java.net/jdk/pull/209/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=209&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253125 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/209.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/209/head:pull/209 PR: https://git.openjdk.java.net/jdk/pull/209 From dcubed at openjdk.java.net Wed Sep 16 18:47:55 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 18:47:55 GMT Subject: RFR: 8253125 vmTestbase/nsk/stress/stack/stack017.java timed out In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 18:19:24 GMT, Harold Seigel wrote: > Please review this small change that reduces the amount of thread stack space used by the test in order to reduce the > time required to execute it. Before and after timings showed that the amount of time needed to execute the test > dropped significantly with this change. The changes to this test are similar to the changes done to test > stack018.java, and others. > The modified test was run on Linux X64, MacOS, and Windows. Marked as reviewed by dcubed (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/209 From dcubed at openjdk.java.net Wed Sep 16 18:47:55 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 16 Sep 2020 18:47:55 GMT Subject: RFR: 8253125 vmTestbase/nsk/stress/stack/stack017.java timed out In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 18:44:48 GMT, Daniel D. Daugherty wrote: >> Please review this small change that reduces the amount of thread stack space used by the test in order to reduce the >> time required to execute it. 
Before and after timings showed that the amount of time needed to execute the test >> dropped significantly with this change. The changes to this test are similar to the changes done to test >> stack018.java, and others. >> The modified test was run on Linux X64, MacOS, and Windows. > > Marked as reviewed by dcubed (Reviewer). Looks good and looks trivial. ------------- PR: https://git.openjdk.java.net/jdk/pull/209 From hseigel at openjdk.java.net Wed Sep 16 19:00:40 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 16 Sep 2020 19:00:40 GMT Subject: Integrated: 8253125 vmTestbase/nsk/stress/stack/stack017.java timed out In-Reply-To: References: Message-ID: <1ML6hxIZUk_szDH2Rjh0pihFHUm0BBDGigP6Rov-fqg=.426e25dd-30cb-4f0f-a1f7-b227b9aeb33f@github.com> On Wed, 16 Sep 2020 18:19:24 GMT, Harold Seigel wrote: > Please review this small change that reduces the amount of thread stack space used by the test in order to reduce the > time required to execute it. Before and after timings showed that the amount of time needed to execute the test > dropped significantly with this change. The changes to this test are similar to the changes done to test > stack018.java, and others. > The modified test was run on Linux X64, MacOS, and Windows. This pull request has now been integrated. Changeset: ce93cbce Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/ce93cbce Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8253125: vmTestbase/nsk/stress/stack/stack017.java timed out Reviewed-by: dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/209 From iklam at openjdk.java.net Wed Sep 16 20:53:34 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Sep 2020 20:53:34 GMT Subject: RFR: 8253261: Disable CDS full module graph until JDK-8253081 is fixed Message-ID: Please review this trivial patch. 
[JDK-8244778](https://bugs.openjdk.java.net/browse/JDK-8244778) (Archive full module graph in CDS) is causing failures in GC ([JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081)) This patch disables the *use* of the CDS module graph at run time, which seems to avoid the GC problems. The module graph is still being archived into CDS, but that seems to be harmless. ------------- Commit messages: - 8253261: Disable CDS full module graph until JDK-8253081 is fixed Changes: https://git.openjdk.java.net/jdk/pull/213/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=213&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253261 Stats: 3 lines in 2 files changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/213.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/213/head:pull/213 PR: https://git.openjdk.java.net/jdk/pull/213 From richard.reingruber at sap.com Wed Sep 16 21:05:32 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 16 Sep 2020 21:05:32 +0000 Subject: Question on JavaThread::is_ext_suspend_completed() Message-ID: Hi, I've got a question on JavaThread::is_ext_suspend_completed(): Current thread C loads the thread state of target thread T [1]. Assume it observes _thread_in_native and also that T has a walkable stack. In this case is_ext_suspend_completed() returns true [2]. If called by JavaThread::java_suspend(), then this method will also return. I don't see the synchronization that shields against C seeing stale values of T's thread state and frame anchor. To me it looks as if T could be executing java bytecodes while C observes a stale state making the wrong conclusion that T is effectively suspended. What am I missing? I'd think that a sleep just before returning true could trigger the issue, can't it? [4] Thanks, Richard. 
[1] Loading thread state https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L671 [2] JavaThread::is_ext_suspend_completed() Returns true if save_state == _thread_in_native && frame_anchor()->walkable() https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L686 [3] JavaThread::java_suspend() returns if JavaThread::is_ext_suspend_completed() returns true. https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L2512 [4] Can a sleep before returning (see link below) trigger the issue? https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L691 From richard.reingruber at sap.com Wed Sep 16 22:00:21 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 16 Sep 2020 22:00:21 +0000 Subject: Question on JavaThread::is_ext_suspend_completed() In-Reply-To: References: Message-ID: I think I can answer this myself now :) C sets _external_suspend in T's _suspend_flags, then it reads T's thread state (with a store-load barrier between write and read). If C sees _thread_in_native then this read happened before the thread state change. This means T will see _external_suspend when checking _suspend_flags after changing the thread state to _thread_in_native_trans (again with store-load barrier). So if C sees _thread_in_native it can be sure that T is effectively suspended. Thanks, Richard. -----Original Message----- From: Reingruber, Richard Sent: Wednesday, 16 September 2020 23:06 To: Hotspot dev runtime Subject: Question on JavaThread::is_ext_suspend_completed() Hi, I've got a question on JavaThread::is_ext_suspend_completed(): Current thread C loads the thread state of target thread T [1]. Assume it observes _thread_in_native and also that T has a walkable stack. In this case is_ext_suspend_completed() returns true [2].
If called by JavaThread::java_suspend(), then this method will also return. I don't see the synchronization that shields against C seeing stale values of T's thread state and frame anchor. To me it looks as if T could be executing java bytecodes while C observes a stale state making the wrong conclusion that T is effectively suspended. What am I missing? I'd think that a sleep just before returning true could trigger the issue, can't it? [4] Thanks, Richard. [1] Loading thread state https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L671 [2] JavaThread::is_ext_suspend_completed() Returns true if save_state == _thread_in_native && frame_anchor()->walkable() https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L686 [3] JavaThread::java_suspend() returns if JavaThread::is_ext_suspend_completed() returns true. https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L2512 [4] Can a sleep before returning (see link below) trigger the issue? https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L691 From ccheung at openjdk.java.net Wed Sep 16 22:28:20 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 16 Sep 2020 22:28:20 GMT Subject: RFR: 8253261: Disable CDS full module graph until JDK-8253081 is fixed In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 20:38:20 GMT, Ioi Lam wrote: > Please review this trivial patch. > > [JDK-8244778](https://bugs.openjdk.java.net/browse/JDK-8244778) (Archive full module graph in CDS) is causing failures > in GC ([JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081)) > This patch disables the *use* of the CDS module graph at run time, which seems to avoid the GC problems. The module > graph is still being archived into CDS, but that seems to be harmless. Looks good. 
------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/213 From iklam at openjdk.java.net Wed Sep 16 22:47:10 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Sep 2020 22:47:10 GMT Subject: Integrated: 8253261: Disable CDS full module graph until JDK-8253081 is fixed In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 20:38:20 GMT, Ioi Lam wrote: > Please review this trivial patch. > > [JDK-8244778](https://bugs.openjdk.java.net/browse/JDK-8244778) (Archive full module graph in CDS) is causing failures > in GC ([JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081)) > This patch disables the *use* of the CDS module graph at run time, which seems to avoid the GC problems. The module > graph is still being archived into CDS, but that seems to be harmless. This pull request has now been integrated. Changeset: 9a7dcdcd Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/9a7dcdcd Stats: 3 lines in 2 files changed: 0 ins; 3 del; 0 mod 8253261: Disable CDS full module graph until JDK-8253081 is fixed Reviewed-by: ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/213 From iklam at openjdk.java.net Wed Sep 16 23:20:05 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Sep 2020 23:20:05 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> On Tue, 15 Sep 2020 20:15:52 GMT, Zhengyu Gu wrote: >> Thanks for the fix. I've done some testing on the patch. It passed tier1 and also passed running the >> appcds/dynamicArchive/methodHandles/MethodHandlesAsCollectorTest.java test 30 times on Windows and linux. One question >> : why the JavaThread::post_run doesn't need to set the following? 
>> set_stack_base(NULL);
>> set_stack_size(0);
>
>> Thanks for the fix. I've done some testing on the patch. It passed tier1 and also passed running the
>> appcds/dynamicArchive/methodHandles/MethodHandlesAsCollectorTest.java test 30 times on Windows and linux. One question
>> : why the JavaThread::post_run doesn't need to set the following? set_stack_base(NULL);
>> set_stack_size(0);
>
> Thanks for reviewing, Calvin.
>
> The last statement of JavaThread::post_run() deletes the 'thread' object, so there is no point in resetting its state here.
> While a NonJavaThread object lives past the lifespan of the actual thread, its state does matter, I believe.

This code in [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) still looks buggy to me:

    if (reserved_rgn->same_region(base_addr, size)) {
      reserved_rgn->set_call_stack(stack);
      reserved_rgn->set_flag(flag);
      return true;
    } else {
      assert(reserved_rgn->overlap_region(base_addr, size), "Must be");

      // Overlapped reservation.
      // It can happen when the regions are thread stacks, as JNI
      // thread does not detach from VM before exits, and leads to
      // leak JavaThread object
      if (reserved_rgn->flag() == mtThreadStack) {
        guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached");
        // Overwrite with new region

Why is the "Overlapped reservation" fix not done when the two regions are exactly the same?

If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the exact size, I think we will get the same assert as in this bug report.
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Thu Sep 17 00:07:50 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 00:07:50 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On Wed, 16 Sep 2020 23:17:39 GMT, Ioi Lam wrote: > This code in > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > still looks buggy to me: ``` > if (reserved_rgn->same_region(base_addr, size)) { > reserved_rgn->set_call_stack(stack); > reserved_rgn->set_flag(flag); > return true; > } else { > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > // Overlapped reservation. > // It can happen when the regions are thread stacks, as JNI > // thread does not detach from VM before exits, and leads to > // leak JavaThread object > if (reserved_rgn->flag() == mtThreadStack) { > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > // Overwrite with new region > ``` > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> pd_reserve() -> os::reserve() > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the > exact size, I think we will get the same assert as in this bug report. Right. CheckJNICalls should catch that. 
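
For illustration only, the two checks discussed in this exchange can be modelled in a few lines of C++. This is a minimal sketch and not the HotSpot code: `Region`, `Kind`, and `classify` are hypothetical names, and the interval arithmetic is an assumption about what "same" and "overlapping" mean for two address ranges. Under that assumption an identical range also counts as overlapping, so the exact-match test has to run first if the two cases are to be told apart.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a tracked reservation; not the HotSpot class.
struct Region {
  std::uintptr_t base;
  std::size_t size;

  std::uintptr_t end() const { return base + size; }

  // True only when the new range is exactly this range.
  bool same_region(std::uintptr_t addr, std::size_t sz) const {
    return base == addr && size == sz;
  }

  // True when [addr, addr + sz) intersects [base, end()).
  // Assumption: an identical range is also an overlapping range.
  bool overlap_region(std::uintptr_t addr, std::size_t sz) const {
    return size > 0 && sz > 0 && addr < end() && base < addr + sz;
  }
};

enum class Kind { Same, PartialOverlap, Disjoint };

// Classify a new reservation against an existing one. The exact-match
// test runs first because the overlap test is also true for identical ranges.
Kind classify(const Region& r, std::uintptr_t addr, std::size_t sz) {
  if (r.same_region(addr, sz)) {
    return Kind::Same;
  }
  if (r.overlap_region(addr, sz)) {
    return Kind::PartialOverlap;
  }
  return Kind::Disjoint;
}
```

In this model, a stale thread-stack record that is later re-reserved with exactly the same base and size is classified as `Kind::Same`, which is the case the question above is about; any special handling for leftover thread stacks would have to be reachable from that branch as well.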
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From iklam at openjdk.java.net Thu Sep 17 00:29:57 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 00:29:57 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On Thu, 17 Sep 2020 00:05:30 GMT, Zhengyu Gu wrote: > > This code in > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > > still looks buggy to me: ``` > > if (reserved_rgn->same_region(base_addr, size)) { > > reserved_rgn->set_call_stack(stack); > > reserved_rgn->set_flag(flag); > > return true; > > } else { > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > > > // Overlapped reservation. > > // It can happen when the regions are thread stacks, as JNI > > // thread does not detach from VM before exits, and leads to > > // leak JavaThread object > > if (reserved_rgn->flag() == mtThreadStack) { > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > > // Overwrite with new region > > ``` > > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> > pd_reserve() -> os::reserve() > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the > > exact size, I think we will get the same assert as in this bug report. > > Right, there is possibility. We can put the same guarantee there. 
I think the order of the checks should be changed: if (reserved_rgn->overlap_region(base_addr, size)) { check for overlapped reservation } if (reserved_rgn->same_region(base_addr, size)) { check for recursive reservation In fact, if the code had been written this way, this bug would not have happened. ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From dholmes at openjdk.java.net Thu Sep 17 00:40:40 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 17 Sep 2020 00:40:40 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v6] In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 16:13:23 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Minor changes/deletions to address kimbarrett and sspitsyn CR comments. Marked as reviewed by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/135 From zgu at openjdk.java.net Thu Sep 17 00:51:19 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 00:51:19 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On Thu, 17 Sep 2020 00:27:29 GMT, Ioi Lam wrote: > > > This code in > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > > > still looks buggy to me: ``` > > > if (reserved_rgn->same_region(base_addr, size)) { > > > reserved_rgn->set_call_stack(stack); > > > reserved_rgn->set_flag(flag); > > > return true; > > > } else { > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > > > > > // Overlapped reservation. > > > // It can happen when the regions are thread stacks, as JNI > > > // thread does not detach from VM before exits, and leads to > > > // leak JavaThread object > > > if (reserved_rgn->flag() == mtThreadStack) { > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > > > // Overwrite with new region > > > ``` > > > > > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? > > > > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> > > pd_reserve() -> os::reserve() > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the > > > exact size, I think we will get the same assert as in this bug report. > > > > > > Right, there is possibility. We can put the same guarantee there. 
> > I think the order of the checks should be changed: > > ``` > if (reserved_rgn->overlap_region(base_addr, size)) { > check for overlapped reservation > } > if (reserved_rgn->same_region(base_addr, size)) { > check for recursive reservation > ``` > > In fact, if the code had been written this way, this bug would not have happened. Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From david.holmes at oracle.com Thu Sep 17 00:59:30 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Sep 2020 10:59:30 +1000 Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: <261d8a81-8961-a546-9d03-8f07327e6665@oracle.com> Hi Zhengyu, On 16/09/2020 9:28 pm, Zhengyu Gu wrote: > On Wed, 16 Sep 2020 02:49:45 GMT, David Holmes wrote: > >>> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >>> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >>> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >>> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >>> post_run() method. >> >> src/hotspot/share/runtime/thread.hpp line 762: >> >>> 760: public: >>> 761: // Stack overflow support >>> 762: address stack_base() const { return _stack_base; } >> >> Why did you remove the assertion? We want the assertion in general to ensure there are no improper uses of stack_base(). > > We now reset NonJavaThread's stack base and size, so _stack_base == NULL is possible, e.g. 
when generating hs_err > report, as we should now see empty stack for "G1 Main Marker" thread in original bug report. Resetting the stack base and size seems pointless given we follow that immediately with Thread::clear_thread_current(); And once there is no thread-current for this thread its base and size can never be queried, so I don't think this change, or the change to the assertion in stack_base() is needed. > You do have a valid point for the assertion. I think following assertion is more precise: > > assert(_stack_base != NULL || Thread::current() != NULL, "Sanity check"); > > What you think? That assertion doesn't make sense to me as the stack_base() being queried doesn't have to belong to the current thread. Thanks, David ----- > Thanks > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/185 > From zgu at openjdk.java.net Thu Sep 17 01:18:22 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 01:18:22 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On Thu, 17 Sep 2020 00:46:33 GMT, Zhengyu Gu wrote: >>> > This code in >>> > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) >>> > still looks buggy to me: ``` >>> > if (reserved_rgn->same_region(base_addr, size)) { >>> > reserved_rgn->set_call_stack(stack); >>> > reserved_rgn->set_flag(flag); >>> > return true; >>> > } else { >>> > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >>> > >>> > // Overlapped reservation. 
>>> > // It can happen when the regions are thread stacks, as JNI >>> > // thread does not detach from VM before exits, and leads to >>> > // leak JavaThread object >>> > if (reserved_rgn->flag() == mtThreadStack) { >>> > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >>> > // Overwrite with new region >>> > ``` >>> > >>> > >>> > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? >>> >>> The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> >>> pd_reserve() -> os::reserve() >>> > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the >>> > exact size, I think we will get the same assert as in this bug report. >>> >>> Right, there is possibility. We can put the same guarantee there. >> >> I think the order of the checks should be changed: >> >> if (reserved_rgn->overlap_region(base_addr, size)) { >> check for overlapped reservation >> } >> if (reserved_rgn->same_region(base_addr, size)) { >> check for recursive reservation >> In fact, if the code had been written this way, this bug would not have happened. > >> > > This code in >> > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) >> > > still looks buggy to me: ``` >> > > if (reserved_rgn->same_region(base_addr, size)) { >> > > reserved_rgn->set_call_stack(stack); >> > > reserved_rgn->set_flag(flag); >> > > return true; >> > > } else { >> > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >> > > >> > > // Overlapped reservation. 
>> > > // It can happen when the regions are thread stacks, as JNI >> > > // thread does not detach from VM before exits, and leads to >> > > // leak JavaThread object >> > > if (reserved_rgn->flag() == mtThreadStack) { >> > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >> > > // Overwrite with new region >> > > ``` >> > > >> > > >> > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? >> > >> > >> > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> >> > pd_reserve() -> os::reserve() >> > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the >> > > exact size, I think we will get the same assert as in this bug report. >> > >> > >> > Right, there is possibility. We can put the same guarantee there. >> >> I think the order of the checks should be changed: >> >> ``` >> if (reserved_rgn->overlap_region(base_addr, size)) { >> check for overlapped reservation >> } >> if (reserved_rgn->same_region(base_addr, size)) { >> check for recursive reservation >> ``` >> >> In fact, if the code had been written this way, this bug would not have happened. > > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > Hi Zhengyu, > > On 16/09/2020 9:28 pm, Zhengyu Gu wrote: > > > On Wed, 16 Sep 2020 02:49:45 GMT, David Holmes wrote: > > > > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > > > > before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues > > > > alive, therefore, its thread stack is still "alive" from NMT perspective. 
Once thread exits, the virtual memory for the > > > > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > > > > post_run() method. > > > > > > > > > src/hotspot/share/runtime/thread.hpp line 762: > > > > 760: public: > > > > 761: // Stack overflow support > > > > 762: address stack_base() const { return _stack_base; } > > > > > > > > > Why did you remove the assertion? We want the assertion in general to ensure there are no improper uses of stack_base(). > > > > > > We now reset NonJavaThread's stack base and size, so _stack_base == NULL is possible, e.g. when generating hs_err > > report, as we should now see empty stack for "G1 Main Marker" thread in original bug report. > > Resetting the stack base and size seems pointless given we follow that > immediately with > > Thread::clear_thread_current(); > > And once there is no thread-current for this thread its base and size > can never be queried, so I don't think this change, or the change to the > assertion in stack_base() is needed. > > > You do have a valid point for the assertion. I think following assertion is more precise: > > assert(_stack_base != NULL || Thread::current() != NULL, "Sanity check"); > > What you think? > > That assertion doesn't make sense to me as the stack_base() being > queried doesn't have to belong to the current thread. In regular case, yes. But there is an exception, during error reporting (Threads::print_on_error()). 
In the original bug report, "G1 Main Marker" thread already exited and should not have stack, but hs_err file shows: 0x000000a47fdee080 ConcurrentGCThread "G1 Main Marker" [stack: 0x000000a41cf40000,0x000000a41d040000] [id=17068] Thanks, -Zhengyu > > Thanks, > David > ----- ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From iklam at openjdk.java.net Thu Sep 17 01:26:32 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 01:26:32 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> On Thu, 17 Sep 2020 00:46:33 GMT, Zhengyu Gu wrote: > > > > This code in > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > > > > still looks buggy to me: ``` > > > > if (reserved_rgn->same_region(base_addr, size)) { > > > > reserved_rgn->set_call_stack(stack); > > > > reserved_rgn->set_flag(flag); > > > > return true; > > > > } else { > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > > > > > > > // Overlapped reservation. > > > > // It can happen when the regions are thread stacks, as JNI > > > > // thread does not detach from VM before exits, and leads to > > > > // leak JavaThread object > > > > if (reserved_rgn->flag() == mtThreadStack) { > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > > > > // Overwrite with new region > > > > ``` > > > > > > > > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? 
> > >
> > >
> > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() ->
> > > pd_reserve() -> os::reserve()
> > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the
> > > > exact size, I think we will get the same assert as in this bug report.
> > >
> > >
> > > Right, there is possibility. We can put the same guarantee there.
> >
> >
> > I think the order of the checks should be changed:
> > ```
> > if (reserved_rgn->overlap_region(base_addr, size)) {
> >   check for overlapped reservation
> > }
> > if (reserved_rgn->same_region(base_addr, size)) {
> >   check for recursive reservation
> > ```
> >
> >
> > In fact, if the code had been written this way, this bug would not have happened.
>
> Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go.

I am talking about the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921)

    assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region [0x000000a41cf40000-0x000000a41d040000), 3->23.

We get there because we have an old stack region that exactly matches a newly reserved region. This case should also be handled by this code

    // Overlapped reservation.
    // It can happen when the regions are thread stacks, as JNI
    // thread does not detach from VM before exits, and leads to

The current code assumes that if you have an exact match, it must be from a recursive reservation. This is wrong.
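
To make the exact-match case concrete, here is a minimal standalone sketch. It assumes plain half-open interval semantics; the `Region` struct and its two predicates are illustrative stand-ins and are not the HotSpot NMT code. It only shows that a request identical to a tracked region satisfies both predicates, so whichever test runs first decides which branch sees a reused thread stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a tracked reservation covering [base, base + size).
// This is an assumption made for illustration, not the HotSpot class.
struct Region {
  std::uint64_t base;
  std::size_t   size;

  std::uint64_t end() const { return base + size; }

  // Exact match: same start address and same length.
  bool same_region(std::uint64_t addr, std::size_t sz) const {
    return addr == base && sz == size;
  }

  // True when [addr, addr + sz) and [base, end()) share at least one byte.
  bool overlap_region(std::uint64_t addr, std::size_t sz) const {
    assert(sz > 0 && size > 0);
    return addr < end() && base < addr + sz;
  }
};
```

With the address range from the assert message, an identical request passes both `same_region` and `overlap_region`, while a shorter request at the same base passes only `overlap_region`.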
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Thu Sep 17 01:36:27 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 01:36:27 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> Message-ID: On Thu, 17 Sep 2020 01:23:32 GMT, Ioi Lam wrote: > > > > > This code in > > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > > > > > still looks buggy to me: ``` > > > > > if (reserved_rgn->same_region(base_addr, size)) { > > > > > reserved_rgn->set_call_stack(stack); > > > > > reserved_rgn->set_flag(flag); > > > > > return true; > > > > > } else { > > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > > > > > > > > > // Overlapped reservation. > > > > > // It can happen when the regions are thread stacks, as JNI > > > > > // thread does not detach from VM before exits, and leads to > > > > > // leak JavaThread object > > > > > if (reserved_rgn->flag() == mtThreadStack) { > > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > > > > > // Overwrite with new region > > > > > ``` > > > > > > > > > > > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? 
> > > > > > > > > > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> > > > > pd_reserve() -> os::reserve() > > > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the > > > > > exact size, I think we will get the same assert as in this bug report. > > > > > > > > > > > > Right, there is possibility. We can put the same guarantee there. > > > > > > > > > I think the order of the checks should be changed: > > > ``` > > > if (reserved_rgn->overlap_region(base_addr, size)) { > > > check for overlapped reservation > > > } > > > if (reserved_rgn->same_region(base_addr, size)) { > > > check for recursive reservation > > > ``` > > > > > > > > > In fact, if the code had been written this way, this bug would not have happened. > > > > > > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. > > I am talking the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921) > > ``` > assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region > [0x000000a41cf40000-0x000000a41d040000), 3->23. ``` > > We get there because we have a old stack region which matches exactly as a newly reserved region. This case should also > be handled by this code > ``` > // Overlapped reservation. > // It can happen when the regions are thread stacks, as JNI > // thread does not detach from VM before exits, and leads to > ``` > > The current code assumes that if you have an exact match, it must be from a recursive reveration. This is wrong. In that case, it just hides the real problem. Because the thread stack region can be remapped to any other types and maybe different size. 
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From iklam at openjdk.java.net Thu Sep 17 03:41:03 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 03:41:03 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> Message-ID: On Thu, 17 Sep 2020 01:34:05 GMT, Zhengyu Gu wrote: > > > > > > This code in > > > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) > > > > > > still looks buggy to me: ``` > > > > > > if (reserved_rgn->same_region(base_addr, size)) { > > > > > > reserved_rgn->set_call_stack(stack); > > > > > > reserved_rgn->set_flag(flag); > > > > > > return true; > > > > > > } else { > > > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > > > > > > > > > > > // Overlapped reservation. > > > > > > // It can happen when the regions are thread stacks, as JNI > > > > > > // thread does not detach from VM before exits, and leads to > > > > > > // leak JavaThread object > > > > > > if (reserved_rgn->flag() == mtThreadStack) { > > > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > > > > > > // Overwrite with new region > > > > > > ``` > > > > > > > > > > > > > > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? 
> > > > > > > > > > > > > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> > > > > > pd_reserve() -> os::reserve() > > > > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the > > > > > > exact size, I think we will get the same assert as in this bug report. > > > > > > > > > > > > > > > Right, there is possibility. We can put the same guarantee there. > > > > > > > > > > > > I think the order of the checks should be changed: > > > > ``` > > > > if (reserved_rgn->overlap_region(base_addr, size)) { > > > > check for overlapped reservation > > > > } > > > > if (reserved_rgn->same_region(base_addr, size)) { > > > > check for recursive reservation > > > > ``` > > > > > > > > > > > > In fact, if the code had been written this way, this bug would not have happened. > > > > > > > > > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. > > > > > > I am talking the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921) > > ``` > > assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region > > [0x000000a41cf40000-0x000000a41d040000), 3->23. ``` > > > > > > We get there because we have a old stack region which matches exactly as a newly reserved region. This case should also > > be handled by this code ``` > > // Overlapped reservation. > > // It can happen when the regions are thread stacks, as JNI > > // thread does not detach from VM before exits, and leads to > > ``` > > > > > > The current code assumes that if you have an exact match, it must be from a recursive reveration. This is wrong. > > In that case, it just hides the real problem. Because the thread stack region can be remapped to any other types and > maybe different size. Let me clarify. 
I think your fix is good, but it only fixes part of the problems found by this bug. I am requesting that the following should **also** be fixed, so that the behavior is the same whether the new region partially overlaps with an old stack, or is exactly the same size as an old stack.

Here's my proposed additional fix. I think your fix alone will not solve the problem where a JNI-attached thread was not properly detached before the thread exits.

    if (reserved_rgn == NULL) {
      VirtualMemorySummary::record_reserved_memory(size, flag);
      return _reserved_regions->add(rgn) != NULL;
    } else {
      if (reserved_rgn->overlap_region(base_addr, size) && reserved_rgn->flag() == mtThreadStack) {
        // Overlapped reservation.
        // It can happen when the regions are thread stacks, as JNI
        // thread does not detach from VM before exits, and leads to
        // leak JavaThread object
        guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached");
        // Overwrite with new region

        // Release old region
        VirtualMemorySummary::record_uncommitted_memory(reserved_rgn->committed_size(), reserved_rgn->flag());
        VirtualMemorySummary::record_released_memory(reserved_rgn->size(), reserved_rgn->flag());

        // Add new region
        VirtualMemorySummary::record_reserved_memory(rgn.size(), flag);

        *reserved_rgn = rgn;
        return true;
      } else if (reserved_rgn->same_region(base_addr, size)) {
        reserved_rgn->set_call_stack(stack);
        reserved_rgn->set_flag(flag);
        return true;
      } else {
        assert(reserved_rgn->overlap_region(base_addr, size), "Must be");

        // CDS mapping region.
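
The decision order in the proposal can be modelled in isolation. The sketch below is only an illustration under stated assumptions: `MemType`, `Action`, `TrackedRegion` and `classify` are hypothetical names invented here, and none of the NMT bookkeeping (summary counters, call stacks, the `CheckJNICalls` guarantee) is reproduced. It shows the one property being argued for: a stale thread stack is replaced whether the new request matches it exactly or only partially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical memory-type tags; stand-ins for NMT flags such as mtThreadStack.
enum class MemType { ThreadStack, Other };

// Hypothetical outcomes of registering a new reservation.
enum class Action { AddNew, ReplaceStaleStack, UpdateInPlace, HandleOtherOverlap };

// Simplified tracked region covering [base, base + size).
struct TrackedRegion {
  std::uint64_t base;
  std::size_t   size;
  MemType       type;

  bool same_region(std::uint64_t addr, std::size_t sz) const {
    return addr == base && sz == size;
  }

  bool overlap_region(std::uint64_t addr, std::size_t sz) const {
    assert(sz > 0 && size > 0);
    return addr < base + size && base < addr + sz;
  }
};

// 'existing' is the tracked region intersecting the request, or nullptr if none does.
Action classify(const TrackedRegion* existing, std::uint64_t addr, std::size_t sz) {
  if (existing == nullptr) {
    return Action::AddNew;
  }
  // Test for a stale thread stack first, so that exact and partial
  // matches against an old stack take the same path.
  if (existing->type == MemType::ThreadStack && existing->overlap_region(addr, sz)) {
    return Action::ReplaceStaleStack;
  }
  // Only a non-stack exact match is treated as a recursive reservation.
  if (existing->same_region(addr, sz)) {
    return Action::UpdateInPlace;
  }
  return Action::HandleOtherOverlap;
}
```

Because the thread-stack test comes first, the exact-match branch can no longer absorb a reused stack region, which is the case that triggered the original assert.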
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From david.holmes at oracle.com Thu Sep 17 03:45:31 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Sep 2020 13:45:31 +1000 Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On 17/09/2020 11:18 am, Zhengyu Gu wrote: > On Thu, 17 Sep 2020 00:46:33 GMT, Zhengyu Gu wrote: >>>> src/hotspot/share/runtime/thread.hpp line 762: >>>>> 760: public: >>>>> 761: // Stack overflow support >>>>> 762: address stack_base() const { return _stack_base; } >>>> >>>> >>>> Why did you remove the assertion? We want the assertion in general to ensure there are no improper uses of stack_base(). >>> >>> >>> We now reset NonJavaThread's stack base and size, so _stack_base == NULL is possible, e.g. when generating hs_err >>> report, as we should now see empty stack for "G1 Main Marker" thread in original bug report. >> >> Resetting the stack base and size seems pointless given we follow that >> immediately with >> >> Thread::clear_thread_current(); >> >> And once there is no thread-current for this thread its base and size >> can never be queried, so I don't think this change, or the change to the >> assertion in stack_base() is needed. >> >>> You do have a valid point for the assertion. I think following assertion is more precise: >>> assert(_stack_base != NULL || Thread::current() != NULL, "Sanity check"); >>> What you think? >> >> That assertion doesn't make sense to me as the stack_base() being >> queried doesn't have to belong to the current thread. > > In regular case, yes. But there is an exception, during error reporting (Threads::print_on_error()). 
> In the original bug report, "G1 Main Marker" thread already exited and should not have stack, but hs_err file shows: > > 0x000000a47fdee080 ConcurrentGCThread "G1 Main Marker" [stack: 0x000000a41cf40000,0x000000a41d040000] [id=17068] In the original bug report the crash is in the VMThread! Subsequent snippets from Ioi show something happening with a G1 thread but I can't see the actual hs_err files for those. But if hs_err shows a stack it means that _thread was non-NULL, which means that Thread::current() was non-NULL, which means we haven't executed the Thread::clear_thread_current() nor cleared the stack_base or size. The only time there could be a problem is if we were to crash in Thread::clear_thread_current() itself - which is unlikely enough that I am not concerned about that one extreme case versus the change being made. David ----- > Thanks, > > -Zhengyu > > >> >> Thanks, >> David >> ----- > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/185 > From david.holmes at oracle.com Thu Sep 17 04:13:58 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Sep 2020 14:13:58 +1000 Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> Message-ID: On 17/09/2020 1:45 pm, David Holmes wrote: > On 17/09/2020 11:18 am, Zhengyu Gu wrote: >> On Thu, 17 Sep 2020 00:46:33 GMT, Zhengyu Gu wrote: >>>>> src/hotspot/share/runtime/thread.hpp line 762: >>>>>> 760:? public: >>>>>> 761:?? // Stack overflow support >>>>>> 762:?? address stack_base() const?????????? { return _stack_base; } >>>>> >>>>> >>>>> Why did you remove the assertion? We want the assertion in general >>>>> to ensure there are no improper uses of stack_base(). >>>> >>>> >>>> We now reset NonJavaThread's stack base and size,? 
so _stack_base == >>>> NULL is possible, e.g. when generating hs_err >>>> report, as we should now see empty stack for "G1 Main Marker" thread >>>> in original bug report. >>> >>> Resetting the stack base and size seems pointless given we follow that >>> immediately with >>> >>> Thread::clear_thread_current(); >>> >>> And once there is no thread-current for this thread its base and size >>> can never be queried, so I don't think this change, or the change to the >>> assertion in stack_base() is needed. >>> >>>> You do have a valid point for the assertion. I think following >>>> assertion is more precise: >>>> assert(_stack_base != NULL || Thread::current() != NULL, "Sanity >>>> check"); >>>> What you think? >>> >>> That assertion doesn't make sense to me as the stack_base() being >>> queried doesn't have to belong to the current thread. >> >> In regular case, yes. But there is an exception, during error >> reporting (Threads::print_on_error()). >> In the original bug report, "G1 Main Marker" thread already exited and >> should not have stack, but hs_err file shows: >> >> 0x000000a47fdee080 ConcurrentGCThread "G1 Main Marker" [stack: >> 0x000000a41cf40000,0x000000a41d040000] [id=17068] > > In the original bug report the crash is in the VMThread! Subsequent > snippets from Ioi show something happening with a G1 thread but I can't > see the actual hs_err files for those. > > But if hs_err shows a stack it means that _thread was non-NULL, which > means that Thread::current() was non-NULL, which means we haven't > executed the Thread::clear_thread_current() nor cleared the stack_base > or size. The only time there could be a problem is if we were to crash > in Thread::clear_thread_current() itself - which is unlikely enough that > I am not concerned about that one extreme case versus the change being > made. Okay I see the problem now - sorry for the confusion. The issue is when printing the other threads. 
And that means the issue is in the GC thread management code as if the GC worker has terminated it should have already removed itself from the set of worker threads. Though as this is a crash, and we are not at a safepoint, then there could still be a race there. :( But note that if we actually deleted the thread we could still crash accessing it at that point. So this just seems to be a rare race condition that we are unlikely to hit - and if we did then the secondary assertion failure would not be too bad IMO. But the modified assertion still makes no sense as Thread::current() is not relevant AFAICS. I would prefer to keep the assertion to guard the regular case of accidental misuse of a thread before the stack_base() has been set. David ----- > > David > ----- > > >> Thanks, >> >> -Zhengyu >> >> >>> >>> Thanks, >>> David >>> ----- >> >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/185 >> From kbarrett at openjdk.java.net Thu Sep 17 04:15:19 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Sep 2020 04:15:19 GMT Subject: RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v6] In-Reply-To: References: Message-ID: On Wed, 16 Sep 2020 16:13:23 GMT, Daniel D. Daugherty wrote: >> This RFE is to migrate the following field to OopStorage: >> >> class ObjectMonitor { >> >> void* volatile _object; // backward object pointer - strong root >> >> Unlike the previous patches in this series, there are a lot of collateral >> changes so this is not a trivial review. Sorry for the tedious parts of >> the review. Since Erik and I are both contributors to this patch, we >> would like at least 1 GC team reviewer and 1 Runtime team reviewer. >> >> This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing >> along with JDK-8252980 and JDK-8252981. I also ran it through my >> inflation stress kit for 48 hours on my Linux-X64 machine. > > Daniel D. 
Daugherty has updated the pull request incrementally with one additional commit since the last revision:
>
> Minor changes/deletions to address kimbarrett and sspitsyn CR comments.

Marked as reviewed by kbarrett (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/135

From shade at openjdk.java.net Thu Sep 17 11:42:49 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 17 Sep 2020 11:42:49 GMT
Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect
Message-ID: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>

There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp` defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by default. The code also needs some rearrangement to make the mappings clear.

-------------

Commit messages:
 - Mirror the same in bsd_zero.hpp
 - 8253284: Zero OrderAccess barrier mappings are incorrect

Changes: https://git.openjdk.java.net/jdk/pull/224/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=224&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8253284
  Stats: 41 lines in 2 files changed: 14 ins; 16 del; 11 mod
  Patch: https://git.openjdk.java.net/jdk/pull/224.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/224/head:pull/224

PR: https://git.openjdk.java.net/jdk/pull/224

From rehn at openjdk.java.net Thu Sep 17 12:07:15 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Thu, 17 Sep 2020 12:07:15 GMT
Subject: RFR: 8238761: Asynchronous handshakes [v2]
In-Reply-To:
References:
Message-ID:

> This patch implements asynchronous handshake, which changes how handshakes work by default. Asynchronous handshakes
> are executed only by the target, which means they may never be executed. (target may block on socket for the rest of VM lifetime)
> Since we have several use-cases for them we can have many handshakes pending.
(should be very rare) To be able handle an > arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first > pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first > pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake > all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much > worse then this. > > - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a > handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. > > - Added WB testing method. > > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. > > - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. 
> > - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. > > - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. > > - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the > moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to the NoSafepointVerifyer. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviwing. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Fixed double checks Added NSV ProcessResult to enum Fixed logging Moved _active_handshaker to private ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/efbee6f0..86b83d05 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=00-01 Stats: 72 lines in 4 files changed: 32 ins; 15 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Thu Sep 17 12:12:28 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 17 Sep 2020 12:12:28 GMT Subject: RFR: 8238761: Asynchronous handshakes In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 13:00:59 GMT, Robbin Ehn wrote: > This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes > are target only executed, which they may never be executed. 
(target may block on socket for the rest of VM lifetime) > Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an > arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first > pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first > pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake > all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much > worse then this. > > - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a > handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. > > - Added WB testing method. > > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. 
> > - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. > > - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. > > - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. > > - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the > moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to the NoSafepointVerifyer. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviwing. > src/hotspot/share/prims/jvmtiThreadState.cpp > src/hotspot/share/prims/jvmtiEnvBase.cpp Removed double checks. > src/hotspot/share/runtime/handshake.cpp > ... > I would just leave ProcessResult as an enum and log as before. Reverted to plain enum and updated logs. (better?) > src/hotspot/share/runtime/handshake.cpp > 387: NoSafepointVerifier nsv; > 388: process_self_inner(); I wanted a NSV to cover the process_self_inner method. So I added a second one in suggested place: > NoSafepointVerifier nsv; > _handshake_cl->do_thread(thread); > } > src/hotspot/share/runtime/interfaceSupport.inline.hpp > 156: // Threads shouldn't block if they are in the middle of > printing, but... > 157: ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); > What's the issue of having NoSafepointVerifier inside the handshake? Sorry, the issue is the lock rank. Right now the semaphore hides this issue. Please see commit 86b83d0. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/151

From dholmes at openjdk.java.net Thu Sep 17 12:22:00 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 17 Sep 2020 12:22:00 GMT
Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect
In-Reply-To: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>
References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>
Message-ID:

On Thu, 17 Sep 2020 11:35:50 GMT, Aleksey Shipilev wrote:

> There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp`
> defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by
> default. The code also needs some rearrangement to make the mappings clear.

Looks reasonable other than the ifdef ARM issue.

src/hotspot/os_cpu/bsd_zero/orderAccess_bsd_zero.hpp line 31:

> 29: // Included in orderAccess.hpp header file.
> 30:
> 31: #ifdef ARM

I assume this should have been changed to #if defined(ARM) ??

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/224

From shade at openjdk.java.net Thu Sep 17 12:26:44 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 17 Sep 2020 12:26:44 GMT
Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2]
In-Reply-To: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>
References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>
Message-ID:

> There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp`
> defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by
> default. The code also needs some rearrangement to make the mappings clear.
Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Fix copy-paste omission in bsd_zero

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/224/files
  - new: https://git.openjdk.java.net/jdk/pull/224/files/012a59a3..add14aa2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=224&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=224&range=00-01

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/224.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/224/head:pull/224

PR: https://git.openjdk.java.net/jdk/pull/224

From shade at openjdk.java.net Thu Sep 17 12:26:45 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 17 Sep 2020 12:26:45 GMT
Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2]
In-Reply-To:
References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com>
Message-ID:

On Thu, 17 Sep 2020 12:14:54 GMT, David Holmes wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix copy-paste omission in bsd_zero
>
> src/hotspot/os_cpu/bsd_zero/orderAccess_bsd_zero.hpp line 31:
>
>> 29: // Included in orderAccess.hpp header file.
>> 30:
>> 31: #ifdef ARM
>
> I assume this should have been changed to #if defined(ARM) ??

Dang! Right, copy-paste error. Fixed.
------------- PR: https://git.openjdk.java.net/jdk/pull/224 From zgu at openjdk.java.net Thu Sep 17 12:50:20 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 12:50:20 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v3] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> Message-ID: On Thu, 17 Sep 2020 03:38:41 GMT, Ioi Lam wrote: >>> > > > > This code in >>> > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) >>> > > > > still looks buggy to me: ``` >>> > > > > if (reserved_rgn->same_region(base_addr, size)) { >>> > > > > reserved_rgn->set_call_stack(stack); >>> > > > > reserved_rgn->set_flag(flag); >>> > > > > return true; >>> > > > > } else { >>> > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >>> > > > > >>> > > > > // Overlapped reservation. >>> > > > > // It can happen when the regions are thread stacks, as JNI >>> > > > > // thread does not detach from VM before exits, and leads to >>> > > > > // leak JavaThread object >>> > > > > if (reserved_rgn->flag() == mtThreadStack) { >>> > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >>> > > > > // Overwrite with new region >>> > > > > ``` >>> > > > > >>> > > > > >>> > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? 
>>> > > > >>> > > > >>> > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> >>> > > > pd_reserve() -> os::reserve() >>> > > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the >>> > > > > exact size, I think we will get the same assert as in this bug report. >>> > > > >>> > > > >>> > > > Right, there is possibility. We can put the same guarantee there. >>> > > >>> > > >>> > > I think the order of the checks should be changed: >>> > > ``` >>> > > if (reserved_rgn->overlap_region(base_addr, size)) { >>> > > check for overlapped reservation >>> > > } >>> > > if (reserved_rgn->same_region(base_addr, size)) { >>> > > check for recursive reservation >>> > > ``` >>> > > >>> > > >>> > > In fact, if the code had been written this way, this bug would not have happened. >>> > >>> > >>> > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. >>> >>> I am talking the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921) >>> >>> ``` >>> assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region >>> [0x000000a41cf40000-0x000000a41d040000), 3->23. ``` >>> >>> We get there because we have a old stack region which matches exactly as a newly reserved region. This case should also >>> be handled by this code >>> ``` >>> // Overlapped reservation. >>> // It can happen when the regions are thread stacks, as JNI >>> // thread does not detach from VM before exits, and leads to >>> ``` >>> >>> The current code assumes that if you have an exact match, it must be from a recursive reveration. This is wrong. >> >> In that case, it just hides the real problem. Because the thread stack region can be remapped to any other types and >> maybe different size. 
> >> > > > > > This code in >> > > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) >> > > > > > still looks buggy to me: ``` >> > > > > > if (reserved_rgn->same_region(base_addr, size)) { >> > > > > > reserved_rgn->set_call_stack(stack); >> > > > > > reserved_rgn->set_flag(flag); >> > > > > > return true; >> > > > > > } else { >> > > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >> > > > > > >> > > > > > // Overlapped reservation. >> > > > > > // It can happen when the regions are thread stacks, as JNI >> > > > > > // thread does not detach from VM before exits, and leads to >> > > > > > // leak JavaThread object >> > > > > > if (reserved_rgn->flag() == mtThreadStack) { >> > > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >> > > > > > // Overwrite with new region >> > > > > > ``` >> > > > > > >> > > > > > >> > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? >> > > > > >> > > > > >> > > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> >> > > > > pd_reserve() -> os::reserve() >> > > > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the >> > > > > > exact size, I think we will get the same assert as in this bug report. >> > > > > >> > > > > >> > > > > Right, there is possibility. We can put the same guarantee there. 
>> > > > >> > > > >> > > > I think the order of the checks should be changed: >> > > > ``` >> > > > if (reserved_rgn->overlap_region(base_addr, size)) { >> > > > check for overlapped reservation >> > > > } >> > > > if (reserved_rgn->same_region(base_addr, size)) { >> > > > check for recursive reservation >> > > > ``` >> > > > >> > > > >> > > > In fact, if the code had been written this way, this bug would not have happened. >> > > >> > > >> > > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. >> > >> > >> > I am talking the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921) >> > ``` >> > assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region >> > [0x000000a41cf40000-0x000000a41d040000), 3->23. ``` >> > >> > >> > We get there because we have a old stack region which matches exactly as a newly reserved region. This case should also >> > be handled by this code ``` >> > // Overlapped reservation. >> > // It can happen when the regions are thread stacks, as JNI >> > // thread does not detach from VM before exits, and leads to >> > ``` >> > >> > >> > The current code assumes that if you have an exact match, it must be from a recursive reveration. This is wrong. >> >> In that case, it just hides the real problem. Because the thread stack region can be remapped to any other types and >> maybe different size. > > Let me clarify. I think your fix is good, but it only fixes part of the problems found by this bug. > > I am requesting that the following should **also** be fixed, so that the behavior is the same whether the new region > partially overlaps with an old stack, or is exactly the same size as an old stack. > Here's my proposed additional fix. I think your fix alone will not solve the problem where a JNI-attached thread was > not properly detached before the thread exits. 
> if (reserved_rgn == NULL) { > VirtualMemorySummary::record_reserved_memory(size, flag); > return _reserved_regions->add(rgn) != NULL; > } else { > if (reserved_rgn->overlap_region(base_addr, size) && reserved_rgn->flag() == mtThreadStack) { > // Overlapped reservation. > // It can happen when the regions are thread stacks, as JNI > // thread does not detach from VM before exits, and leads to > // leak JavaThread object > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); > // Overwrite with new region > > // Release old region > VirtualMemorySummary::record_uncommitted_memory(reserved_rgn->committed_size(), reserved_rgn->flag()); > VirtualMemorySummary::record_released_memory(reserved_rgn->size(), reserved_rgn->flag()); > > // Add new region > VirtualMemorySummary::record_reserved_memory(rgn.size(), flag); > > *reserved_rgn = rgn; > return true; > } else if (reserved_rgn->same_region(base_addr, size)) { > reserved_rgn->set_call_stack(stack); > reserved_rgn->set_flag(flag); > return true; > } else { > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); > > // CDS mapping region. > Okay I see the problem now - sorry for the confusion. The issue is when > printing the other threads. And that means the issue is in the GC thread > management code as if the GC worker has terminated it should have > already removed itself from the set of worker threads. Though as this is > a crash, and we are not at a safepoint, then there could still be a race > there. :( But note that if we actually deleted the thread we could still > crash accessing it at that point. So this just seems to be a rare race > condition that we are unlikely to hit - and if we did then the secondary > assertion failure would not be too bad IMO. > > But the modified assertion still makes no sense as Thread::current() is > not relevant AFAICS. 
I would prefer to keep the assertion to guard the > regular case of accidental misuse of a thread before the stack_base() > has been set. > The reported crash actually is at a safepoint, but it is irrelevant. The problem is that 'Thread' object can outlive thread itself. "G1 Main Marker" thread is not a worker, but a ConcurrentGCThread, a member of G1CollectedHeap. G1ConcurrentMarkThread* _cm_thread; and during error report Universe::heap()->gc_threads_do(&print_closure); For example, G1 always tries to report its internal states:

void G1CollectedHeap::gc_threads_do(ThreadClosure* tc) const {
  workers()->threads_do(tc);
  tc->do_thread(_cm_thread);   // <==== the "G1 Main Marker" thread
  _cm->threads_do(tc);
  _cr->threads_do(tc);
  tc->do_thread(_service_thread);
  if (G1StringDedup::is_enabled()) {
    G1StringDedup::threads_do(tc);
  }
}
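
To make the point concrete, here is a minimal, self-contained sketch of a thread object that outlives the thread it describes and is still visited by a closure walk. Everything in it (FakeThread, FakeHeap, PrintClosure, report_after_thread_exit) is a hypothetical stand-in invented for illustration, not HotSpot code, and the thread lifetime is simulated with a plain function call rather than a real OS thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a VM thread object; NOT HotSpot's Thread class.
struct FakeThread {
  std::string name;
  const void* stack_base;  // only meaningful while the thread body is running
  bool exited;

  explicit FakeThread(const std::string& n)
      : name(n), stack_base(nullptr), exited(false) {}

  // Simulates the whole lifetime of the thread: the stack base is recorded
  // on entry and cleared again when the body finishes.
  void run_to_completion() {
    assert(!exited);
    int marker = 0;
    stack_base = &marker;
    // ... the thread body would run here ...
    stack_base = nullptr;
    exited = true;
  }
};

// Hypothetical closure interface, loosely modelled on the idea of a
// per-thread visitor.
struct ThreadClosure {
  virtual ~ThreadClosure() {}
  virtual void do_thread(FakeThread* t) = 0;
};

// A reporting closure that tolerates threads without a registered stack
// instead of asserting on them.
struct PrintClosure : public ThreadClosure {
  std::vector<std::string> lines;
  void do_thread(FakeThread* t) override {
    if (t->stack_base == nullptr) {
      lines.push_back(t->name + ": no stack");
    } else {
      lines.push_back(t->name + ": stack registered");
    }
  }
};

// Hypothetical owner that keeps the thread object alive as a member, so the
// object is still reachable after the thread itself has finished.
struct FakeHeap {
  FakeThread cm_thread;
  FakeHeap() : cm_thread("Main Marker") {}
  void gc_threads_do(ThreadClosure* tc) { tc->do_thread(&cm_thread); }
};

// Runs the thread to completion, then walks the owner's threads the way an
// error reporter might.
std::vector<std::string> report_after_thread_exit() {
  FakeHeap heap;
  heap.cm_thread.run_to_completion();
  PrintClosure pc;
  heap.gc_threads_do(&pc);
  return pc.lines;
}
```

In this sketch the walk still reaches the object after its body has finished, so a reporter that insists on a valid stack base for every visited object would trip on it, while one that treats the query as best effort does not.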
So it is more likely to hit the secondary assertion than you described.

The modified assertion only suggests that you may query thread states from another thread, in which case the result is best effort, and it keeps the semantics unchanged when the query comes from the 'current' thread.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/185

From rrich at openjdk.java.net Thu Sep 17 14:06:17 2020
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 17 Sep 2020 14:06:17 GMT
Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check()
Message-ID: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com>

After JDK-8252414 the safepoint/handshake code no longer takes _suspend_flags into account when assessing whether a thread is safepoint/handshake safe. This change updates the comment on JavaThread::java_suspend_self_with_safepoint_check().

I have not (yet) fixed the line breaks (fill-paragraph in emacs lingo), to keep the diff clearer. I could also inline the (*) footnote.
-------------

Commit messages:
 - 8253241: Update comment on java_suspend_self_with_safepoint_check()

Changes: https://git.openjdk.java.net/jdk/pull/225/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=225&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8253241
  Stats: 8 lines in 1 file changed: 2 ins; 0 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/225.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/225/head:pull/225

PR: https://git.openjdk.java.net/jdk/pull/225

From zgu at openjdk.java.net Thu Sep 17 15:50:43 2020
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 17 Sep 2020 15:50:43 GMT
Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v4]
In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
Message-ID:

> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor
> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues
> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
> post_run() method.
Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Tighten recusive reserveration handling ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/185/files - new: https://git.openjdk.java.net/jdk/pull/185/files/a7113def..c49380ef Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=02-03 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/185.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185 PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Thu Sep 17 15:54:56 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 17 Sep 2020 15:54:56 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v4] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> <8vH-M5TZ4AIUKuZiJSqhOlr0FPHGqy9IineP7_y4YQk=.e3d58e0b-49bd-451d-8eb2-1847b630fa90@github.com> <-Pr3A8mRFkpUTWPHoGAq2WPsi9snBkJKhRgxqp07Vmk=.9e009eda-799b-4fa8-a2c8-66363451a9eb@github.com> Message-ID: On Thu, 17 Sep 2020 12:47:13 GMT, Zhengyu Gu wrote: >>> > > > > > This code in >>> > > > > > [virtualMemoryTracker.cpp](https://github.com/openjdk/jdk/blob/9a7dcdcdbad34e061a8988287fe691abfd4df305/src/hotspot/share/services/virtualMemoryTracker.cpp#L350) >>> > > > > > still looks buggy to me: ``` >>> > > > > > if (reserved_rgn->same_region(base_addr, size)) { >>> > > > > > reserved_rgn->set_call_stack(stack); >>> > > > > > reserved_rgn->set_flag(flag); >>> > > > > > return true; >>> > > > > > } else { >>> > > > > > assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >>> > > > > > >>> > > > > > // Overlapped reservation. 
>>> > > > > > // It can happen when the regions are thread stacks, as JNI >>> > > > > > // thread does not detach from VM before exits, and leads to >>> > > > > > // leak JavaThread object >>> > > > > > if (reserved_rgn->flag() == mtThreadStack) { >>> > > > > > guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >>> > > > > > // Overwrite with new region >>> > > > > > ``` >>> > > > > > >>> > > > > > >>> > > > > > Why is the "Overlapped reservation" fix not done when the two regions are exactly the same? >>> > > > > >>> > > > > >>> > > > > The same region branch was to deal with different scenario: recursive reservation, where os::reserve() -> >>> > > > > pd_reserve() -> os::reserve() >>> > > > > > If a JNI thread has exited without detaching, and its stack happens to be picked for a future memory reservation of the >>> > > > > > exact size, I think we will get the same assert as in this bug report. >>> > > > > >>> > > > > >>> > > > > Right, there is possibility. We can put the same guarantee there. >>> > > > >>> > > > >>> > > > I think the order of the checks should be changed: >>> > > > ``` >>> > > > if (reserved_rgn->overlap_region(base_addr, size)) { >>> > > > check for overlapped reservation >>> > > > } >>> > > > if (reserved_rgn->same_region(base_addr, size)) { >>> > > > check for recursive reservation >>> > > > ``` >>> > > > >>> > > > >>> > > > In fact, if the code had been written this way, this bug would not have happened. >>> > > >>> > > >>> > > Well, I don't know which bug you refer to, but overlap_region() covers same_region(), so reversing order is a no go. >>> > >>> > >>> > I am talking the original assert reported by [JDK-8252921](https://bugs.openjdk.java.net/browse/JDK-8252921) >>> > ``` >>> > assert((flag() == mtNone || flag() == f)) failed: Overwrite memory type for region >>> > [0x000000a41cf40000-0x000000a41d040000), 3->23. 
``` >>> > >>> > >>> > We get there because we have a old stack region which matches exactly as a newly reserved region. This case should also >>> > be handled by this code ``` >>> > // Overlapped reservation. >>> > // It can happen when the regions are thread stacks, as JNI >>> > // thread does not detach from VM before exits, and leads to >>> > ``` >>> > >>> > >>> > The current code assumes that if you have an exact match, it must be from a recursive reveration. This is wrong. >>> >>> In that case, it just hides the real problem. Because the thread stack region can be remapped to any other types and >>> maybe different size. >> >> Let me clarify. I think your fix is good, but it only fixes part of the problems found by this bug. >> >> I am requesting that the following should **also** be fixed, so that the behavior is the same whether the new region >> partially overlaps with an old stack, or is exactly the same size as an old stack. >> Here's my proposed additional fix. I think your fix alone will not solve the problem where a JNI-attached thread was >> not properly detached before the thread exits. >> if (reserved_rgn == NULL) { >> VirtualMemorySummary::record_reserved_memory(size, flag); >> return _reserved_regions->add(rgn) != NULL; >> } else { >> if (reserved_rgn->overlap_region(base_addr, size) && reserved_rgn->flag() == mtThreadStack) { >> // Overlapped reservation. 
>> // It can happen when the regions are thread stacks, as JNI >> // thread does not detach from VM before exits, and leads to >> // leak JavaThread object >> guarantee(!CheckJNICalls, "Attached JNI thread exited without being detached"); >> // Overwrite with new region >> >> // Release old region >> VirtualMemorySummary::record_uncommitted_memory(reserved_rgn->committed_size(), reserved_rgn->flag()); >> VirtualMemorySummary::record_released_memory(reserved_rgn->size(), reserved_rgn->flag()); >> >> // Add new region >> VirtualMemorySummary::record_reserved_memory(rgn.size(), flag); >> >> *reserved_rgn = rgn; >> return true; >> } else if (reserved_rgn->same_region(base_addr, size)) { >> reserved_rgn->set_call_stack(stack); >> reserved_rgn->set_flag(flag); >> return true; >> } else { >> assert(reserved_rgn->overlap_region(base_addr, size), "Must be"); >> >> // CDS mapping region. > >> Okay I see the problem now - sorry for the confusion. The issue is when >> printing the other threads. And that means the issue is in the GC thread >> management code as if the GC worker has terminated it should have >> already removed itself from the set of worker threads. Though as this is >> a crash, and we are not at a safepoint, then there could still be a race >> there. :( But note that if we actually deleted the thread we could still >> crash accessing it at that point. So this just seems to be a rare race >> condition that we are unlikely to hit - and if we did then the secondary >> assertion failure would not be too bad IMO. >> >> But the modified assertion still makes no sense as Thread::current() is >> not relevant AFAICS. I would prefer to keep the assertion to guard the >> regular case of accidental misuse of a thread before the stack_base() >> has been set. >> > The reported crash actually is at a safepoint, but it is irrelevant. > The problem is that 'Thread' object can outlive thread itself. 
> > "G1 Main Marker" thread is not a worker, but a ConcurrentGCThread, a member of G1CollectedHeap. > G1ConcurrentMarkThread* _cm_thread; > and during error report > Universe::heap()->gc_threads_do(&print_closure); > For example, G1 always tries to report its internal states: >

> void G1CollectedHeap::gc_threads_do(ThreadClosure* tc) const {
>   workers()->threads_do(tc);
>   tc->do_thread(_cm_thread);   // <==== the "G1 Main Marker" thread
>   _cm->threads_do(tc);
>   _cr->threads_do(tc);
>   tc->do_thread(_service_thread);
>   if (G1StringDedup::is_enabled()) {
>     G1StringDedup::threads_do(tc);
>   }
> }
> 
>
> So it is more likely to hit the secondary assertion than you described.
>
> The modified assertion only suggests that you may query thread states from other thread, the result is the best effort
> and it keeps the semantics unchanged when queries from 'current' thread.
> Thanks.

Updated to address JNI detach thread issue.

Ran tier1, vmTestbase_nsk_monitoring, vmTestbase_nsk_jdi and vmTestbase_nsk_jdwp with NMT on.

-------------

PR: https://git.openjdk.java.net/jdk/pull/185

From iklam at openjdk.java.net Thu Sep 17 17:26:26 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 17 Sep 2020 17:26:26 GMT
Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v4]
In-Reply-To:
References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
Message-ID:

On Thu, 17 Sep 2020 15:50:43 GMT, Zhengyu Gu wrote:

>> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor
>> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues
>> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the
>> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
>> post_run() method.
>
> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision:
>
> Tighten recusive reserveration handling

Looks good!

-------------

Marked as reviewed by iklam (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/185 From pchilanomate at openjdk.java.net Thu Sep 17 19:00:24 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Thu, 17 Sep 2020 19:00:24 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 12:07:15 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes >> are executed only by the target thread, which means they may never be executed. (The target may block on a socket for the rest of the VM lifetime.) >> Since we have several use cases for them, we can have many handshakes pending. (This should be very rare.) To be able to handle an >> arbitrary number of handshakes, this patch adds a per-JavaThread queue and heap-allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on the first >> pointer, no lock needed. Pops are done while holding the per-handshake-state lock, and when working on the first >> pointer they also use CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter, and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed, >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operations to be done out of order. Since the filter determines who executes the operation, and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both "all JavaThreads do" and "handshake >> all threads". Because it is heap allocated, it cannot extend StackObj. 
I tested several ways to fix this, but those were very much >> worse than this. >> >> - I added is_handshake_safe_for to check whether the current thread is operating on itself or is the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a >> handshaker on a JavaThread, e.g. the VM thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. >> >> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviewing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Fixed double checks > Added NSV > ProcessResult to enum > Fixed logging > Moved _active_handshaker to private Changes look good, thanks for fixing! I added some comments on the changes. 
src/hotspot/share/runtime/handshake.cpp line 464: > 462: > 463: const char* executioner_name(Thread* current_thread, Thread* handshakee, bool current_is_requester) { > 464: if (current_thread == handshakee) return "self(JavaThread)"; I think we can remove this line since executioner_name() is only called by the handshaker. src/hotspot/share/runtime/handshake.cpp line 508: > 506: assert(op->_target == NULL || _handshakee == op->_target, "Wrong thread"); > 507: log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s", p2i(op), > 508: executioner_name(current_thread, _handshakee, op == match_op)); With the above change we could even avoid factoring the code into executioner_name() and just do: log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s%s", p2i(op), op == match_op ? "handshaker" : "cooperative", current_thread->is_VM_thread() ? "(VM Thread)" : "(JavaThread)"); src/hotspot/share/prims/jvmtiEnvBase.cpp line 908: > 906: #endif > 907: Thread* current_thread = Thread::current(); > 908: assert(current_thread == java_thread || One extra check here. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1164: > 1162: #ifdef ASSERT > 1163: Thread *current_thread = Thread::current(); > 1164: assert(current_thread == thr || One extra check here. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From iklam at openjdk.java.net Thu Sep 17 19:08:17 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 19:08:17 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes Message-ID: We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. 
Instead, we call `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. ---- We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, but we might have to do more fixes in the future. ------------- Commit messages: - 8251261: CDS dumping should not clear states in live classes Changes: https://git.openjdk.java.net/jdk/pull/227/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=227&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8251261 Stats: 214 lines in 9 files changed: 75 ins; 128 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/227.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/227/head:pull/227 PR: https://git.openjdk.java.net/jdk/pull/227 From rehn at openjdk.java.net Thu Sep 17 19:28:30 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 17 Sep 2020 19:28:30 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <1NYgXmzZSEe4Bjew1P1JvHP-gtInrEU4FN6LzVPDpiw=.6b53dae2-2cbb-4262-ab7f-90d9f03dfa48@github.com> On Thu, 17 Sep 2020 18:28:20 GMT, Patricio Chilano Mateo wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > 
src/hotspot/share/runtime/handshake.cpp line 464: > >> 462: >> 463: const char* executioner_name(Thread* current_thread, Thread* handshakee, bool current_is_requester) { >> 464: if (current_thread == handshakee) return "self(JavaThread)"; > > I think we can remove this line since executioner_name() is only called by the handshaker. Yes, fixing. > src/hotspot/share/runtime/handshake.cpp line 508: > >> 506: assert(op->_target == NULL || _handshakee == op->_target, "Wrong thread"); >> 507: log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s", p2i(op), >> 508: executioner_name(current_thread, _handshakee, op == match_op)); > > With the above change we could even avoid factoring the code into executioner_name() and just do: > log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s%s", p2i(op), > op == match_op ? "handshaker" : "cooperative", > current_thread->is_VM_thread() ? "(VM Thread)" : "(JavaThread)"); I added a second log line where I use that function also: log_trace(handshake)("Thread %s(" INTPTR_FORMAT ") executed %d ops for JavaThread: " INTPTR_FORMAT " %s target op: " INTPTR_FORMAT, executioner_name(current_thread, _handshakee, pr_ret == HandshakeState::_succeed), p2i(current_thread), executed, p2i(_handshakee), pr_ret == HandshakeState::_succeed ? "including" : "excluding", p2i(match_op)); > src/hotspot/share/prims/jvmtiEnvBase.cpp line 908: > >> 906: #endif >> 907: Thread* current_thread = Thread::current(); >> 908: assert(current_thread == java_thread || > > One extra check here. Fixing > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1164: > >> 1162: #ifdef ASSERT >> 1163: Thread *current_thread = Thread::current(); >> 1164: assert(current_thread == thr || > > One extra check here. 
Fixing ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Thu Sep 17 19:51:25 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 17 Sep 2020 19:51:25 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: <1NYgXmzZSEe4Bjew1P1JvHP-gtInrEU4FN6LzVPDpiw=.6b53dae2-2cbb-4262-ab7f-90d9f03dfa48@github.com> References: <1NYgXmzZSEe4Bjew1P1JvHP-gtInrEU4FN6LzVPDpiw=.6b53dae2-2cbb-4262-ab7f-90d9f03dfa48@github.com> Message-ID: On Thu, 17 Sep 2020 19:23:09 GMT, Robbin Ehn wrote: >> src/hotspot/share/runtime/handshake.cpp line 508: >> >>> 506: assert(op->_target == NULL || _handshakee == op->_target, "Wrong thread"); >>> 507: log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s", p2i(op), >>> 508: executioner_name(current_thread, _handshakee, op == match_op)); >> >> With the above change we could even avoid factoring the code into executioner_name() and just do: >> log_trace(handshake)("Processing handshake " INTPTR_FORMAT " by %s%s", p2i(op), >> op == match_op ? "handshaker" : "cooperative", >> current_thread->is_VM_thread() ? "(VM Thread)" : "(JavaThread)"); > > I added a second log line where I use that function also: > log_trace(handshake)("Thread %s(" INTPTR_FORMAT ") executed %d ops for JavaThread: " INTPTR_FORMAT " %s target op: " > INTPTR_FORMAT, > executioner_name(current_thread, _handshakee, pr_ret == HandshakeState::_succeed), > p2i(current_thread), executed, p2i(_handshakee), > pr_ret == HandshakeState::_succeed ? "including" : "excluding", p2i(match_op)); Push commit 469f8fc, please have a look. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Thu Sep 17 19:51:24 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 17 Sep 2020 19:51:24 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: > This patch implements asynchronous handshake, which changes how handshakes works by default. 
Asynchronous handshakes > are executed only by the target thread, which means they may never be executed. (The target may block on a socket for the rest of the VM lifetime.) > Since we have several use cases for them, we can have many handshakes pending. (This should be very rare.) To be able to handle an > arbitrary number of handshakes, this patch adds a per-JavaThread queue and heap-allocated HandshakeOperations. It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on the first > pointer, no lock needed. Pops are done while holding the per-handshake-state lock, and when working on the first > pointer they also use CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter, and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed, > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operations to be done out of order. Since the filter determines who executes the operation, and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both "all JavaThreads do" and "handshake > all threads". Because it is heap allocated, it cannot extend StackObj. I tested several ways to fix this, but those were very much > worse than this. > > - I added is_handshake_safe_for to check whether the current thread is operating on itself or is the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a > handshaker on a JavaThread, e.g. the VM thread can execute the handshake operation. > > - Added WB testing method. 
> > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. > > - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. > > - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. > > - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. > > - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the > moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviewing. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Removed double check, fix comment, removed not needed function, updated logs ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/86b83d05..469f8fc8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=01-02 Stats: 22 lines in 2 files changed: 1 ins; 12 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From iveresov at openjdk.java.net Thu Sep 17 21:04:32 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Thu, 17 Sep 2020 21:04:32 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. 
[v4] In-Reply-To: References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: On Wed, 16 Sep 2020 09:36:28 GMT, Jamsheed Mohammed C M wrote: >> Hi >> >> Moving the review that is based on mercurial repo to github. >> The history of conversation is >> [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) >> Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) >> >> @dholmes-ora could you please have a look. > > Jamsheed Mohammed C M has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. The pull request contains one new > commit since the last revision: > comment modified wrt review feedback src/hotspot/share/prims/whitebox.cpp line 1056: > 1054: > 1055: // Compile method and check result > 1056: nmethod* nm = CompileBroker::compile_method(mh, bci, comp_level, mh, mh->invocation_count(), > CompileTask::Reason_Whitebox, CHECK_false); Shouldn't this be CHECK_NULL? The function returns a pointer. ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From iklam at openjdk.java.net Thu Sep 17 21:53:34 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 21:53:34 GMT Subject: Integrated: 8253313: xmlstream.hpp missing from vmIntrinsics.cpp Message-ID: Please review this quick/trivial fix for build breakage. 
------------- Commit messages: - 8253313: xmlstream.hpp missing from vmIntrinsics.cpp Changes: https://git.openjdk.java.net/jdk/pull/230/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=230&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253313 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/230.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/230/head:pull/230 PR: https://git.openjdk.java.net/jdk/pull/230 From mikael at openjdk.java.net Thu Sep 17 21:53:34 2020 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Thu, 17 Sep 2020 21:53:34 GMT Subject: Integrated: 8253313: xmlstream.hpp missing from vmIntrinsics.cpp In-Reply-To: References: Message-ID: <-B6lJ_7QaJ7Jnxi6Gj8aJYjeRkY0aO9rqo9hTt__6AY=.d2e5517c-63f4-4c1c-be65-a3d9f11ae048@github.com> On Thu, 17 Sep 2020 21:44:51 GMT, Ioi Lam wrote: > Please review this quick/trivial fix for build breakage. Marked as reviewed by mikael (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/230 From iklam at openjdk.java.net Thu Sep 17 21:53:35 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 21:53:35 GMT Subject: Integrated: 8253313: xmlstream.hpp missing from vmIntrinsics.cpp In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 21:44:51 GMT, Ioi Lam wrote: > Please review this quick/trivial fix for build breakage. This pull request has now been integrated. 
Changeset: 6c3e483b Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/6c3e483b Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8253313: xmlstream.hpp missing from vmIntrinsics.cpp Reviewed-by: mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/230 From iklam at openjdk.java.net Thu Sep 17 22:34:52 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 22:34:52 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp Message-ID: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Sorry, please review yet another trivial fix for build breakage. ------------- Commit messages: - 8253314: precompiled.hpp missing from vmIntrinsics.cpp Changes: https://git.openjdk.java.net/jdk/pull/231/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=231&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253314 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/231.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/231/head:pull/231 PR: https://git.openjdk.java.net/jdk/pull/231 From mikael at openjdk.java.net Thu Sep 17 22:39:20 2020 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Thu, 17 Sep 2020 22:39:20 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 22:28:32 GMT, Ioi Lam wrote: > Sorry, please review yet another trivial fix for build breakage. Marked as reviewed by mikael (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/231 From iklam at openjdk.java.net Thu Sep 17 22:42:33 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 22:42:33 GMT Subject: Integrated: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 22:28:32 GMT, Ioi Lam wrote: > Sorry, please review yet another trivial fix for build breakage. This pull request has now been integrated. Changeset: 2c3a37c6 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/2c3a37c6 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8253314: precompiled.hpp missing from vmIntrinsics.cpp Reviewed-by: mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/231 From github.com+2249648+JohnTortugo at openjdk.java.net Thu Sep 17 22:55:40 2020 From: github.com+2249648+JohnTortugo at openjdk.java.net (John Tortugo) Date: Thu, 17 Sep 2020 22:55:40 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 22:36:22 GMT, Mikael Vidstedt wrote: >> Sorry, please review yet another trivial fix for build breakage. > > Marked as reviewed by mikael (Reviewer). @iklam - I had tested the PR that introduced this build break only using a release config (and more recently on a fastdebug). What other config did I miss beside `fastdebug`? Sorry for all this trouble. 
------------- PR: https://git.openjdk.java.net/jdk/pull/231 From coleenp at openjdk.java.net Thu Sep 17 23:08:17 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 17 Sep 2020 23:08:17 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 22:53:02 GMT, John Tortugo wrote: >> Marked as reviewed by mikael (Reviewer). > > @iklam - I had tested the PR that introduced this build break only using a release config (and more recently on a > fastdebug). What other config did I miss beside `fastdebug`? > Sorry for all this trouble. Sorry I feel guilty because I suggested removing unneeded header files. Does the external submit still work? ------------- PR: https://git.openjdk.java.net/jdk/pull/231 From iklam at openjdk.java.net Thu Sep 17 23:08:17 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 23:08:17 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: <8kfBRwGNGh8Q_FfVCM-HlmhihBSGh_07qTXgqqfNGiM=.f27c6467-5a8a-4753-a458-980732f21b71@github.com> On Thu, 17 Sep 2020 22:53:02 GMT, John Tortugo wrote: > @iklam - I had tested the PR that introduced this build break only using a release config (and more recently on a > fastdebug). What other config did I miss beside `fastdebug`? > Sorry for all this trouble. Our continuous integration pipeline tests a variety of builds. Some are configured with --disable-precompiled-headers, which detected [JDK-8253313](https://bugs.openjdk.java.net/browse/JDK-8253313). We also have Windows builds, which enable PCH and require that each file has precompiled.hpp. It was my fault. As a sponsor, I should have tested the changes before integrating the code. 
------------- PR: https://git.openjdk.java.net/jdk/pull/231 From iklam at openjdk.java.net Thu Sep 17 23:11:40 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 17 Sep 2020 23:11:40 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 23:02:32 GMT, Coleen Phillimore wrote: > Sorry I feel guilty because I suggested removing unneeded header files. Does the external submit still work? The [pull request command](https://wiki.openjdk.java.net/display/SKARA/Pull+Request+Commands) `/test` should provide a way for external developers to do a basic build test, but I think it's not fully working yet. ------------- PR: https://git.openjdk.java.net/jdk/pull/231 From github.com+2249648+JohnTortugo at openjdk.java.net Thu Sep 17 23:11:40 2020 From: github.com+2249648+JohnTortugo at openjdk.java.net (John Tortugo) Date: Thu, 17 Sep 2020 23:11:40 GMT Subject: RFR: 8253314: precompiled.hpp missing from vmIntrinsics.cpp In-Reply-To: References: <9V2kGSwvW7edgPZfDzwwzXfC6DS1Bv0YgC_1DpF7pX8=.85bbadef-9e0f-4e7b-b142-6793a7959da8@github.com> Message-ID: On Thu, 17 Sep 2020 23:08:08 GMT, Ioi Lam wrote: >> Sorry I feel guilty because I suggested removing unneeded header files. Does the external submit still work? > >> Sorry I feel guilty because I suggested removing unneeded header files. Does the external submit still work? > > The [pull request command](https://wiki.openjdk.java.net/display/SKARA/Pull+Request+Commands) `/test` should provide a > way for external developers to do a basic build test, but I think it's not fully working yet. Thanks for clarifying. I should've tested better as well. I'll see with my peers here at MS to see if we can expand the range of configurations that we test. 
> The pull request command /test should provide a way for external developers to do a basic build test, but I think it's > not fully working yet. Thanks. I'll try that next time! ------------- PR: https://git.openjdk.java.net/jdk/pull/231 From andrew at openjdk.java.net Fri Sep 18 01:53:49 2020 From: andrew at openjdk.java.net (Andrew John Hughes) Date: Fri, 18 Sep 2020 01:53:49 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Thu, 17 Sep 2020 12:19:33 GMT, David Holmes wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copy-paste omission in bsd_zero > > Looks reasonable other than the ifdef ARM issue. Is the #if defined(ARM) referring to ARM in general here, or just 32-bit ARM? Because there appears to be no change to its definition, just rearrangement. The patch is quite hard to follow, and made even more difficult by multiple versions being attached to this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From shade at openjdk.java.net Fri Sep 18 05:03:55 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 05:03:55 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Fri, 18 Sep 2020 01:51:19 GMT, Andrew John Hughes wrote: > Is the #if defined(ARM) referring to ARM in general here, or just 32-bit ARM? Because there appears to be no change to > its definition, just rearrangement. No, there is `ARM` (32) and then there is `AARCH64` (64). This is why AArch64 fails: it effectively uses compiler barriers only through the confusing application of #else branches. 
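To illustrate the `#else` pitfall described above, here is a minimal sketch. It is not the Zero `orderAccess` source: the `DEMO_*` macros, the `Barrier` enum and both functions are hypothetical names invented for this example, and the build is assumed to define only `DEMO_AARCH64`.

```cpp
// Hypothetical illustration only; none of these names come from HotSpot.
enum class Barrier { CompilerOnly, HardwareFence };

// Assumption: we are building for 64-bit Arm, so only DEMO_AARCH64 is defined.
#define DEMO_AARCH64 1

// Before: the chain names 32-bit ARM but not AArch64, so an AArch64 build
// silently lands in the #else branch and gets a compiler-only barrier.
constexpr Barrier barrier_before_fix() {
#if defined(DEMO_ARM)
  return Barrier::HardwareFence;
#else
  return Barrier::CompilerOnly;
#endif
}

// After: the strong barrier is the default, and the light-weight case is
// called out explicitly for the one platform assumed to need nothing more.
constexpr Barrier barrier_after_fix() {
#if defined(DEMO_X86)
  return Barrier::CompilerOnly;
#else
  return Barrier::HardwareFence;
#endif
}

// Compile-time checks of which branch the assumed AArch64 build selects.
static_assert(barrier_before_fix() == Barrier::CompilerOnly,
              "AArch64 falls through to the weak default");
static_assert(barrier_after_fix() == Barrier::HardwareFence,
              "AArch64 now gets the strong default");
```

The design point, as far as this sketch goes, is that a platform nobody listed should inherit the safe behaviour, and only platforms known to tolerate the weaker barrier opt out by name.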
> The patch is quite hard to follow, and made even more difficult by multiple versions being attached to this PR. These are not multiple versions, those commits would be squashed into one on push -- that is how Skara workflow works. Look at the final version, which basically enables default (implicitly used by `ALPHA` and `AARCH64`) strong barriers -- that is the bug fix. It also keeps the light-weight barrier light for `X86` by calling it out explicitly. ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From jcm at openjdk.java.net Fri Sep 18 05:34:30 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Fri, 18 Sep 2020 05:34:30 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v4] In-Reply-To: References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: On Wed, 16 Sep 2020 18:17:41 GMT, Igor Veresov wrote: >> Jamsheed Mohammed C M has refreshed the contents of this pull request, and previous commits have been removed. The >> incremental views will show differences compared to the previous content of the PR. > > src/hotspot/share/prims/whitebox.cpp line 1056: > >> 1054: >> 1055: // Compile method and check result >> 1056: nmethod* nm = CompileBroker::compile_method(mh, bci, comp_level, mh, mh->invocation_count(), >> CompileTask::Reason_Whitebox, CHECK_false); > > Shouldn't this be CHECK_NULL ? The function returns a pointer. bool WhiteBox::compile_method returns a bool ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From iveresov at openjdk.java.net Fri Sep 18 05:34:29 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 18 Sep 2020 05:34:29 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. 
[v4] In-Reply-To: References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: On Wed, 16 Sep 2020 09:36:28 GMT, Jamsheed Mohammed C M wrote: >> Hi >> >> Moving the review that is based on mercurial repo to github. >> The history of conversation is >> [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) >> Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) >> >> @dholmes-ora could you please have a look. > > Jamsheed Mohammed C M has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From iveresov at openjdk.java.net Fri Sep 18 05:34:30 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 18 Sep 2020 05:34:30 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. [v4] In-Reply-To: References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: On Fri, 18 Sep 2020 05:29:36 GMT, Jamsheed Mohammed C M wrote: >> src/hotspot/share/prims/whitebox.cpp line 1056: >> >>> 1054: >>> 1055: // Compile method and check result >>> 1056: nmethod* nm = CompileBroker::compile_method(mh, bci, comp_level, mh, mh->invocation_count(), >>> CompileTask::Reason_Whitebox, CHECK_false); >> >> Shouldn't this be CHECK_NULL ? The function returns a pointer. > > bool WhiteBox::compile_method returns a bool Yes, you're right of course. Reviewed. 
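The point settled above, that the `CHECK_` suffix has to match the return type of the enclosing function rather than that of the callee, can be sketched as follows. This is a simplified stand-in, not HotSpot code: `DemoThread`, `DemoNMethod`, the `DEMO_CHECK_*` macros and the `demo_*` functions are hypothetical, and the macro only imitates the general "pass the thread, then return early if an exception is pending" shape.

```cpp
// Hypothetical stand-ins; all Demo*/demo_* names are invented for this sketch.
struct DemoThread { bool pending_exception = false; };
struct DemoNMethod {};

// Closes the call's argument list, then returns `result` from the *enclosing*
// function if the callee left an exception pending.
#define DEMO_CHECK_(result) thread); if (thread->pending_exception) return result; (void)(0
#define DEMO_CHECK_false DEMO_CHECK_(false)
#define DEMO_CHECK_NULL  DEMO_CHECK_(nullptr)

// Callee: returns a pointer and may leave an exception pending on the thread.
DemoNMethod* demo_compile(bool fail, DemoThread* thread) {
  static DemoNMethod nm;
  if (fail) { thread->pending_exception = true; return nullptr; }
  return &nm;
}

// This caller returns bool, so its early-return value must be a bool,
// even though the callee itself returns a pointer.
bool demo_compile_method(bool fail, DemoThread* thread) {
  DemoNMethod* nm = demo_compile(fail, DEMO_CHECK_false);
  return nm != nullptr;
}

// This caller returns a pointer, so the NULL-returning variant is the match.
DemoNMethod* demo_compile_or_null(bool fail, DemoThread* thread) {
  DemoNMethod* nm = demo_compile(fail, DEMO_CHECK_NULL);
  return nm;
}
```

Note that both macros expect a local variable named `thread` to be in scope at the call site, which is an assumption of this sketch.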
------------- PR: https://git.openjdk.java.net/jdk/pull/169 From jcm at openjdk.java.net Fri Sep 18 05:51:07 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Fri, 18 Sep 2020 05:51:07 GMT Subject: Integrated: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. In-Reply-To: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: <9JLPHYjgFC7rjN0KZsRYO-biZ8dAGpqvaSiPLUr8hTw=.e0c6159b-0c6f-4f0c-92b9-cdf3a97553b7@github.com> On Tue, 15 Sep 2020 08:35:01 GMT, Jamsheed Mohammed C M wrote: > Hi > > Moving the review that is based on mercurial repo to github. > The history of conversation is > [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) > Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) > > @dholmes-ora could you please have a look. This pull request has now been integrated. Changeset: 73c9088b Author: Jamsheed Mohammed C M URL: https://git.openjdk.java.net/jdk/commit/73c9088b Stats: 218 lines in 21 files changed: 39 ins; 104 del; 75 mod 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. Reviewed-by: dholmes, iveresov ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From aph at redhat.com Fri Sep 18 08:27:14 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 18 Sep 2020 09:27:14 +0100 Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: <55de54be-2fa7-330b-864f-be99de59264f@redhat.com> On 18/09/2020 06:03, Aleksey Shipilev wrote: > On Fri, 18 Sep 2020 01:51:19 GMT, Andrew John Hughes wrote: >> Is the #if defined(ARM) referring to ARM in general here, or just >> 32-bit ARM? 
Because there appears to be no change to its >> definition, just rearrangement. > > No, there is `ARM` (32) and then there is `AARCH64` (64). This is > why AArch64 fails: it effectively uses compiler barriers only > through the confusing application of #else branches. > >> The patch is quite hard to follow, and made even more difficult by >> multiple versions being attached to this PR. > > These are not multiple versions, those commits would be squashed > into one on push -- that is how Skara workflow works. Look at the > final version, which basically enables default (implicitly used by > `ALPHA` and `AARCH64`) strong barriers -- that is the bug fix. It > also keeps the light-weight barrier light for `X86` by calling it > out explicitly. It looks good enough to me. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at openjdk.java.net Fri Sep 18 10:24:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 10:24:00 GMT Subject: RFR: 8253344: Remove unimplemented Arguments::check_gc_consistency Message-ID: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> Looks like a leftover from JDK-8199925 that removed the definition, but left the declaration back. 
Testing: - [x] Linux x86_64 fastdebug build - [x] Text search for check_gc_consistency in `src/` ------------- Commit messages: - 8253344: Remove unimplemented Arguments::check_gc_consistency Changes: https://git.openjdk.java.net/jdk/pull/240/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=240&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253344 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/240.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/240/head:pull/240 PR: https://git.openjdk.java.net/jdk/pull/240 From shade at openjdk.java.net Fri Sep 18 10:30:42 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 10:30:42 GMT Subject: RFR: 8253345: Remove unimplemented Arguments::lookup_logging_aliases Message-ID: Renamed with JDK-8146800, but old declaration was left behind. Testing: - [x] Linux x86_64 fastdebug build - [x] Text search for lookup_logging_aliases in src/ ------------- Commit messages: - 8253345: Remove unimplemented Arguments::lookup_logging_aliases Changes: https://git.openjdk.java.net/jdk/pull/241/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=241&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253345 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/241.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/241/head:pull/241 PR: https://git.openjdk.java.net/jdk/pull/241 From tschatzl at openjdk.java.net Fri Sep 18 10:43:00 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Sep 2020 10:43:00 GMT Subject: RFR: 8253344: Remove unimplemented Arguments::check_gc_consistency In-Reply-To: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> References: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> Message-ID: On Fri, 18 Sep 2020 10:17:23 GMT, Aleksey Shipilev 
wrote: > Looks like a leftover from JDK-8199925 that removed the definition, but left the declaration back. > > Testing: > - [x] Linux x86_64 fastdebug build > - [x] Text search for check_gc_consistency in `src/` Looks good and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/240 From tschatzl at openjdk.java.net Fri Sep 18 10:44:03 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Sep 2020 10:44:03 GMT Subject: RFR: 8253345: Remove unimplemented Arguments::lookup_logging_aliases In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 10:23:50 GMT, Aleksey Shipilev wrote: > Renamed with JDK-8146800, but old declaration was left behind. > > Testing: > > - [x] Linux x86_64 fastdebug build > - [x] Text search for lookup_logging_aliases in src/ Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/241 From shade at openjdk.java.net Fri Sep 18 10:53:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 10:53:00 GMT Subject: RFR: 8253348: Remove unimplemented JNIHandles::initialize Message-ID: Seems to be a left-over from JDK-8227053 that removed the definition, but left the declaration behind. 
Testing: - [x] Linux x86_64 fastdebug build - [x] Text search for `JNIHandles::initialize` in `src/hotspot` ------------- Commit messages: - 8253348: Remove unimplemented JNIHandles::initialize Changes: https://git.openjdk.java.net/jdk/pull/243/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=243&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253348 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/243.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/243/head:pull/243 PR: https://git.openjdk.java.net/jdk/pull/243 From avoitylov at openjdk.java.net Fri Sep 18 10:59:21 2020 From: avoitylov at openjdk.java.net (Aleksei Voitylov) Date: Fri, 18 Sep 2020 10:59:21 GMT Subject: RFR: JDK-8247589: Implementation of Alpine Linux/x64 Port [v2] In-Reply-To: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> References: <6jqlCPXe69fPRvYFrytJsECkaa9tJ1hYWISNgyPP4Eg=.40944ef5-93b0-4db4-948b-80bb7898e9e8@github.com> Message-ID: On Mon, 14 Sep 2020 06:30:50 GMT, Aleksei Voitylov wrote: >> Marked as reviewed by dholmes (Reviewer). > > thank you Alan, Erik, and David! When the JEP becomes Targeted, I'll use this PR to integrate the changes. I added the contributors that could be found in the portola project commits. If anyone knows some other contributors I missed, I'll be happy to stand corrected. 
------------- PR: https://git.openjdk.java.net/jdk/pull/49 From coleenp at openjdk.java.net Fri Sep 18 12:10:32 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 18 Sep 2020 12:10:32 GMT Subject: RFR: 8253344: Remove unimplemented Arguments::check_gc_consistency In-Reply-To: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> References: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> Message-ID: On Fri, 18 Sep 2020 10:17:23 GMT, Aleksey Shipilev wrote: > Looks like a leftover from JDK-8199925 that removed the definition, but left the declaration back. > > Testing: > - [x] Linux x86_64 fastdebug build > - [x] Text search for check_gc_consistency in `src/` Looks trivial to me. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/240 From leo.korinth at oracle.com Fri Sep 18 12:14:44 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 18 Sep 2020 14:14:44 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> References: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> Message-ID: > > Finally, why was it chosen that each node could carry precisely two > chunk-roots? It seems somewhat easier to have a one-to-one relation > between chunk-root and VirtualSpaceNode (when we are already so close > to one). > Answering to myself: Because of compressed class space of course. Then we could no longer have one huge VirtualSpaceNode for compressed space, and VirtualSpaceList would need extra logic for allocating into the compressed class address space. Or we would need more levels for the buddy allocator in compressed class space so that it could still be one huge node (with a huge root-chunk). Each of which is a non-trivial change. Sorry for my confusion. 
Thanks, Leo From coleen.phillimore at oracle.com Fri Sep 18 12:17:31 2020 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 18 Sep 2020 08:17:31 -0400 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> Message-ID: <8a6c5f05-0f7f-81d4-ced3-f597531a1583@oracle.com> Using the entire MetaspaceArena (used to be SpaceManager) for the class metaspace was sort of the simplest thing to do because InstanceKlasses aren't fixed size. If they were, we could have picked something much simpler! Coleen On 9/18/20 8:14 AM, Leo Korinth wrote: > >> >> Finally, why was it chosen that each node could carry precisely two >> chunk-roots? It seems somewhat easier to have a one-to-one relation >> between chunk-root and VirtualSpaceNode (when we are already so close >> to one). >> > > Answering to myself: Because of compressed class space of course. Then we > could no longer have one huge VirtualSpaceNode for compressed space, and > VirtualSpaceList would need extra logic for allocating into the > compressed > class address space. Or we would need more levels for the buddy allocator > in compressed class space so that it could still be one huge node (with a > huge root-chunk). Each of which is a non-trivial change. > > Sorry for my confusion. > > Thanks, > Leo From zgu at openjdk.java.net Fri Sep 18 12:21:30 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 18 Sep 2020 12:21:30 GMT Subject: RFR: 8253348: Remove unimplemented JNIHandles::initialize In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 10:45:16 GMT, Aleksey Shipilev wrote: > Seems to be a left-over from JDK-8227053 that removed the definition, but left the declaration behind. > > Testing: > - [x] Linux x86_64 fastdebug build > - [x] Text search for `JNIHandles::initialize` in `src/hotspot` Looks good and trivial ------------- Marked as reviewed by zgu (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/243 From aph at openjdk.java.net Fri Sep 18 12:56:02 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 18 Sep 2020 12:56:02 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Thu, 17 Sep 2020 12:26:44 GMT, Aleksey Shipilev wrote: >> There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp` >> defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by >> default. The code also needs some rearrangement to make the mappings clear. >> >> jcstress seems to capture the bug, and seems to pass when the bug is fixed. Since this is `zero`, we want only the >> interpreter (compiler configs would be excessive). Release builds capture more samples. >> $ wget https://builds.shipilev.net/jcstress/jcstress-tests-all-20200917.jar >> $ build/linux-x86_64-server-release/images/jdk/bin/java -jar jcstress-tests-all-20200917.jar --jvmArgs "-Xint" >> >> Testing: >> - [x] x86_64 Linux zero release jcstress run >> - [ ] x86_64 MacOS zero release jcstress run (volunteers, please?) >> - [x] AArch64 Linux zero release jcstress run >> - [ ] PPC Linux zero release jcstress run (I can try later, or... ping @TheRealMDoerr ?) >> - [ ] ARM32 Linux zero release jcstress run (don't have fast ARM32 hosts, ping @bulasevich ?) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix copy-paste omission in bsd_zero Marked as reviewed by aph (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/224 From shade at openjdk.java.net Fri Sep 18 13:28:25 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 13:28:25 GMT Subject: Integrated: 8253344: Remove unimplemented Arguments::check_gc_consistency In-Reply-To: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> References: <4S47Rnn-YEhIDhqtjUFepcvV1fO05rdJpw-DyxU1LT4=.2c24e52d-1bdc-433f-ad46-593dea763d68@github.com> Message-ID: On Fri, 18 Sep 2020 10:17:23 GMT, Aleksey Shipilev wrote: > Looks like a leftover from JDK-8199925 that removed the definition, but left the declaration back. > > Testing: > - [x] Linux x86_64 fastdebug build > - [x] Text search for check_gc_consistency in `src/` This pull request has now been integrated. Changeset: 6e9efffc Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/6e9efffc Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8253344: Remove unimplemented Arguments::check_gc_consistency Reviewed-by: tschatzl, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/240 From thomas.stuefe at gmail.com Fri Sep 18 13:28:23 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 18 Sep 2020 15:28:23 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> Message-ID: Hi Leo, On Fri, Sep 18, 2020 at 2:14 PM Leo Korinth wrote: > > > > > Finally, why was it chosen that each node could carry precisely two > > chunk-roots? It seems somewhat easier to have a one-to-one relation > > between chunk-root and VirtualSpaceNode (when we are already so close > > to one). > > > > Answering to myself: Because of compressed class space of course. Then we > could no longer have one huge VirtualSpaceNode for compressed space, and > VirtualSpaceList would need extra logic for allocating into the compressed > class address space.
Or we would need more levels for the buddy allocator > in compressed class space so that it could still be one huge node (with a > huge root-chunk). Each of which is a non-trivial change. > > Some more details: About the mappings (nodes): as you said, we have the two cases, one where we reserve them ourselves, with a default size, one where we clamp a node atop a pre-existing mapping of arbitrary length. The latter is the compressed class space case. In the future we may expand or change the usage, so I think this kind of flexibility is good to have. Mapping (node) size influences virtual size waste and VMA fragmentation. Larger mappings waste more address space, which still matters on 32-bit. They also waste real memory when living in large paged memory (not a concern for now). Smaller mappings mean we have more of them, therefore the number of VMAs goes up. So the mapping size is a compromise between these two concerns. I chose 8MB as default size for a node for no other reason than it is close to what old Metaspace did. But not much thinking went into this number. I think we could easily make the nodes larger, e.g. 6-10 root chunks (24-40MB). -- Why not one root chunk per mapping? Many buddy allocators do that, make the whole mapping one giant root chunk. But implementation is easier if you require the starting address of a chunk to be aligned to its size. So, if we make the whole mapping one giant root chunk, a 4GB compressed class space would have to be aligned to a 4GB starting address. Also, each mapping would have to be power-of-2 sized, so we could only have a 2GB or 4GB compressed class space, but nothing in between. Other implementations are possible of course, but these restrictions make the implementation simple, and I liked that. So I did separate mapping (node) from root chunks, made them a 1:n relation.
That way the only restriction is that a node starting address and size have to be root chunk aligned, currently 4MB, which as a restriction is not too harsh. -- Finally, the root chunk size of 4MB was simply chosen because it is comfortably larger than anything we may want to allocate from Metaspace. I believe we could make it even smaller, maybe 1MB would be enough (largest Klass structure can be around 600K I believe). > Sorry for my confusion. > You don't seem confused :) The idea to equate mapping to root chunk 1:1 is valid and would make matters easier. If we did not have larger-than-default mappings it would make sense. > > Thanks, > Leo > Cheers, Thomas From shade at openjdk.java.net Fri Sep 18 13:29:44 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 18 Sep 2020 13:29:44 GMT Subject: Integrated: 8253345: Remove unimplemented Arguments::lookup_logging_aliases In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 10:23:50 GMT, Aleksey Shipilev wrote: > Renamed with JDK-8146800, but old declaration was left behind. > > Testing: > > - [x] Linux x86_64 fastdebug build > - [x] Text search for lookup_logging_aliases in src/ This pull request has now been integrated.
Changeset: 43019a0e Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/43019a0e Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8253345: Remove unimplemented Arguments::lookup_logging_aliases Reviewed-by: tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/241 From thomas.stuefe at gmail.com Fri Sep 18 13:43:45 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 18 Sep 2020 15:43:45 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: <8a6c5f05-0f7f-81d4-ced3-f597531a1583@oracle.com> References: <2ec9ad4b-a139-898a-28f6-256dc9b27862@oracle.com> <8a6c5f05-0f7f-81d4-ced3-f597531a1583@oracle.com> Message-ID: Hi Coleen, On Fri, Sep 18, 2020 at 2:17 PM Coleen Phillimore <coleen.phillimore at oracle.com> wrote: > > Using the entire MetaspaceArena (used to be SpaceManager) for the class > metaspace was sort of the simplest thing to do because InstanceKlasses > aren't fixed size. If they were, we could have picked something much > simpler! > Coleen > > I actually thought about this - what could change if Klass would be uniform-sized - and I think it would not even simplify that much. You would have an array of uniform Klass structures living in a pre-reserved mapping. You don't want to pay upfront so you'd still want to somehow commit on demand. If only by pushing up a commit boundary like old metaspace did. You also still need a freelist for unused Klass structures. I think you would end up with something like the ChunkHeaderPool with added manual committing, and you would lose the uncommit-on-demand feature the new metaspace has. Re-using Metaspace instead makes sense. It could even make sense were the Klass structures uniform in size, complexity-wise. Cheers, Thomas > On 9/18/20 8:14 AM, Leo Korinth wrote: > > > >> > >> Finally, why was it chosen that each node could carry precisely two > >> chunk-roots?
It seems somewhat easier to have a one-to-one relation > >> between chunk-root and VirtualSpaceNode (when we are already so close > >> to one). > >> > > > > Answering to myself: Because of compressed class space of course. Then we > > could no longer have one huge VirtualSpaceNode for compressed space, and > > VirtualSpaceList would need extra logic for allocating into the > > compressed > > class address space. Or we would need more levels for the buddy allocator > > in compressed class space so that it could still be one huge node (with a > > huge root-chunk). Each of which is a non-trivial change. > > > > Sorry for my confusion. > > > > Thanks, > > Leo > > From andrew at openjdk.java.net Fri Sep 18 15:12:17 2020 From: andrew at openjdk.java.net (Andrew John Hughes) Date: Fri, 18 Sep 2020 15:12:17 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: <6-iMFMRC_RMWAqQh-jm0tyhILQ20HQWd1rIYGHRSe9Y=.d1bd9582-419c-4268-83ee-239f3df3c32e@github.com> On Fri, 18 Sep 2020 12:52:57 GMT, Andrew Haley wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copy-paste omission in bsd_zero > > Marked as reviewed by aph (Reviewer). > > Is the #if defined(ARM) referring to ARM in general here, or just 32-bit ARM? Because there appears to be no change to > > its definition, just rearrangement. > > No, there is `ARM` (32) and then there is `AARCH64` (64). This is why AArch64 fails: it effectively uses compiler > barriers only through the confusing application of #else branches. Ok, patch looks ok then. That seems to differ from the other architectures, where 'PPC' and 'X86' are referring to both 32- and 64-bit variants. > > > The patch is quite hard to follow, and made even more difficult by multiple versions being attached to this PR. 
> > These are not multiple versions, those commits would be squashed into one on push -- that is how Skara workflow works. > Look at the final version, which basically enables default (implicitly used by `ALPHA` and `AARCH64`) strong > barriers -- that is the bug fix. It also keeps the light-weight barrier light for `X86` by calling it out explicitly. If there are not multiple versions, how is there a 'final version'? :-) I eventually found the webrev with the full and final patch. One would normally rebase the patch with amendments, not pile commit on commit and then leave it to a bot to rebase. I don't like the idea that the final commit will look different to what is being reviewed. ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From pchilanomate at openjdk.java.net Fri Sep 18 15:28:32 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Fri, 18 Sep 2020 15:28:32 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <-wc1ckdaxM1kpG5HN4pNbGrvUvFby33RuTFtJgZIiJ0=.6c930a8f-2fa6-4e0d-9e6e-4aa99cad66ce@github.com> On Thu, 17 Sep 2020 18:57:33 GMT, Patricio Chilano Mateo wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > Changes look good, thanks for fixing! I added some comments on the changes. Update looks good, thanks Robbin! 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From andrew at openjdk.java.net Fri Sep 18 18:31:19 2020 From: andrew at openjdk.java.net (Andrew John Hughes) Date: Fri, 18 Sep 2020 18:31:19 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Thu, 17 Sep 2020 12:26:44 GMT, Aleksey Shipilev wrote: >> There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp` >> defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by >> default. The code also needs some rearrangement to make the mappings clear. >> >> jcstress seems to capture the bug, and seems to pass when the bug is fixed. Since this is `zero`, we want only the >> interpreter (compiler configs would be excessive). Release builds capture more samples. >> $ wget https://builds.shipilev.net/jcstress/jcstress-tests-all-20200917.jar >> $ build/linux-x86_64-server-release/images/jdk/bin/java -jar jcstress-tests-all-20200917.jar --jvmArgs "-Xint" >> >> Testing: >> - [x] x86_64 Linux zero release jcstress run >> - [ ] x86_64 MacOS zero release jcstress run (volunteers, please?) >> - [x] AArch64 Linux zero release jcstress run >> - [ ] PPC Linux zero release jcstress run (I can try later, or... ping @TheRealMDoerr ?) >> - [ ] ARM32 Linux zero release jcstress run (don't have fast ARM32 hosts, ping @bulasevich ?) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix copy-paste omission in bsd_zero Marked as reviewed by andrew (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/224 From andrew at openjdk.java.net Fri Sep 18 18:36:40 2020 From: andrew at openjdk.java.net (Andrew John Hughes) Date: Fri, 18 Sep 2020 18:36:40 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: <-W4jAzgVEmTR_6igxG8JADftd6h6cjLGTheGxchXACw=.0ad3c613-3306-482f-b36f-b888a10436f3@github.com> On Fri, 18 Sep 2020 18:28:32 GMT, Andrew John Hughes wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copy-paste omission in bsd_zero > > Marked as reviewed by andrew (Reviewer). Why is there this /reviewer command that doesn't work? ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From dcubed at openjdk.java.net Fri Sep 18 19:48:05 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 18 Sep 2020 19:48:05 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 19:51:24 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes >> are executed only by the target, which means they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshakes pending. (should be very rare) To be able to handle an >> arbitrary number of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS.
The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operations to be done out of order. Since the filter determines who executes the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all JavaThreads do and Handshake >> all threads. With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much >> worse than this. >> >> - I added an is_handshake_safe_for for checking if the current thread is operating on itself or is the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. >> >> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment.
To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviewing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Removed double check, fix comment, removed not needed function, updated logs src/hotspot/share/prims/jvmtiEnvBase.cpp line 653: > 651: JvmtiEnvBase::get_current_contended_monitor(JavaThread *calling_thread, JavaThread *java_thread, jobject *monitor_ptr) { > 652: Thread *current_thread = Thread::current(); > 653: assert(java_thread->is_handshake_safe_for(current_thread), I like how this assert reads now! ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dcubed at openjdk.java.net Fri Sep 18 21:01:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 18 Sep 2020 21:01:00 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <-G5jT6J3J3UQwMRtEB9-PCvPtyZKNLI6wwtCFg6pUDI=.c669352c-ac48-4eac-a1e2-6c86ec32b894@github.com> On Thu, 17 Sep 2020 12:07:15 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes >> are executed only by the target, which means they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshakes pending. (should be very rare) To be able to handle an >> arbitrary number of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed.
Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operations to be done out of order. Since the filter determines who executes the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all JavaThreads do and Handshake >> all threads. With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much >> worse than this. >> >> - I added an is_handshake_safe_for for checking if the current thread is operating on itself or is the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe.
>> >> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviewing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Fixed double checks > Added NSV > ProcessResult to enum > Fixed logging > Moved _active_handshaker to private src/hotspot/share/prims/jvmtiEventController.cpp line 340: > 338: } else { > 339: Handshake::execute(&hs, target); > 340: } This guarantee() that the handshake has executed doesn't have an equivalent in the rewritten code. Should there be some way of verifying this condition from this location? ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dcubed at openjdk.java.net Fri Sep 18 21:00:59 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 18 Sep 2020 21:00:59 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 19:51:24 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes >> are executed only by the target, which means they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshakes pending. (should be very rare) To be able to handle an >> arbitrary number of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed.
Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. 
>> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Removed double check, fix comment, removed not needed function, updated logs Thumbs up. I don't think I have anything that is in the must fix category. src/hotspot/share/prims/jvmtiEnvBase.cpp line 691: > 689: // Note: > 690: // calling_thread is the thread that requested the list of monitors for java_thread. > 691: // java_thread is thread owning the monitors. s/is thread/is the thread/ src/hotspot/share/prims/jvmtiEnvBase.cpp line 692: > 690: // calling_thread is the thread that requested the list of monitors for java_thread. > 691: // java_thread is thread owning the monitors. > 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). typo - s/executint/executing/ grammar - s/e.g./e.g.,/ src/hotspot/share/prims/jvmtiEnvBase.cpp line 693: > 691: // java_thread is thread owning the monitors. > 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). > 693: // And they all maybe different threads. typo - (in this context) - s/maybe/may be/ src/hotspot/share/prims/jvmtiEnvBase.hpp line 341: > 339: class JvmtiHandshakeClosure : public HandshakeClosure { > 340: protected: > 341: jvmtiError _result; Thanks for pushing the jvmtiError into common code for JVM/TI handshakes. src/hotspot/share/runtime/handshake.hpp line 108: > 106: _processed, > 107: _succeed, > 108: _number_states Why are these indented by 4 spaces instead of 2 spaces? 
src/hotspot/share/runtime/handshake.cpp line 70: > 68: : HandshakeOperation(cl, target), _start_time_ns(start_ns) {} > 69: virtual ~AsyncHandshakeOperation() { delete _handshake_cl; }; > 70: jlong start_time() { return _start_time_ns; } Should this be 'const'? Ignore it if it would fan out too much. src/hotspot/share/runtime/handshake.cpp line 349: > 347: target->handshake_state()->add_operation(op); > 348: } else { > 349: log_handshake_info(start_time_ns, op->name(), 0, 0, "(thread dead)"); It might be useful to also log the 'target' thread value here so: .... (thread= is dead)" Might be something like this: log_handshake_info(start_time_ns, op->name(), 0, 0, "(thread=" INTPTR_FORMAT " is dead)", p2i(target)); Although you'd probably have to form the string in a buffer and then pass it to the log_handshake_info() call... sigh... src/hotspot/share/runtime/handshake.cpp line 450: > 448: return false; > 449: } > 450: // Operations are added without lock and then the poll is armed. s/without lock/lock free/ src/hotspot/share/runtime/handshake.cpp line 479: > 477: } > 478: > 479: // If we own the mutex at this point and while owning the mutex grammar - s/owning the mutex/owning the mutex we/ src/hotspot/share/runtime/interfaceSupport.inline.hpp line 157: > 155: > 156: // Threads shouldn't block if they are in the middle of printing, but... > 157: ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); Can you explain why you had to add this? Did something show up in testing? src/hotspot/share/runtime/thread.cpp line 487: > 485: assert(!thread->is_Java_thread() || > 486: ((JavaThread *) thread)->is_handshake_safe_for(Thread::current()) || > 487: !((JavaThread *) thread)->on_thread_list() || Should use "thread->as_Java_thread()" instead of the cast here (2 places). 
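For readers following the cast comment above, here is a minimal sketch of why a checked helper is preferred over a raw C-style cast. The `Thread` and `JavaThread` classes below are simplified stand-ins written for illustration, not the HotSpot sources, and `is_unlisted_java_thread()` is a hypothetical helper that only mirrors the shape of the reviewed assert.

```cpp
#include <cassert>

class JavaThread;

// Simplified stand-in for HotSpot's Thread (assumption: not the real class).
class Thread {
 public:
  virtual ~Thread() {}
  virtual bool is_Java_thread() const { return false; }
  // Checked downcast: verifies the dynamic type before converting.
  JavaThread* as_Java_thread();
};

// Simplified stand-in for HotSpot's JavaThread.
class JavaThread : public Thread {
 public:
  bool is_Java_thread() const override { return true; }
  bool on_thread_list() const { return _on_thread_list; }
  void set_on_thread_list(bool v) { _on_thread_list = v; }
 private:
  bool _on_thread_list = false;
};

inline JavaThread* Thread::as_Java_thread() {
  // A raw (JavaThread*) cast would silently accept any Thread;
  // the helper fails loudly in debug builds instead.
  assert(is_Java_thread() && "incorrect cast to JavaThread");
  return static_cast<JavaThread*>(this);
}

// Hypothetical helper: true only for a JavaThread that is not on the list.
inline bool is_unlisted_java_thread(Thread* thread) {
  return thread->is_Java_thread() &&
         !thread->as_Java_thread()->on_thread_list();
}
```

The call site reads the same as the cast version, but a mistaken call on a non-JavaThread trips the assert rather than producing a bogus pointer.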
src/hotspot/share/runtime/thread.hpp line 1360: > 1358: bool is_handshake_safe_for(Thread* th) const { > 1359: return _handshake.active_handshaker() == th || > 1360: this == th; I _think_ L1359-60 will fit on one line... src/hotspot/share/utilities/filterQueue.inline.hpp line 35: > 33: FilterQueueNode* head; > 34: FilterQueueNode* insnode = new FilterQueueNode(data); > 35: SpinYield yield(SpinYield::default_spin_limit * 10); // Very unlikely with mutiple failed CAS. Typo - s/mutiple/multiple/ src/hotspot/share/utilities/filterQueue.inline.hpp line 76: > 74: return (E)NULL; > 75: } > 76: SpinYield yield(SpinYield::default_spin_limit * 10); // Very unlikely with mutiple failed CAS. typo - s/mutiple/multiple/ src/hotspot/share/prims/whitebox.cpp line 2032: > 2030: void do_thread(Thread* th) { > 2031: assert(th->is_Java_thread(), "sanity"); > 2032: JavaThread* jt = (JavaThread*)th; Can whitebox.cpp code use the new as_Java_thread() call? src/hotspot/share/prims/whitebox.cpp line 2033: > 2031: assert(th->is_Java_thread(), "sanity"); > 2032: JavaThread* jt = (JavaThread*)th; > 2033: ResourceMark rm; It also might be interesting to print the "current thread" info here so that someone looking at the test output knows which thread handled the handshake (the target or a surrogate). src/hotspot/share/runtime/handshake.hpp line 45: > 43: // a single target/direct handshake or not, by the JavaThread that requested the > 44: // handshake or the VMThread respectively. > 45: class HandshakeClosure : public ThreadClosure, public CHeapObj { Just to be clear. You haven't added support for a handshake that must only be executed by the target thread yet, right? That's future work, if I remember correctly... ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/151 From dcubed at openjdk.java.net Fri Sep 18 21:52:28 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 18 Sep 2020 21:52:28 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Thu, 17 Sep 2020 13:57:04 GMT, Richard Reingruber wrote: > After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a > thread is safepoint/handshake safe. This change updates the comment on > JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs > lingo) for a clearer diff. > Also I could inline the (*) footnote. The updated wording in the comment looks good to me. I still get a headache when thinking about the associated suspend/resume races, but that's my problem and not yours. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/225 From dcubed at openjdk.java.net Fri Sep 18 21:55:21 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 18 Sep 2020 21:55:21 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Fri, 18 Sep 2020 21:49:46 GMT, Daniel D. Daugherty wrote: >> After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a >> thread is safepoint/handshake safe. This change updates the comment on >> JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs >> lingo) for a clearer diff. 
>> Also I could inline the (*) footnote. > > The updated wording in the comment looks good to me. > > I still get a headache when thinking about the associated > suspend/resume races, but that's my problem and not yours. Please wait for @dholmes-ora to review this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From daniel.daugherty at oracle.com Fri Sep 18 21:56:39 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 18 Sep 2020 17:56:39 -0400 Subject: Question on JavaThread::is_ext_suspend_completed() In-Reply-To: References: Message-ID: <7d5ccc22-b252-29a4-dcac-e41c61cf90a1@oracle.com> Since T is in _thread_in_native, it can't execute Java bytecode without transitioning back to _thread_in_Java, and it can't execute Java bytecode equivalents (raw monitor enter) without transitioning to _thread_in_vm. In the case of either transition, the external suspend request should be observed and T should self-suspend at that point. Dan On 9/16/20 6:00 PM, Reingruber, Richard wrote: > I think I can answer this myself now :) > > C sets _external_suspend in T's _suspend_flags, then it reads T's thread state > (with a store-load barrier between write and read). If C sees _thread_in_native, > then this read happened before the thread state change. This means T will see > _external_suspend when checking _suspend_flags after changing the thread state > to _thread_in_native_trans (again with a store-load barrier). > > So if C sees _thread_in_native it can be sure that T is effectively suspended. > > Thanks, Richard. > > -----Original Message----- > From: Reingruber, Richard > Sent: Wednesday, 16 September 2020 23:06 > To: Hotspot dev runtime > Subject: Question on JavaThread::is_ext_suspend_completed() > > Hi, > > I've got a question on JavaThread::is_ext_suspend_completed(): > > Current thread C loads the thread state of target thread T [1]. Assume it > observes _thread_in_native and also that T has a walkable stack.
In this case > is_ext_suspend_completed() returns true [2]. If called by JavaThread::java_suspend(), then this > method will also return. > > I don't see the synchronization that shields against C seeing stale values of T's > thread state and frame anchor. To me it looks as if T could be executing java > bytecodes while C observes a stale state making the wrong conclusion that T > is effectively suspended. > > What am I missing? > > I'd think that a sleep just before returning true could trigger the issue, can't it? [4] > > Thanks, Richard. > > [1] Loading thread state > https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L671 > > [2] JavaThread::is_ext_suspend_completed() Returns true if save_state == _thread_in_native && frame_anchor()->walkable() > https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L686 > > [3] JavaThread::java_suspend() returns if JavaThread::is_ext_suspend_completed() returns true. > https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L2512 > > [4] Can a sleep before returning (see link below) trigger the issue? 
> https://github.com/openjdk/jdk/blob/1c84cfa2364fa18fc028df89bdc4de207365784f/src/hotspot/share/runtime/thread.cpp#L691 From shade at openjdk.java.net Sat Sep 19 14:09:26 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Sat, 19 Sep 2020 14:09:26 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: <-W4jAzgVEmTR_6igxG8JADftd6h6cjLGTheGxchXACw=.0ad3c613-3306-482f-b36f-b888a10436f3@github.com> References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> <-W4jAzgVEmTR_6igxG8JADftd6h6cjLGTheGxchXACw=.0ad3c613-3306-482f-b36f-b888a10436f3@github.com> Message-ID: On Fri, 18 Sep 2020 18:34:10 GMT, Andrew John Hughes wrote: > Why is there this /reviewer command that doesn't work? The /reviewer command tells the bot who the off-GH reviewers are. In retrospect, I could have just mentioned you myself here... ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From rrich at openjdk.java.net Sat Sep 19 21:21:48 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 19 Sep 2020 21:21:48 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Fri, 18 Sep 2020 21:49:46 GMT, Daniel D. Daugherty wrote: > The updated wording in the comment looks good to me. Thanks for looking over it. > I still get a headache when thinking about the associated > suspend/resume races, but that's my problem and not yours. I'd think the suspend/resume implementation could be simplified if the suspender always used a direct handshake with the suspendee. JavaThread::is_ext_suspend_completed() could then be removed, and likewise the suspend-equivalent optimization. I would not expect this to have an impact on performance.
------------- PR: https://git.openjdk.java.net/jdk/pull/225 From rrich at openjdk.java.net Sat Sep 19 21:21:48 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Sat, 19 Sep 2020 21:21:48 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Fri, 18 Sep 2020 21:52:31 GMT, Daniel D. Daugherty wrote: > Please wait for @dholmes-ora to review this change. Sure. ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From iklam at openjdk.java.net Sun Sep 20 05:51:53 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 20 Sep 2020 05:51:53 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding Message-ID: In product builds (-O3 without -g), when initializing a `PackageEntry`, gcc for x64 stores garbage into the structure padding slot behind `BasicHashtableEntry::_hash` in `BasicHashtable::new_entry()`. The garbage value turns out to be the upper bits of a `Symbol*`, which are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot produce a deterministic result. The fix is to avoid copying the contents of the structure padding slots when copying `PackageEntry` and `ModuleEntry` objects into the CDS archive.
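To illustrate the padding problem described above, here is a minimal sketch of a field-by-field copy that keeps padding bytes out of the output. The `Entry` layout and the `archive_entry()` helper are hypothetical names made up for this example; they are not the HotSpot types or the code in the webrev, and pointer relocation in the archive is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entry layout: a 32-bit hash followed by a pointer. On typical
// 64-bit ABIs the compiler leaves padding between the two fields.
struct Entry {
  uint32_t hash;
  const void* literal;
};

// Copy an entry into an output buffer field by field. The buffer is zeroed
// first, so the bytes that correspond to padding in Entry are always zero in
// the output, whatever the in-memory padding happened to contain.
inline void archive_entry(const Entry& e, unsigned char (&out)[sizeof(Entry)]) {
  std::memset(out, 0, sizeof(Entry));
  std::memcpy(out + offsetof(Entry, hash), &e.hash, sizeof(e.hash));
  std::memcpy(out + offsetof(Entry, literal), &e.literal, sizeof(e.literal));
}
```

Two entries with equal fields then serialize to identical bytes even if their padding differs in memory, which is the property a deterministic dump needs. A plain `memcpy` of the whole struct would carry the padding along.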
------------- Commit messages: - 8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding Changes: https://git.openjdk.java.net/jdk/pull/267/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=267&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253079 Stats: 53 lines in 5 files changed: 51 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/267/head:pull/267 PR: https://git.openjdk.java.net/jdk/pull/267 From dholmes at openjdk.java.net Sun Sep 20 13:17:23 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 20 Sep 2020 13:17:23 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> <-W4jAzgVEmTR_6igxG8JADftd6h6cjLGTheGxchXACw=.0ad3c613-3306-482f-b36f-b888a10436f3@github.com> Message-ID: <2TPZlh0RarZ-kgwpkx01fkFMMfxwXu07OG0OgZSYEUM=.e5052c74-3c4e-435b-b710-e78312af5517@github.com> On Sat, 19 Sep 2020 14:06:52 GMT, Aleksey Shipilev wrote: >> Why is there this /reviewer command that doesn't work? > >> Why is there this /reviewer command that doesn't work? > > Reviewer command is to tell the bot who are the off-GH reviewers are. In retrospect, I could have just mentioned you > myself here... The final version of this is the "sum" of the applied commits. When viewing the "changed files" you can select which commit(s) to look at. There is no need to rebase (nor is it even desirable) as you will lose the context of review comments. Final update looks fine to me. 
------------- PR: https://git.openjdk.java.net/jdk/pull/224 From dholmes at openjdk.java.net Sun Sep 20 13:17:23 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 20 Sep 2020 13:17:23 GMT Subject: RFR: 8253284: Zero OrderAccess barrier mappings are incorrect [v2] In-Reply-To: References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Thu, 17 Sep 2020 12:26:44 GMT, Aleksey Shipilev wrote: >> There are some jcstress failures with AArch64 Zero. This seems to happen because `orderAccess_linux_zero.hpp` >> defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by >> default. The code also needs some rearrangement to make the mappings clear. >> >> jcstress seems to capture the bug, and seems to pass when the bug is fixed. Since this is `zero`, we want only the >> interpreter (compiler configs would be excessive). Release builds capture more samples. >> $ wget https://builds.shipilev.net/jcstress/jcstress-tests-all-20200917.jar >> $ build/linux-x86_64-server-release/images/jdk/bin/java -jar jcstress-tests-all-20200917.jar --jvmArgs "-Xint" >> >> Testing: >> - [x] x86_64 Linux zero release jcstress run >> - [ ] x86_64 MacOS zero release jcstress run (volunteers, please?) >> - [x] AArch64 Linux zero release jcstress run >> - [x] PPC Linux zero release jcstress run >> - [ ] ARM32 Linux zero release jcstress run (don't have fast ARM32 hosts, ping @bulasevich ? It would seem ARM32 is >> broken already) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix copy-paste omission in bsd_zero Marked as reviewed by dholmes (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/224 From minqi at openjdk.java.net Sun Sep 20 22:46:12 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Sun, 20 Sep 2020 22:46:12 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding In-Reply-To: References: Message-ID: <2eT-WwuHrvPdtFuobl63gewSxKWSnvEVuOMTHaXxYrM=.abf2e8a7-184e-4509-93db-cd41ec3f301f@github.com> On Sun, 20 Sep 2020 05:37:33 GMT, Ioi Lam wrote: > In product builds (-O3 without -g), when initializing a `PackageEntry`, gcc for x64 stores garbage into the structure > padding slot behind `BasicHashtableEntry::_hash` in `BasicHashtable::new_entry()`. The garbage value turns out to be > the upper bits of a `Symbol*`, which are different on every run of `java -Xshare:dump`. As a result, > `java -Xshare:dump` cannot reproduce deterministic result. The fix is to avoid copying contents of the structure > padding slots when copying `PackageEntry` and `ModuleEntry` objects into the CDS archive. The implementation is quite complex: every class derived from BasicHashtableEntry needs a copy_from function to avoid this problem. Maybe a brute-force way to avoid the padding issue in all cases is to fill the object allocated in AllocateHeap with \0? ------------- PR: https://git.openjdk.java.net/jdk/pull/267 From zgu at openjdk.java.net Sun Sep 20 23:09:20 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sun, 20 Sep 2020 23:09:20 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g.
ConcurrentGCThread, thread may exit while its "Thread" object continues > alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > post_run() method. Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Fix metaspace remapping on Windows ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/185/files - new: https://git.openjdk.java.net/jdk/pull/185/files/c49380ef..6b927b54 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/185.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185 PR: https://git.openjdk.java.net/jdk/pull/185 From dholmes at openjdk.java.net Mon Sep 21 03:44:59 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 03:44:59 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Sun, 20 Sep 2020 23:09:20 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >> post_run() method. 
> > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Fix metaspace remapping on Windows Changes requested by dholmes (Reviewer). src/hotspot/share/runtime/thread.cpp line 1341: > 1339: unregister_thread_stack_with_NMT(); > 1340: set_stack_base(NULL); > 1341: set_stack_size(0); Sorry but I just don't see any need for clearing stack_base or stack_size, and then you don't need to mess with the assertion in stack_base(). ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From dholmes at openjdk.java.net Mon Sep 21 04:17:29 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 04:17:29 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Thu, 17 Sep 2020 13:57:04 GMT, Richard Reingruber wrote: > After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a > thread is safepoint/handshake safe. This change updates the comment on > JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs > lingo) for a clearer diff. > Also I could inline the (*) footnote. src/hotspot/share/runtime/thread.cpp line 2599: > 2597: // safepoint/handshake code will count it as safepoint/handshake safe. Also it allows > 2598: // another thread to continue if it is waiting in is_ext_suspend_completed() for this > 2599: // thread to change state from _thread_in_native_trans to the target state. The revised wording doesn't really convey the situation to me. 
We _have_ to set the thread-state to _thread_blocked so that a thread waiting in _is_ext_suspend_completed can proceed (there is no generic "target" state - it must be _thread_blocked). I would simplify and rephrase as follows: "We have to set the thread state directly to _thread_blocked so that it will be seen to be safepoint/handshake safe whilst suspended. This is also necessary to allow a thread in is_ext_suspend_completed, that observed the _thread_in_native_trans state, to proceed." ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From ccheung at openjdk.java.net Mon Sep 21 04:39:37 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 21 Sep 2020 04:39:37 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Sun, 20 Sep 2020 23:09:20 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >> post_run() method. > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Fix metaspace remapping on Windows I've tried this patch. 
It passed tier1 tests and 50 runs of the dynamicArchive/methodHandles/MethodHandlesAsCollectorTest.java test on Windows with vm options: -XX:MaxRAMPercentage=3 -XX:NativeMemoryTracking=detail ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From shade at openjdk.java.net Mon Sep 21 05:03:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 21 Sep 2020 05:03:00 GMT Subject: Integrated: 8253348: Remove unimplemented JNIHandles::initialize In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 10:45:16 GMT, Aleksey Shipilev wrote: > Seems to be a left-over from JDK-8227053 that removed the definition, but left the declaration behind. > > Testing: > - [x] Linux x86_64 fastdebug build > - [x] Text search for `JNIHandles::initialize` in `src/hotspot` This pull request has now been integrated. Changeset: 388c8f25 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/388c8f25 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8253348: Remove unimplemented JNIHandles::initialize Reviewed-by: zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/243 From iklam at openjdk.java.net Mon Sep 21 05:23:57 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 05:23:57 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v2] In-Reply-To: References: Message-ID: > In product builds (-O3 without -g), when initializing a `PackageEntry`, gcc for x64 stores garbage into the structure > padding slot behind `BasicHashtableEntry::_hash` in `BasicHashtable::new_entry()`. The garbage value turns out to be > the upper bits of a `Symbol*`, which are different on every run of `java -Xshare:dump`. As a result, > `java -Xshare:dump` cannot reproduce deterministic result. The fix is to avoid copying contents of the structure > padding slots when copying `PackageEntry` and `ModuleEntry` objects into the CDS archive. 
Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Zero out memory for newly allocated HashtableEntry - Merge branch 'master' of https://github.com/iklam/jdk into 8253079-DeterministicDump-test-fails-product-build - Revert "8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding" This reverts commit d380e0213283f5f28889d867bb2505a91781df6e. - 8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/267/files - new: https://git.openjdk.java.net/jdk/pull/267/files/d380e021..225c9272 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=267&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=267&range=00-01 Stats: 195 lines in 14 files changed: 58 ins; 103 del; 34 mod Patch: https://git.openjdk.java.net/jdk/pull/267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/267/head:pull/267 PR: https://git.openjdk.java.net/jdk/pull/267 From iklam at openjdk.java.net Mon Sep 21 05:25:54 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 05:25:54 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding In-Reply-To: <2eT-WwuHrvPdtFuobl63gewSxKWSnvEVuOMTHaXxYrM=.abf2e8a7-184e-4509-93db-cd41ec3f301f@github.com> References: <2eT-WwuHrvPdtFuobl63gewSxKWSnvEVuOMTHaXxYrM=.abf2e8a7-184e-4509-93db-cd41ec3f301f@github.com> Message-ID: On Sun, 20 Sep 2020 22:43:43 GMT, Yumin Qi wrote: > The implementation is quite complex, every derivatives from BasicHashTableEntry need to have a copy_from function to > avoid such problem. Maybe a brutal one to avoid such padding issue for all cases is fill the object allocated in > AllocateHeap with \0? You're right. 
My original analysis was wrong: set_hash() didn't write garbage into the padding. Instead, the garbage was there because AllocateHeap didn't initialize the new buffer in product builds. I reverted the original fix and added code to call memset() when allocating a new hashtable entry (but only when DumpSharedSpaces is true). ------------- PR: https://git.openjdk.java.net/jdk/pull/267 From redestad at openjdk.java.net Mon Sep 21 05:30:50 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Mon, 21 Sep 2020 05:30:50 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted Message-ID: - Sort LogTag type enum alphabetically - Assert that the tags are sorted instead of sorting ------------- Commit messages: - Ensure LogTag types are sorted Changes: https://git.openjdk.java.net/jdk/pull/274/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=274&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253397 Stats: 58 lines in 2 files changed: 24 ins; 22 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/274.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/274/head:pull/274 PR: https://git.openjdk.java.net/jdk/pull/274 From iklam at openjdk.java.net Mon Sep 21 05:37:38 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 05:37:38 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Sun, 20 Sep 2020 23:09:20 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective.
Once thread exits, the virtual memory for the
>> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to
>> post_run() method.
>
> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix metaspace remapping on Windows

Changes requested by iklam (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/185

From iklam at openjdk.java.net Mon Sep 21 05:37:39 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 21 Sep 2020 05:37:39 GMT
Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5]
In-Reply-To:
References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com>
Message-ID:

On Mon, 21 Sep 2020 03:42:25 GMT, David Holmes wrote:

>> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix metaspace remapping on Windows
>
> src/hotspot/share/runtime/thread.cpp line 1341:
>
>> 1339: unregister_thread_stack_with_NMT();
>> 1340: set_stack_base(NULL);
>> 1341: set_stack_size(0);
>
> Sorry but I just don't see any need for clearing stack_base or stack_size, and then you don't need to mess with the
> assertion in stack_base().

I agree with David that the calls to `set_stack_base(NULL); set_stack_size(0);` do not seem to be related to this bug.

Also, previously only `set_stack_base(NULL);` was called -- on all Threads. Now:

- NonJavaThread: `set_stack_base(NULL); set_stack_size(0);`
- JavaThread: neither operation is done

I am wondering:

- why the behavior is changed
- is zeroing the stack_base/size necessary for NMT code only, or is it necessary for non-NMT operations as well?

Will any test cases fail if you remove line 1340/1341?
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From ccheung at openjdk.java.net Mon Sep 21 05:44:54 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 21 Sep 2020 05:44:54 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Mon, 21 Sep 2020 05:34:52 GMT, Ioi Lam wrote: >> src/hotspot/share/runtime/thread.cpp line 1341: >> >>> 1339: unregister_thread_stack_with_NMT(); >>> 1340: set_stack_base(NULL); >>> 1341: set_stack_size(0); >> >> Sorry but I just don't see any need for clearing stack_base or stack_size, and then you don't need to mess with the >> assertion in stack_base(). > > I agree with David that the calls to `set_stack_base(NULL); set_stack_size(0);` do not seem to be related to this bug. > Also, previously only `set_stack_base(NULL);` was called -- on all Threads. Now: > - NonJavaThread: `set_stack_base(NULL); set_stack_size(0);` > - JavaThread: neither operation is done > > I am wondering: > > - why the behavior is changed > - is zeroing the stack_base/size necessary for NMT code only, or is it necessary for non-NMT operations as well? > > Will any test cases fail if you remove line 1340/1341? I've tried without clearing stack_base and stack_size and also reverted the change in stack_base() in thread.hpp. It passed tier1 testing. I'll run more test. ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From shade at openjdk.java.net Mon Sep 21 05:51:13 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 21 Sep 2020 05:51:13 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v2] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:23:57 GMT, Ioi Lam wrote: >> (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. 
As a >> result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values >> that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic >> result. The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == >> true`. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes > the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last > revision: > - Zero out memory for newly allocated HashtableEntry > - Merge branch 'master' of https://github.com/iklam/jdk into 8253079-DeterministicDump-test-fails-product-build > - Revert "8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding" > > This reverts commit d380e0213283f5f28889d867bb2505a91781df6e. > - 8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding src/hotspot/share/classfile/moduleEntry.cpp line 457: > 455: _location = ArchiveBuilder::get_relocated_symbol(_location); > 456: } > 457: JFR_ONLY(memset(trace_id_addr(), 0, sizeof(traceid))); `memset` looks dodgy here. Maybe `JFR_ONLY(set_trace_id(0))`? src/hotspot/share/classfile/packageEntry.cpp line 241: > 239: _qualified_exports = (GrowableArray*)archived_qualified_exports; > 240: _defined_by_cds_in_class_path = 0; > 241: JFR_ONLY(memset(trace_id_addr(), 0, sizeof(traceid))); Ditto, `JFR_ONLY(set_trace_id(0))`? 
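
For readers following the padding problem summarized in the PR description quoted above, here is a minimal sketch of the idea, not the HotSpot code. `DemoEntry`, `new_demo_entry` and `same_bytes` are hypothetical names introduced only for illustration, and the `zero_fill` flag merely plays the role that `DumpSharedSpaces` plays in the description. The sketch assumes the usual compiler behaviour that assigning a member leaves the padding bytes alone.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a hashtable entry (not the HotSpot class).
// On common 64-bit ABIs there are 4 bytes of padding after `hash`,
// because `next` must be 8-byte aligned.
struct DemoEntry {
  uint32_t   hash;
  DemoEntry* next;
};

// Allocates one entry. With zero_fill == true the whole block, padding
// included, is cleared before the fields are written, so two entries with
// equal fields are also byte-for-byte equal.
inline DemoEntry* new_demo_entry(uint32_t hash, bool zero_fill) {
  void* raw = std::malloc(sizeof(DemoEntry));
  if (raw == nullptr) {
    return nullptr;
  }
  if (zero_fill) {
    std::memset(raw, 0, sizeof(DemoEntry));
  }
  DemoEntry* e = static_cast<DemoEntry*>(raw);
  e->hash = hash;
  e->next = nullptr;
  return e;
}

// Byte-wise comparison, which is what matters when the raw bytes of an
// object are written out to a file.
inline bool same_bytes(const DemoEntry* a, const DemoEntry* b) {
  return std::memcmp(a, b, sizeof(DemoEntry)) == 0;
}
```

Without the `memset`, the padding bytes keep whatever the allocator left behind, so `same_bytes` may report a difference even though every named field matches.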
------------- PR: https://git.openjdk.java.net/jdk/pull/267 From dholmes at openjdk.java.net Mon Sep 21 06:16:31 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 06:16:31 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 19:51:24 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. 
I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Removed double check, fix comment, removed not needed function, updated logs Hi Robbin, There is still a lack of motivation for this feature in the JBS issue. What kind of handshakes need to be asynchronous? Any async operation implies that the requester doesn't care about when or even if, the operation gets executed - they are by definition fire-and-forget actions. So what are the usecases being envisaged here? 
Many of the changes included here seem unrelated to, and not reliant on, async handshakes, and could be factored out to simplify the review and allow focus on the actual async handshake part e.g. the JVM TI cleanups seem they could be mostly standalone. Specific comments below. A general concern I have is where the current thread is no longer guaranteed to be a JavaThread (which is a step in the wrong direction in relation to some of the cleanups I have planned!) and I can't see why this would be changing. Thanks. src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 320: > 318: op.do_thread(_thread); > 319: } else { > 320: Handshake::execute(&op, _thread); We should still have the guarantee check that the target was alive. src/hotspot/share/prims/jvmtiEventController.cpp line 339: > 337: hs.do_thread(target); > 338: } else { > 339: Handshake::execute(&hs, target); We should still have the guarantee check that the target is still alive. src/hotspot/share/prims/jvmtiEnvBase.cpp line 692: > 690: // calling_thread is the thread that requested the list of monitors for java_thread. > 691: // java_thread is thread owning the monitors. > 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). Where is the use of the VMThread introduced as I cannot see it? The code was changed to use direct handshakes so that we know that either the caller or the target (both JavaThreads) must be executing it. I don't see any VMOperation added to call this code. src/hotspot/share/runtime/handshake.cpp line 230: > 228: log_trace(handshake)("Threads signaled, begin processing blocked threads by VMThread"); > 229: HandshakeSpinYield hsy(start_time_ns); > 230: int executed_by_driver = 0; driver?? Isn't this still the VMThread? src/hotspot/share/runtime/handshake.cpp line 313: > 311: } > 312: > 313: int executed_by_driver = 0; Again why driver?? Isn't it either the current thread or the target that will execute the op? 
src/hotspot/share/runtime/handshake.cpp line 453: > 451: // If all handshake operations for the handshakee are finished and someone > 452: // just adds an operation we may see it here. But if the handshakee is not > 453: // armed yet it is not safe to procced. s/procced/proceed/ src/hotspot/share/runtime/handshake.hpp line 44: > 42: // by the target JavaThread itself or, depending on whether the operation is > 43: // a single target/direct handshake or not, by the JavaThread that requested the > 44: // handshake or the VMThread respectively. This comment now indicates that all single target handshakes are executed as direct-handshakes and never by the VMThread - is that correct? src/hotspot/share/runtime/handshake.hpp line 107: > 105: _claim_failed, > 106: _processed, > 107: _succeed, grammatically should be _succeeded src/hotspot/share/runtime/thread.cpp line 481: > 479: #ifdef ASSERT > 480: // A JavaThread is considered "dangling" if it is not the current > 481: // thread, his not handshaking with current thread, as been added the Threads s/his/is/ ? s/as been added the/has been added to the/ But rather than describe all the conditions, few of which are actually visible in the assertion below, why not just rephrase in terms of the conditions checked i.e. // A JavaThread is considered dangling if it not handshake-safe with respect to the current thread, or // it is not on a ThreadsList. The is_handshake_safe_for method should describe all the conditions that make a target handshake safe. src/hotspot/share/runtime/thread.cpp line 851: > 849: bool > 850: JavaThread::is_thread_fully_suspended(bool wait_for_suspend, uint32_t *bits) { > 851: if (this != Thread::current()) { Why/how is a non-JavaThread calling this? src/hotspot/share/runtime/thread.cpp line 2441: > 2439: assert(Thread::current()->is_VM_thread() || > 2440: is_handshake_safe_for(Thread::current()), > 2441: "should be in the vm thread, self or handshakee"); This seems too general. 
This should either be a VMoperation or a direct handshake, but not both. src/hotspot/share/utilities/filterQueue.hpp line 57: > 55: > 56: // MT-safe > 57: // Since pops and adds are allowed while we add, we do not know if _first is same even if it's the same address. This comment seems out of context on the declaration of add, as it is describing a detail of the implementation - to what are we comparing _first to be the same ?? If you want to just document the MT properties abstractedly then describing this a lock-free would suffice. Though if pop requires locking then it's implementation is also constrained to work with the lock-free version of add. Overall it is unclear how documenting "external serialization needed" actually helps the user of this API. ?? src/hotspot/share/utilities/filterQueue.inline.hpp line 42: > 40: break; > 41: } > 42: yield.wait(); Was the need for spinwaits identified through benchmarking? Do you really expect this to be hot? src/hotspot/share/utilities/filterQueue.inline.hpp line 63: > 61: } > 62: > 63: // MT-Unsafe, external serialization needed. So IIUC this queue supports multiple concurrent add()s, but has to be restricted to a single pop() at a time (although that is allowed to execute concurrently with the add()s) - correct? test/hotspot/jtreg/runtime/handshake/AsyncHandshakeWalkStackTest.java line 30: > 28: * @build AsyncHandshakeWalkStackTest > 29: * @run driver ClassFileInstaller sun.hotspot.WhiteBox > 30: * sun.hotspot.WhiteBox$WhiteBoxPermission This is not needed as ClassFileInstaller already handles WhiteBox's nested classes. test/hotspot/jtreg/runtime/handshake/MixedHandshakeWalkStackTest.java line 30: > 28: * @build MixedHandshakeWalkStackTest > 29: * @run driver ClassFileInstaller sun.hotspot.WhiteBox > 30: * sun.hotspot.WhiteBox$WhiteBoxPermission Not needed - see earlier comment. 
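
The queue contract discussed in the filterQueue comments above (concurrent lock-free add(), externally serialized pop() with a filter) can be sketched roughly as follows. This is an illustration under assumptions, not the code under review: `DemoFilterQueue` and its members are hypothetical names, it uses `std::atomic` rather than HotSpot's Atomic class, and it leaves out the spin/yield back-off.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch, not the HotSpot FilterQueue.
template <typename E>
class DemoFilterQueue {
  struct Node {
    E     data;
    Node* next;
  };
  std::atomic<Node*> first_{nullptr};

 public:
  DemoFilterQueue() = default;
  DemoFilterQueue(const DemoFilterQueue&) = delete;
  DemoFilterQueue& operator=(const DemoFilterQueue&) = delete;

  // Must not run concurrently with add() or pop().
  ~DemoFilterQueue() {
    Node* cur = first_.load();
    while (cur != nullptr) {
      Node* next = cur->next;
      delete cur;
      cur = next;
    }
  }

  // MT-safe and lock-free: publish the new node with a CAS on first_.
  void add(const E& data) {
    Node* node = new Node{data, first_.load()};
    // On failure compare_exchange_weak stores the current head into
    // node->next, so the loop simply retries with the fresh value.
    while (!first_.compare_exchange_weak(node->next, node)) {
    }
  }

  // Not safe against concurrent pop() calls; the caller must hold a lock.
  // Concurrent add() calls are allowed. Removes the oldest element for
  // which match(element) is true and stores it in *out.
  template <typename Match>
  bool pop(Match match, E* out) {
    for (;;) {
      Node* found = nullptr;
      Node* found_prev = nullptr;
      Node* prev = nullptr;
      for (Node* cur = first_.load(); cur != nullptr; prev = cur, cur = cur->next) {
        if (match(cur->data)) {
          found = cur;        // keep going: the oldest match is furthest from first_
          found_prev = prev;
        }
      }
      if (found == nullptr) {
        return false;
      }
      if (found_prev == nullptr) {
        // found was the first node when we scanned; an add() may have pushed
        // a newer node since, in which case the CAS fails and we rescan.
        Node* expected = found;
        if (!first_.compare_exchange_strong(expected, found->next)) {
          continue;
        }
      } else {
        // Interior node: only the (serialized) popper rewrites these links.
        found_prev->next = found->next;
      }
      *out = found->data;
      delete found;
      return true;
    }
  }
};
```

In this sketch only unlinking the first node needs a CAS, because that is the only place where a concurrent add() can interfere. Filtering is also what lets elements leave the queue out of insertion order.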
test/hotspot/jtreg/runtime/handshake/MixedHandshakeWalkStackTest.java line 38: > 36: > 37: public class MixedHandshakeWalkStackTest { > 38: public static Thread tthreads[]; why tthreads? It looks like a typo :) ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dholmes at openjdk.java.net Mon Sep 21 06:16:33 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 06:16:33 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 20:51:17 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/prims/whitebox.cpp line 2032: > >> 2030: void do_thread(Thread* th) { >> 2031: assert(th->is_Java_thread(), "sanity"); >> 2032: JavaThread* jt = (JavaThread*)th; > > Can whitebox.cpp code use the new as_Java_thread() call? Yes it can. :) > src/hotspot/share/runtime/handshake.hpp line 45: > >> 43: // a single target/direct handshake or not, by the JavaThread that requested the >> 44: // handshake or the VMThread respectively. >> 45: class HandshakeClosure : public ThreadClosure, public CHeapObj { > > Just to be clear. You haven't added support for a handshake that > must only be executed by the target thread yet, right? That's > future work, if I remember correctly... AsyncHandshakeClosures have operations that must, by definition, be executed by the target thread (if they are executed at all). > src/hotspot/share/runtime/interfaceSupport.inline.hpp line 157: > >> 155: >> 156: // Threads shouldn't block if they are in the middle of printing, but... >> 157: ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); > > Can you explain why you had to add this? > Did something show up in testing? Yes please explain as this looks really bad. 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dholmes at openjdk.java.net Mon Sep 21 06:16:34 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 06:16:34 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 12:07:15 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. 
I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Fixed double checks > Added NSV > ProcessResult to enum > Fixed logging > Moved _active_handshaker to private src/hotspot/share/runtime/handshake.cpp line 336: > 334: // and thus prevents reading stale data modified in the handshake closure > 335: // by the Handshakee. > 336: OrderAccess::acquire(); How/why is this deleted? Surely there are still single-thread VMops that use a handshake?? 
src/hotspot/share/runtime/interfaceSupport.inline.hpp line 136: > 134: assert(_thread->thread_state() == _thread_in_vm, "should only call when leaving VM after handshake"); > 135: > 136: _thread->set_thread_state(_original_state); Can you clarify why this is no longer needed? What states can we be returning to? ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dholmes at openjdk.java.net Mon Sep 21 06:24:03 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 21 Sep 2020 06:24:03 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:11:32 GMT, Claes Redestad wrote: > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting Seems fine. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/274 From david.holmes at oracle.com Mon Sep 21 06:33:39 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 21 Sep 2020 16:33:39 +1000 Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <9cb8feec-e50e-504e-4775-8dba8ed82f17@oracle.com> Correction ... On 21/09/2020 4:16 pm, David Holmes wrote: > On Thu, 17 Sep 2020 12:07:15 GMT, Robbin Ehn wrote: > >>> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >>> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >>> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >>> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >>> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >>> pointer, no lock needed. 
Pops are done while holding the per handshake state lock, and when working on the first >>> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >>> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >>> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >>> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >>> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >>> is now only one method to call when handshaking one thread. Some comments about the changes: >>> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >>> all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much >>> worse then this. >>> >>> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >>> thread. >>> >>> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >>> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >>> >>> - Added WB testing method. >>> >>> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >>> >>> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >>> >>> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >>> safepoint and half of them after, in many handshake all operations. >>> >>> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. 
>>> >>> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >>> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >>> >>> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >>> >>> - Added filtered queue and gtest for it. >>> >>> Passes multiple t1-8 runs. >>> Been through some pre-reviwing. >> >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > src/hotspot/share/runtime/handshake.cpp line 336: > >> 334: // and thus prevents reading stale data modified in the handshake closure >> 335: // by the Handshakee. >> 336: OrderAccess::acquire(); > > How/why is this deleted? Surely there are still single-thread VMops that use a handshake?? That comment was placed against the old line 336 which was the deletion of this method: bool Handshake::execute(HandshakeClosure* thread_cl, JavaThread* target) { (I'll file a skara/git bug). David ----- > src/hotspot/share/runtime/interfaceSupport.inline.hpp line 136: > >> 134: assert(_thread->thread_state() == _thread_in_vm, "should only call when leaving VM after handshake"); >> 135: >> 136: _thread->set_thread_state(_original_state); > > Can you clarify why this is no longer needed? What states can we be returning to? 
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/151
>

From jiefu at openjdk.java.net Mon Sep 21 07:13:09 2020
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 21 Sep 2020 07:13:09 GMT
Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding
In-Reply-To:
References: <2eT-WwuHrvPdtFuobl63gewSxKWSnvEVuOMTHaXxYrM=.abf2e8a7-184e-4509-93db-cd41ec3f301f@github.com>
Message-ID:

On Mon, 21 Sep 2020 05:23:27 GMT, Ioi Lam wrote:

>> The implementation is quite complex, every derivatives from BasicHashTableEntry need to have a copy_from function to
>> avoid such problem. Maybe a brutal one to avoid such padding issue for all cases is fill the object allocated in
>> AllocateHeap with \0?
>
>> The implementation is quite complex, every derivatives from BasicHashTableEntry need to have a copy_from function to
>> avoid such problem. Maybe a brutal one to avoid such padding issue for all cases is fill the object allocated in
>> AllocateHeap with \0?
>
> You're right. My original analysis was wrong: set_hash() didn't write garbage into the padding. Instead, the garbage
> was there because AllocateHeap didn't initialize the new buffer in product builds.
> I reverted the original fix. Instead, I added code to call memset() when allocating a new hashtable entry (but only
> when DumpSharedSpaces is true).

It passed with this fix.
Looks good to me.
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/267

From rehn at openjdk.java.net Mon Sep 21 08:26:17 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Mon, 21 Sep 2020 08:26:17 GMT
Subject: RFR: 8238761: Asynchronous handshakes [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 18 Sep 2020 19:46:34 GMT, Daniel D.
Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 691: > >> 689: // Note: >> 690: // calling_thread is the thread that requested the list of monitors for java_thread. >> 691: // java_thread is thread owning the monitors. > > s/is thread/is the thread/ Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 692: > >> 690: // calling_thread is the thread that requested the list of monitors for java_thread. >> 691: // java_thread is thread owning the monitors. >> 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). > > typo - s/executint/executing/ > grammar - s/e.g./e.g.,/ Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 693: > >> 691: // java_thread is thread owning the monitors. >> 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). >> 693: // And they all maybe different threads. > > typo - (in this context) - s/maybe/may be/ Fixed > src/hotspot/share/prims/jvmtiEnvBase.hpp line 341: > >> 339: class JvmtiHandshakeClosure : public HandshakeClosure { >> 340: protected: >> 341: jvmtiError _result; > > Thanks for pushing the jvmtiError into common code for JVM/TI handshakes. Thanks > src/hotspot/share/prims/jvmtiEnvBase.cpp line 653: > >> 651: JvmtiEnvBase::get_current_contended_monitor(JavaThread *calling_thread, JavaThread *java_thread, >> jobject *monitor_ptr) { 652: Thread *current_thread = Thread::current(); >> 653: assert(java_thread->is_handshake_safe_for(current_thread), > > I like how this assert reads now! Great! 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 08:41:56 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 08:41:56 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: <-G5jT6J3J3UQwMRtEB9-PCvPtyZKNLI6wwtCFg6pUDI=.c669352c-ac48-4eac-a1e2-6c86ec32b894@github.com> References: <-G5jT6J3J3UQwMRtEB9-PCvPtyZKNLI6wwtCFg6pUDI=.c669352c-ac48-4eac-a1e2-6c86ec32b894@github.com> Message-ID: On Fri, 18 Sep 2020 20:01:02 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > src/hotspot/share/prims/jvmtiEventController.cpp line 340: > >> 338: } else { >> 339: Handshake::execute(&hs, target); >> 340: } > > This guarantee() that the handshake has executed doesn't have an > equivalent in the rewritten code. Should there be some way of verifying > this condition from this location? Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 08:41:57 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 08:41:57 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 20:40:42 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/utilities/filterQueue.inline.hpp line 35: > >> 33: FilterQueueNode* head; >> 34: FilterQueueNode* insnode = new FilterQueueNode(data); >> 35: SpinYield yield(SpinYield::default_spin_limit * 10); // Very unlikely with mutiple failed CAS. 
> > Typo - s/mutiple/multiple/ Fixed > src/hotspot/share/utilities/filterQueue.inline.hpp line 76: > >> 74: return (E)NULL; >> 75: } >> 76: SpinYield yield(SpinYield::default_spin_limit * 10); // Very unlikely with mutiple failed CAS. > > typo - s/mutiple/multiple/ Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 08:51:26 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 08:51:26 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: <95uLEj8gYajeYzYsdHsJ7H93GMBEStaDSDSwhSDCWY4=.58779d68-f017-4def-8a8e-6ad0273800e5@github.com> On Fri, 18 Sep 2020 20:07:47 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/handshake.hpp line 108: > >> 106: _processed, >> 107: _succeed, >> 108: _number_states > > Why are these indented by 4 spaces instead of 2 spaces? Fixed > src/hotspot/share/runtime/handshake.cpp line 70: > >> 68: : HandshakeOperation(cl, target), _start_time_ns(start_ns) {} >> 69: virtual ~AsyncHandshakeOperation() { delete _handshake_cl; }; >> 70: jlong start_time() { return _start_time_ns; } > > Should this be 'const'? Ignore it if it would fan out too much. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 09:54:19 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 09:54:19 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:32:54 GMT, David Holmes wrote: >> src/hotspot/share/runtime/interfaceSupport.inline.hpp line 157: >> >>> 155: >>> 156: // Threads shouldn't block if they are in the middle of printing, but... 
>>> 157: ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); >> >> Can you explain why you had to add this? >> Did something show up in testing? > > Yes please explain as this looks really bad. The tty lock allows the VM thread to block on it, and it is a no-safepoint-check lock, meaning that you are not allowed to poll for a safepoint/handshake while holding it. Also, if someone is holding it, you can't print on the tty inside a safepoint/handshake. The tty lock also has a lower rank than Threads_lock (we used to block on that in safepoints) and a lower rank than the handshake state lock. The compiler has code which takes the ttyLocker and prints on the tty. Some of the information that it prints passes through code which can safepoint/handshake. The way safepoints have gotten around this is by unlocking the tty lock before a safepoint. Now that handshakes use a mutex instead of a semaphore, we get an error here as well. (Even if you don't get an error now, we could potentially deadlock, or at least we are in very deep water.) Until the compiler is converted to UL, we need this unlocking of the tty lock. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 09:54:18 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 09:54:18 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 20:21:09 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/handshake.cpp line 349: > >> 347: target->handshake_state()->add_operation(op); >> 348: } else { >> 349: log_handshake_info(start_time_ns, op->name(), 0, 0, "(thread dead)"); > > It might be useful to also log the 'target' thread value here so: > > ....
(thread= is dead)" > > Might be something like this: > > log_handshake_info(start_time_ns, op->name(), 0, 0, "(thread=" INTPTR_FORMAT " is dead)", p2i(target)); > > Although you'd probably have to form the string in a buffer and then pass it > to the log_handshake_info() call... sigh... Fixed (via buffert) > src/hotspot/share/runtime/handshake.cpp line 450: > >> 448: return false; >> 449: } >> 450: // Operations are added without lock and then the poll is armed. > > s/without lock/lock free/ Fixed > src/hotspot/share/runtime/handshake.cpp line 479: > >> 477: } >> 478: >> 479: // If we own the mutex at this point and while owning the mutex > > grammar - s/owning the mutex/owning the mutex we/ Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rrich at openjdk.java.net Mon Sep 21 09:55:19 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 21 Sep 2020 09:55:19 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Mon, 21 Sep 2020 04:12:39 GMT, David Holmes wrote: >> After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a >> thread is safepoint/handshake safe. This change updates the comment on >> JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs >> lingo) for a clearer diff. >> Also I could inline the (*) footnote. > > src/hotspot/share/runtime/thread.cpp line 2599: > >> 2597: // safepoint/handshake code will count it as safepoint/handshake safe. Also it allows >> 2598: // another thread to continue if it is waiting in is_ext_suspend_completed() for this >> 2599: // thread to change state from _thread_in_native_trans to the target state. > > The revised wording doesn't really convey the situation to me. 
We _have_ to set the thread-state to _thread_blocked so > that a thread waiting in _is_ext_suspend_completed can proceed (there is no generic "target" state - it must be > _thread_blocked). I would simplify and rephrase as follows: "We have to set the thread state directly to > _thread_blocked so that it will be seen to be safepoint/handshake safe whilst suspended. This is also necessary to > allow a thread in is_ext_suspend_completed, that observed the _thread_in_native_trans state, to proceed." Ok, I will use your version. ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From rrich at openjdk.java.net Mon Sep 21 10:01:42 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 21 Sep 2020 10:01:42 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() [v2] In-Reply-To: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: > After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a > thread is safepoint/handshake safe. This change updates the comment on > JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs > lingo) for a clearer diff. > Also I could inline the (*) footnote. 
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Apply version proposed by dholmes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/225/files - new: https://git.openjdk.java.net/jdk/pull/225/files/177221cf..e1ac0b9c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=225&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=225&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/225.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/225/head:pull/225 PR: https://git.openjdk.java.net/jdk/pull/225 From rehn at openjdk.java.net Mon Sep 21 10:05:33 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:05:33 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:04:45 GMT, David Holmes wrote: >> src/hotspot/share/prims/whitebox.cpp line 2032: >> >>> 2030: void do_thread(Thread* th) { >>> 2031: assert(th->is_Java_thread(), "sanity"); >>> 2032: JavaThread* jt = (JavaThread*)th; >> >> Can whitebox.cpp code use the new as_Java_thread() call? > > Yes it can. :) Fixed :) ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 10:05:32 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:05:32 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 20:37:10 GMT, Daniel D. 
Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/thread.cpp line 487: > >> 485: assert(!thread->is_Java_thread() || >> 486: ((JavaThread *) thread)->is_handshake_safe_for(Thread::current()) || >> 487: !((JavaThread *) thread)->on_thread_list() || > > Should use "thread->as_Java_thread()" instead of the cast here (2 places). Fixed > src/hotspot/share/runtime/thread.hpp line 1360: > >> 1358: bool is_handshake_safe_for(Thread* th) const { >> 1359: return _handshake.active_handshaker() == th || >> 1360: this == th; > > I _think_ L1359-60 will fit on one line... Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 10:10:44 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:10:44 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:27:00 GMT, David Holmes wrote: >> src/hotspot/share/runtime/handshake.hpp line 45: >> >>> 43: // a single target/direct handshake or not, by the JavaThread that requested the >>> 44: // handshake or the VMThread respectively. >>> 45: class HandshakeClosure : public ThreadClosure, public CHeapObj { >> >> Just to be clear. You haven't added support for a handshake that >> must only be executed by the target thread yet, right? That's >> future work, if I remember correctly... > > AsyncHandshakeClosures have operations that must, by definition, be executed by the target thread (if they are executed > at all). Yes as David says. The code that make sure this is case is: static bool processor_filter(HandshakeOperation* op) { return !op->is_asynch(); } Which is used e.g: return _queue.contains(processor_filter); While process_self_inner() uses no filter. 
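For readers following along, the filtering described above can be sketched as follows. This is a minimal, self-contained illustration and not the FilterQueue code from the patch: `Op`, `SketchQueue`, and `no_filter` are hypothetical names, and only `processor_filter` and `is_asynch()` mirror the snippet quoted above. The sketch assumes that pushes may race with each other, while `contains()` and `pop()` are serialized externally (for example by a per-thread lock).

```cpp
#include <atomic>

// Hypothetical stand-in for HandshakeOperation; only the async flag matters here.
struct Op {
  int  id;
  bool asynch;
  bool is_asynch() const { return asynch; }
};

// Filter used by any thread other than the target: skip asynchronous operations.
static bool processor_filter(Op* op) { return !op->is_asynch(); }
// Hypothetical filter for the target thread itself: accept everything.
static bool no_filter(Op*) { return true; }

// Illustrative singly linked queue: lock-free push at the head via CAS,
// filtered pop of the oldest matching element.
// Assumption: contains() and pop() are serialized by the caller.
class SketchQueue {
  struct Node {
    Op*   op;
    Node* next;
  };
  std::atomic<Node*> _first{nullptr};

 public:
  ~SketchQueue() {
    Node* cur = _first.load();
    while (cur != nullptr) {
      Node* next = cur->next;
      delete cur;
      cur = next;
    }
  }

  // May be called concurrently by several threads; no lock needed.
  void push(Op* op) {
    Node* n = new Node{op, _first.load(std::memory_order_relaxed)};
    // On failure, n->next is refreshed with the current head and we retry.
    while (!_first.compare_exchange_weak(n->next, n,
                                         std::memory_order_release,
                                         std::memory_order_relaxed)) {
    }
  }

  template <typename Filter>
  bool contains(Filter match) const {
    for (Node* c = _first.load(std::memory_order_acquire); c != nullptr; c = c->next) {
      if (match(c->op)) return true;
    }
    return false;
  }

  // Removes and returns the oldest operation accepted by the filter, or nullptr.
  template <typename Filter>
  Op* pop(Filter match) {
    Node* found = nullptr;
    Node* found_prev = nullptr;
    Node* prev = nullptr;
    // The head is the newest element, so the last match in the walk is the oldest.
    for (Node* c = _first.load(std::memory_order_acquire); c != nullptr; prev = c, c = c->next) {
      if (match(c->op)) {
        found = c;
        found_prev = prev;
      }
    }
    if (found == nullptr) return nullptr;
    if (found_prev != nullptr) {
      found_prev->next = found->next;  // interior node: pushers never touch it
    } else {
      Node* expected = found;
      if (!_first.compare_exchange_strong(expected, found->next)) {
        // A concurrent push changed the head; unlink via the new predecessor.
        Node* p = expected;
        while (p->next != found) p = p->next;
        p->next = found->next;
      }
    }
    Op* op = found->op;
    delete found;
    return op;
  }
};
```

With this shape, a thread acting on behalf of the target would call `pop(processor_filter)` and never see asynchronous operations, while the target itself would call `pop(no_filter)`. Note that filtering means operations can be taken out of insertion order.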
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 10:10:45 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:10:45 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Fri, 18 Sep 2020 20:52:49 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/prims/whitebox.cpp line 2033: > >> 2031: assert(th->is_Java_thread(), "sanity"); >> 2032: JavaThread* jt = (JavaThread*)th; >> 2033: ResourceMark rm; > > It also might be interesting to print the "current thread" info here so > that someone looking at the test output knows which thread handled > the handshake (the target or a surrogate). It's always the thread it self, added asserts to make this clear. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From kim.barrett at oracle.com Mon Sep 21 10:35:58 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 21 Sep 2020 06:35:58 -0400 Subject: RFR: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: > On Sep 21, 2020, at 1:30 AM, Claes Redestad wrote: > > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting > > ------------- > > Commit messages: > - Ensure LogTag types are sorted > > Changes: https://git.openjdk.java.net/jdk/pull/274/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=274&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253397 > Stats: 58 lines in 2 files changed: 24 ins; 22 del; 12 mod > Patch: https://git.openjdk.java.net/jdk/pull/274.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/274/head:pull/274 > > PR: https://git.openjdk.java.net/jdk/pull/274 I think having the LOG_TAG_LIST in sorted order is good. 
But I think the checking can be improved. I think it can be done at compile-time using constexpr, completely eliminating the runtime execution and data. I'm currently prototyping; I'll let you know what I come up with. Make LogTag::_name constexpr (it should at least be const, and it's a bug that it isn't). This will require moving the definition into the header, but that's okay, it's small. (I think the sorting could be done at compile-time too, unless that requires constexpr features from C++17/20; I don't recall off-hand. But having the list start sorted removes the need for that complexity, and may be better for usability anyway. Grouping tags by usage doesn't entirely work, since tags may be used in multiple ways.) From rehn at openjdk.java.net Mon Sep 21 10:42:32 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:42:32 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 06:13:55 GMT, David Holmes wrote: > There is still a lack of motivation for this feature in the JBS issue. What kind of handshakes need to be asynchronous? > Any async operation implies that the requester doesn't care about when or even if, the operation gets executed - they > are by definition fire-and-forget actions. So what are the usecases being envisaged here? Added info in JBS issue: https://bugs.openjdk.java.net/browse/JDK-8238761 All uses-cases of _suspend_flag as initial targets. But some of them require more bits. > Many of the changes included here seem unrelated to, and not reliant on, async handshakes, and could be factored out to > simplify the review and allow focus on the actual async handshake part e.g. the JVM TI cleanups seem they could be > mostly standalone. Since I kept rebasing this doing somethings I did somethings to simplify the rebasing. I guess you are talking about the JvmtiHandshakeClosure? > Specific comments below. 
A general concern I have is where the current thread is no longer guaranteed to be a > JavaThread (which is a step in the wrong direction in relation to some of the cleanups I have planned!) and I can't see > why this would be changing. If the VM thread emits a "handshake all", it will continuously loop over the JavaThreads until the op is completed. We do not keep track of which JavaThread has done which handshake. The VM thread will execute all handshakes it finds on every JavaThread's queue. If someone else adds a handshake to one of the JavaThreads, the VM thread might execute it. I did not see any issues with this while looking at the code or in testing. Many of the handshakes used to be safepoints, always executed by the VM thread. We changed that to the VM thread or the JavaThread itself, and we now extend that to any JavaThread. Some of the JVM TI handshakes are a bit different, but since they must properly allocate resources in the target JavaThread and not in the current JavaThread, there is no issue executing the code with a non-JavaThread. At the moment, none of the handshakes depend on the 'driver' being a JavaThread. We can easily set a per-handshake-type filter (slight changes to HandshakeClosure and the filter function) and choose to have the handshake executed only by the target itself and the requester/any JavaThread/only the VM thread. So if we think JVM TI handshakes should only be executed by the requester or the target, it's an easy fix. If you think the default is wrong, it's also an easy change. (For others following: there is also a planned investigation into requester-only executed handshakes, which is not as easy.) > Thanks. # > src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 320: > >> 318: op.do_thread(_thread); >> 319: } else { >> 320: Handshake::execute(&op, _thread); > > We should still have the guarantee check that the target was alive.
Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 10:55:17 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 10:55:17 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: <__uAJJKEl-cvCeHySd7cSrEEkGefi4Yfui_Lpv6z53E=.2be65db7-a561-4296-8a8c-d8ca0f5901f7@github.com> On Mon, 21 Sep 2020 05:01:29 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/prims/jvmtiEventController.cpp line 339: > >> 337: hs.do_thread(target); >> 338: } else { >> 339: Handshake::execute(&hs, target); > > We should still have the guarantee check that the target is still alive. Fixed > src/hotspot/share/prims/jvmtiEnvBase.cpp line 692: > >> 690: // calling_thread is the thread that requested the list of monitors for java_thread. >> 691: // java_thread is thread owning the monitors. >> 692: // current_thread is thread executint this code, can be a non-JavaThread (e.g. VM Thread). > > Where is the use of the VMThread introduced as I cannot see it? The code was changed to use direct handshakes so that > we know that either the caller or the target (both JavaThreads) must be executing it. I don't see any VMOperation added > to call this code. The assumption that it must be an JavaThread was due to code such as "JavaThread::current();", any Thread can execute this. The default the first thread grabbing the handshake state lock for the targeted thread will execute all handshake operation it is allowed to. This might be the VM thread. > src/hotspot/share/runtime/handshake.cpp line 230: > >> 228: log_trace(handshake)("Threads signaled, begin processing blocked threads by VMThread"); >> 229: HandshakeSpinYield hsy(start_time_ns); >> 230: int executed_by_driver = 0; > > driver?? 
Isn't this still the VMThread? The driver is the VM thread or a JavaThread. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 11:00:42 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 11:00:42 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <7K8EUtQv1UsANs3q0n1V5PJ5zGmygDn160Cn9SXCPWc=.115b8e4a-12ae-4912-a7c8-b109506f77b7@github.com> On Mon, 21 Sep 2020 05:17:03 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > src/hotspot/share/runtime/handshake.cpp line 336: > >> 334: // and thus prevents reading stale data modified in the handshake closure >> 335: // by the Handshakee. >> 336: OrderAccess::acquire(); > > How/why is this deleted? Surely there are still single-thread VMops that use a handshake?? As I said in the PR description: "Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did." They are not VM ops, and by default we do not promise who will execute them, just that they will be executed on thread X. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 11:06:04 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 11:06:04 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:18:06 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/handshake.cpp line 313: > >> 311: } >> 312: >> 313: int executed_by_driver = 0; > > Again why driver??
Isn't it either the current thread or the target that will execute the op? The driver is the thread executing this handshake operation; which thread the current thread is doesn't matter. You can't use the current thread's resources or otherwise depend on it. The exception is locking in JVM TI, where we depend on the requester having locked the JVM TI state lock for us, but we do not depend on the current thread being the owner. So checking that the lock is held by the requester doesn't depend on who the 'driver' is. > src/hotspot/share/runtime/handshake.cpp line 453: > >> 451: // If all handshake operations for the handshakee are finished and someone >> 452: // just adds an operation we may see it here. But if the handshakee is not >> 453: // armed yet it is not safe to procced. > > s/procced/proceed/ Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 11:11:05 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 11:11:05 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: <4l2vRUllquBWTsz1t0brYAsPoBzNIKo1A7M9GBAcKNw=.da0e1992-ebff-4c31-bdc9-e6c293189276@github.com> On Mon, 21 Sep 2020 05:26:08 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/handshake.hpp line 44: > >> 42: // by the target JavaThread itself or, depending on whether the operation is >> 43: // a single target/direct handshake or not, by the JavaThread that requested the >> 44: // handshake or the VMThread respectively. > > This comment now indicates that all single target handshakes are executed as direct-handshakes and never by the > VMThread - is that correct? The concept of direct handshake does not exist in that way.
(but it can easily be implemented using the filter). You have an operation that you need executed on a JavaThread, so you add it to that JavaThread. Any thread (the "driver") that succeeds in claiming that JavaThread's handshake state (it takes the lock and that JavaThread is safe) proceeds to execute operations from that handshake queue until it is empty (empty according to the filter applied to the queue). > src/hotspot/share/runtime/handshake.hpp line 107: > >> 105: _claim_failed, >> 106: _processed, >> 107: _succeed, > > grammatically should be _succeeded Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 11:14:29 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 11:14:29 GMT Subject: RFR: 8238761: Asynchronous handshakes [v2] In-Reply-To: References: Message-ID: <-FquO1wFCKkgjCoNBYg7I3pAHIiujI6BrPKqHrOex6k=.54ca4124-ecdb-4fbb-92b7-d787d9ccae08@github.com> On Mon, 21 Sep 2020 05:31:59 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private > > src/hotspot/share/runtime/interfaceSupport.inline.hpp line 136: > >> 134: assert(_thread->thread_state() == _thread_in_vm, "should only call when leaving VM after handshake"); >> 135: >> 136: _thread->set_thread_state(_original_state); > > Can you clarify why this is no longer needed? What states can we be returning to? You can never call SafepointMechanism::process_if_requested() while in any 'safe-state' (as determined by the safepoint/handshake safe routines). We thus know we are transitioning from an unsafe state to an unsafe state; which unsafe state might be seen by e.g. the VM thread, we care little about.
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 11:49:30 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 11:49:30 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:42:29 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed double check, fix comment, removed not needed function, updated logs > > src/hotspot/share/runtime/thread.cpp line 481: > >> 479: #ifdef ASSERT >> 480: // A JavaThread is considered "dangling" if it is not the current >> 481: // thread, his not handshaking with current thread, as been added the Threads > > s/his/is/ ? > s/as been added the/has been added to the/ > > But rather than describe all the conditions, few of which are actually visible in the assertion below, why not just > rephrase in terms of the conditions checked i.e. // A JavaThread is considered dangling if it not handshake-safe with > respect to the current thread, or // it is not on a ThreadsList. > The is_handshake_safe_for method should describe all the conditions that make a target handshake safe. Fixed > src/hotspot/share/runtime/thread.cpp line 851: > >> 849: bool >> 850: JavaThread::is_thread_fully_suspended(bool wait_for_suspend, uint32_t *bits) { >> 851: if (this != Thread::current()) { > > Why/how is a non-JavaThread calling this? is_thread_fully_suspended() is used in JVM TI handshake and this change-set allow VM thread to execute them. This method have no dependency that caller is a JavaThread. > src/hotspot/share/runtime/thread.cpp line 2441: > >> 2439: assert(Thread::current()->is_VM_thread() || >> 2440: is_handshake_safe_for(Thread::current()), >> 2441: "should be in the vm thread, self or handshakee"); > > This seems too general. This should either be a VMoperation or a direct handshake, but not both. 
send_thread_stop() is only called from a handshake operation. Fixed. > src/hotspot/share/utilities/filterQueue.hpp line 57: > >> 55: >> 56: // MT-safe >> 57: // Since pops and adds are allowed while we add, we do not know if _first is same even if it's the same address. > > This comment seems out of context on the declaration of add, as it is describing a detail of the implementation - to > what are we comparing _first to be the same ?? If you want to just document the MT properties abstractedly then > describing this a lock-free would suffice. Though if pop requires locking then it's implementation is also constrained > to work with the lock-free version of add. Overall it is unclear how documenting "external serialization needed" > actually helps the user of this API. ?? Removed comment. OK. Since it's the same notation POSIX uses, I thought it was clear that any method marked MT-Unsafe is not safe to call from a multithreaded program. Any suggestion on how to separate the MT-safe from the MT-unsafe methods? > src/hotspot/share/utilities/filterQueue.inline.hpp line 42: > >> 40: break; >> 41: } >> 42: yield.wait(); > > Was the need for spinwaits identified through benchmarking? Do you really expect this to be hot? No and no. It was added to guard against any pathological case, such as the gtest stress tests. > src/hotspot/share/utilities/filterQueue.inline.hpp line 63: > >> 61: } >> 62: >> 63: // MT-Unsafe, external serialization needed. > > So IIUC this queue supports multiple concurrent add()s, but has to be restricted to a single pop() at a time (although > that is allowed to execute concurrently with the add()s) - correct? Yes > test/hotspot/jtreg/runtime/handshake/MixedHandshakeWalkStackTest.java line 38: > >> 36: >> 37: public class MixedHandshakeWalkStackTest { >> 38: public static Thread tthreads[]; > > why tthreads? It looks like a typo :) It stands for test threads; changed to testThreads.
> test/hotspot/jtreg/runtime/handshake/AsyncHandshakeWalkStackTest.java line 30: > >> 28: * @build AsyncHandshakeWalkStackTest >> 29: * @run driver ClassFileInstaller sun.hotspot.WhiteBox >> 30: * sun.hotspot.WhiteBox$WhiteBoxPermission > > This is not needed as ClassFileInstaller already handles WhiteBox's nested classes. Fixed > test/hotspot/jtreg/runtime/handshake/MixedHandshakeWalkStackTest.java line 30: > >> 28: * @build MixedHandshakeWalkStackTest >> 29: * @run driver ClassFileInstaller sun.hotspot.WhiteBox >> 30: * sun.hotspot.WhiteBox$WhiteBoxPermission > > Not needed - see earlier comment. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Mon Sep 21 12:19:09 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 21 Sep 2020 12:19:09 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: > This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes > are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) > Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an > arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first > pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first > pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. 
If needed, > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operations to be done out of order. Since the filter determines who executes the operation, and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all-JavaThreads-do and handshake > all threads. Since it is heap allocated, it cannot extend StackObj. I tested several ways to fix this, but those were very much > worse than this. > > - I added an is_handshake_safe_for for checking whether the current thread is operating on itself or is the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a > handshaker on a JavaThread, e.g. the VM thread can execute the handshake operation. > > - Added WB testing method. > > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. > > - Changed the handshake semaphores to a mutex to be able to handle deadlocks with lock ranking. > > - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. > > - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. > > - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshakes, but it's not needed at the > moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviewing. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Update after Dan and David - Merge branch 'master' into 8238761-asynchrounous-handshakes - Removed double check, fix comment, removed not needed function, updated logs - Fixed double checks Added NSV ProcessResult to enum Fixed logging Moved _active_handshaker to private - Rebase version 1.0 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/469f8fc8..badefa47 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=02-03 Stats: 9168 lines in 299 files changed: 4700 ins; 3543 del; 925 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From zgu at openjdk.java.net Mon Sep 21 12:35:22 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 21 Sep 2020 12:35:22 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Mon, 21 Sep 2020 05:42:00 GMT, Calvin Cheung wrote: >> I agree with David that the calls to `set_stack_base(NULL); set_stack_size(0);` do not seem to be related to this bug. >> Also, previously only `set_stack_base(NULL);` was called -- on all Threads. Now: >> - NonJavaThread: `set_stack_base(NULL); set_stack_size(0);` >> - JavaThread: neither operation is done >> >> I am wondering: >> >> - why the behavior is changed >> - is zeroing the stack_base/size necessary for NMT code only, or is it necessary for non-NMT operations as well? >> >> Will any test cases fail if you remove line 1340/1341? 
> > I've tried without clearing stack_base and stack_size and also reverted the change in stack_base() in thread.hpp. > It passed tier1 testing. I'll run more test. > Sorry but I just don't see any need for clearing stack_base or stack_size, and then you don't need to mess with the > assertion in stack_base(). Well, without cleaning stack_base and stack_size, the thread appears to be live and running, which contradicts NMT report (and maybe others, e.g. smap, etc.) My initial thought is to filter out terminated thread in Threads::print_on_error(), but it does not seem to have a way. ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Mon Sep 21 12:45:18 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 21 Sep 2020 12:45:18 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v5] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Mon, 21 Sep 2020 12:32:59 GMT, Zhengyu Gu wrote: >> I've tried without clearing stack_base and stack_size and also reverted the change in stack_base() in thread.hpp. >> It passed tier1 testing. I'll run more test. > >> Sorry but I just don't see any need for clearing stack_base or stack_size, and then you don't need to mess with the >> assertion in stack_base(). > > Well, without cleaning stack_base and stack_size, the thread appears to be live and running, which contradicts NMT > report (and maybe others, e.g. smap, etc.) > My initial thought is to filter out terminated thread in Threads::print_on_error(), but it does not seem to have a way. - A JavaThread guarantees that it deletes 'this' before the thread exits. - A NonJavaThread object can outlive its thread. Zeroing the thread stack base/size is a way to indicate the liveness of the actual thread. I was chasing a race before I realized the thread was actually dead; it is very confusing to see 'G1 Main Marker' as a live thread in the hs_err file.
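To illustrate the idea argued for here (an object that outlives its OS thread and uses zeroed stack fields as a liveness signal), a minimal sketch follows. It is a simplified model under stated assumptions, not HotSpot code: `SketchThread`, `record_stack()`, and `appears_live()` are hypothetical names, and `post_run()` only mimics the role of the hook mentioned in this thread.

```cpp
#include <cstddef>

// Hypothetical, simplified model of a thread object that may outlive
// the OS thread it describes (as a NonJavaThread can).
class SketchThread {
  void*       _stack_base = nullptr;
  std::size_t _stack_size = 0;

 public:
  // Recorded by the thread itself once it starts running.
  void record_stack(void* base, std::size_t size) {
    _stack_base = base;
    _stack_size = size;
  }

  // Last step executed by the exiting thread: the object stays around,
  // but it no longer claims to own a stack.
  void post_run() {
    _stack_base = nullptr;
    _stack_size = 0;
  }

  // An error reporter could use this to skip threads that have exited.
  bool appears_live() const { return _stack_base != nullptr; }

  void*       stack_base() const { return _stack_base; }
  std::size_t stack_size() const { return _stack_size; }
};
```

A reporter walking all thread objects could then print stack information only for objects where `appears_live()` is true, which avoids showing an exited thread as running.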
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Mon Sep 21 14:19:44 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 21 Sep 2020 14:19:44 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: > Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues > alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > post_run() method. 
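The description above can be sketched roughly as follows. This is a toy model under stated assumptions: `ToyTracker` and `ToyWorker` are hypothetical names standing in for NMT's bookkeeping and for a non-Java thread object, and `pre_run`/`post_run` only mimic the idea of hooks that run on the native thread before and after its main work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical tracker standing in for the record of live thread stacks.
struct ToyTracker {
  std::map<std::uintptr_t, std::size_t> stacks;  // base address -> size

  // Returns false if the range is already registered, which models the
  // "overwrite memory type for region" situation.
  bool register_stack(std::uintptr_t base, std::size_t size) {
    return stacks.emplace(base, size).second;
  }

  void unregister_stack(std::uintptr_t base) { stacks.erase(base); }
};

// Hypothetical worker object that may stay allocated after its native
// thread has exited.
class ToyWorker {
 public:
  ToyWorker(ToyTracker& tracker, std::uintptr_t base, std::size_t size)
      : _tracker(tracker), _base(base), _size(size) {}

  // Runs on the native thread before its main work.
  bool pre_run() { return _tracker.register_stack(_base, _size); }

  // Runs on the native thread just before it exits. Unregistering here,
  // rather than in a destructor, does not depend on the object being deleted.
  void post_run() { _tracker.unregister_stack(_base); }

 private:
  ToyTracker&    _tracker;
  std::uintptr_t _base;
  std::size_t    _size;
};
```

If the operating system hands the same address range to a later thread, registration succeeds only when the earlier worker has already run `post_run()`, even though the earlier object still exists.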
Zhengyu Gu has updated the pull request incrementally with two additional commits since the last revision: - Fix indents - Back out thread stack cleaning, to be addressed by JDK-8253429 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/185/files - new: https://git.openjdk.java.net/jdk/pull/185/files/6b927b54..7aaa383e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=185&range=04-05 Stats: 8 lines in 2 files changed: 0 ins; 7 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/185.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/185/head:pull/185 PR: https://git.openjdk.java.net/jdk/pull/185 From minqi at openjdk.java.net Mon Sep 21 16:45:10 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 21 Sep 2020 16:45:10 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v2] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:23:57 GMT, Ioi Lam wrote: >> (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. As a >> result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values >> that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic >> result. The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == >> true`. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes > the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last > revision: > - Zero out memory for newly allocated HashtableEntry > - Merge branch 'master' of https://github.com/iklam/jdk into 8253079-DeterministicDump-test-fails-product-build > - Revert "8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding" > > This reverts commit d380e0213283f5f28889d867bb2505a91781df6e. > - 8253079: runtime/cds/DeterministicDump.java fails due to garbage in structure padding looks good! ------------- Marked as reviewed by minqi (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/267 From ccheung at openjdk.java.net Mon Sep 21 17:07:53 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 21 Sep 2020 17:07:53 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: <6_OsKJj8O_Fy9y2k-mtEX1Cex-RToeR8BLpPh3vrqgo=.a2243835-a817-4d08-9d30-9583ef3696b5@github.com> On Mon, 21 Sep 2020 14:19:44 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >> post_run() method. > > Zhengyu Gu has updated the pull request incrementally with two additional commits since the last revision: > > - Fix indents > - Back out thread stack cleaning, to be addressed by JDK-8253429 This patch also passed the tests I ran before. 
------------- PR: https://git.openjdk.java.net/jdk/pull/185 From iklam at openjdk.java.net Mon Sep 21 17:12:54 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 17:12:54 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Mon, 21 Sep 2020 14:19:44 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >> post_run() method. > > Zhengyu Gu has updated the pull request incrementally with two additional commits since the last revision: > > - Fix indents > - Back out thread stack cleaning, to be addressed by JDK-8253429 The latest version looks good to me. ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/185 From vromero at openjdk.java.net Mon Sep 21 21:39:18 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Mon, 21 Sep 2020 21:39:18 GMT Subject: RFR: 8246774: Record Classes (final) implementation Message-ID: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Co-authored-by: Vicente Romero Co-authored-by: Harold Seigel Co-authored-by: Jonathan Gibbons Co-authored-by: Brian Goetz Co-authored-by: Maurizio Cimadamore Co-authored-by: Joe Darcy Co-authored-by: Chris Hegarty Co-authored-by: Jan Lahoda ------------- Commit messages: - 8246774: Record Classes (final) implementation Changes: https://git.openjdk.java.net/jdk/pull/290/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8246774 Stats: 495 lines in 95 files changed: 23 ins; 362 del; 110 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Mon Sep 21 21:39:18 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Mon, 21 Sep 2020 21:39:18 GMT Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Mon, 21 Sep 2020 21:30:51 GMT, Vicente Romero wrote: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Please review the fix for [1]. 
The intention of this patch is to make records final removing the need to use --enable-preview in order to be able to include a record declaration in a source or for the VM to execute code compiled from record classes, Thanks [1] https://bugs.openjdk.java.net/browse/JDK-8246774 ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From darcy at openjdk.java.net Mon Sep 21 21:56:01 2020 From: darcy at openjdk.java.net (Joe Darcy) Date: Mon, 21 Sep 2020 21:56:01 GMT Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Mon, 21 Sep 2020 21:36:39 GMT, Vicente Romero wrote: >> Co-authored-by: Vicente Romero >> Co-authored-by: Harold Seigel >> Co-authored-by: Jonathan Gibbons >> Co-authored-by: Brian Goetz >> Co-authored-by: Maurizio Cimadamore >> Co-authored-by: Joe Darcy >> Co-authored-by: Chris Hegarty >> Co-authored-by: Jan Lahoda > > Please review the fix for [1]. The intention of this patch is to make records final removing the need to > use --enable-preview in order to be able to include a record declaration in a source or for the VM to execute code > compiled from record classes, Thanks > > [1] https://bugs.openjdk.java.net/browse/JDK-8246774 Hi Vicente, Please file a separate subtask for the javax.lang.model changes. This helps with the JSR 269 MR paperwork. Thanks, -Joe ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From dcubed at openjdk.java.net Mon Sep 21 22:15:34 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 21 Sep 2020 22:15:34 GMT Subject: Integrated: 8247281: migrate ObjectMonitor::_object to OopStorage In-Reply-To: References: Message-ID: On Fri, 11 Sep 2020 18:45:28 GMT, Daniel D. 
Daugherty wrote: > This RFE is to migrate the following field to OopStorage: > > class ObjectMonitor { > > void* volatile _object; // backward object pointer - strong root > > Unlike the previous patches in this series, there are a lot of collateral > changes so this is not a trivial review. Sorry for the tedious parts of > the review. Since Erik and I are both contributors to this patch, we > would like at least 1 GC team reviewer and 1 Runtime team reviewer. > > This changeset was tested with Mach5 Tier[1-3],4,5,6,7,8 testing > along with JDK-8252980 and JDK-8252981. I also ran it through my > inflation stress kit for 48 hours on my Linux-X64 machine. This pull request has now been integrated. Changeset: d8921ed5 Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/d8921ed5 Stats: 457 lines in 37 files changed: 246 ins; 113 del; 98 mod 8247281: migrate ObjectMonitor::_object to OopStorage Co-authored-by: Erik Österlund Co-authored-by: Daniel Daugherty Reviewed-by: eosterlund, coleenp, dholmes, stefank, kbarrett, rkennke, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/135 From iklam at openjdk.java.net Mon Sep 21 22:19:04 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 22:19:04 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v3] In-Reply-To: References: Message-ID: > (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. As a > result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values > that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic > result. The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == > true`.
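A small sketch of the padding problem described above, assuming a hypothetical `ToyEntry` layout rather than the real `BasicHashtableEntry`: on a typical 64-bit ABI a 32-bit field followed by a pointer leaves 32 bits of padding, and whatever bytes were in the allocation before end up in a byte-for-byte dump unless the memory is cleared first. The `dirt` parameter only simulates leftover bytes from an allocator that does not zero memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entry: on LP64 there are usually 4 padding bytes after `hash`.
struct ToyEntry {
  std::uint32_t hash;
  ToyEntry*     next;
};

// Builds an entry inside `buf`, which must be suitably aligned and at
// least sizeof(ToyEntry) bytes long.
ToyEntry* make_entry(void* buf, unsigned char dirt, std::uint32_t hash,
                     bool zero_first) {
  std::memset(buf, dirt, sizeof(ToyEntry));  // simulate stale allocator bytes
  if (zero_first) {
    std::memset(buf, 0, sizeof(ToyEntry));   // the clearing step
  }
  ToyEntry* e = static_cast<ToyEntry*>(buf);
  e->hash = hash;     // field stores are not expected to touch the padding
  e->next = nullptr;
  return e;
}
```

Comparing two such buffers with `memcmp` after building identical entries shows whether a raw dump of them would be reproducible. The sketch leans on the common behaviour that scalar member stores leave padding bytes untouched, which the language does not strictly guarantee.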
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: also reset trace_id ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/267/files - new: https://git.openjdk.java.net/jdk/pull/267/files/225c9272..5706a821 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=267&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=267&range=01-02 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/267/head:pull/267 PR: https://git.openjdk.java.net/jdk/pull/267 From iklam at openjdk.java.net Mon Sep 21 22:22:30 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 21 Sep 2020 22:22:30 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:45:34 GMT, Aleksey Shipilev wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> also reset trace_id > > src/hotspot/share/classfile/moduleEntry.cpp line 457: > >> 455: _location = ArchiveBuilder::get_relocated_symbol(_location); >> 456: } >> 457: JFR_ONLY(memset(trace_id_addr(), 0, sizeof(traceid))); > > `memset` looks dodgy here. Maybe `JFR_ONLY(set_trace_id(0))`? I removed these from my previous commit, because the trace_id seemed to be predictable and wouldn't cause the archive content to change. Anyway, I've added the code you suggested in commit [5706a82](https://github.com/openjdk/jdk/pull/267/commits/5706a821a4b96c1c44c44ccf6d78c0659a6f2976). That way, if trace_ids become unpredictable due to future JFR implementation changes, the CDS code won't be affected. 
------------- PR: https://git.openjdk.java.net/jdk/pull/267 From jiefu at openjdk.java.net Mon Sep 21 22:52:53 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 21 Sep 2020 22:52:53 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 22:19:04 GMT, Ioi Lam wrote: >> (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. As a >> result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values >> that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic >> result. The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == >> true`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > also reset trace_id Marked as reviewed by jiefu (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/267 From coleenp at openjdk.java.net Mon Sep 21 23:00:28 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 21 Sep 2020 23:00:28 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 12:19:09 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. 
Inserts are done via CAS on the first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and, when working on the first >> pointer, also via CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operations to be done out of order. Since the filter determines who executes the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all JavaThreads do and Handshake >> all threads. With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much >> worse than this. >> >> - I added an is_handshake_safe_for for checking if the current thread is operating on itself or is the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to a mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations.
>> >> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. >> >> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviewing. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since > the last revision: > - Update after Dan and David > - Merge branch 'master' into 8238761-asynchrounous-handshakes > - Removed double check, fix comment, removed not needed function, updated logs > - Fixed double checks > Added NSV > ProcessResult to enum > Fixed logging > Moved _active_handshaker to private > - Rebase version 1.0 Looks mostly good to me! src/hotspot/share/runtime/handshake.hpp line 55: > 53: }; > 54: > 55: class AsynchHandshakeClosure : public HandshakeClosure { Can you make this minor change? Asynch to English speakers looks like a-cinch and if left as Async is a-sink. Can you remove the 'h's? I see David above left off the extra h, which is what one expects this to be named. src/hotspot/share/runtime/handshake.hpp line 78: > 76: FilterQueue _queue; > 77: Mutex _lock; > 78: Thread* _active_handshaker; Nit: can you line up the data member names for _lock and _active_handshaker? src/hotspot/share/runtime/handshake.cpp line 394: > 392: { > 393: NoSafepointVerifier nsv; > 394: process_self_inner(); Can you remove process_self_inner and just inline it here since this is its only caller and both are short functions? If you don't want to, that's fine.
I found myself searching for any other callers of this, that's all. ------------- Changes requested by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Mon Sep 21 23:00:31 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 21 Sep 2020 23:00:31 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> On Mon, 21 Sep 2020 21:19:32 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since >> the last revision: >> - Update after Dan and David >> - Merge branch 'master' into 8238761-asynchrounous-handshakes >> - Removed double check, fix comment, removed not needed function, updated logs >> - Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private >> - Rebase version 1.0 > > src/hotspot/share/runtime/handshake.hpp line 78: > >> 76: FilterQueue _queue; >> 77: Mutex _lock; >> 78: Thread* _active_handshaker; > > Nit: can you line up the data member names for lock and _active_handshaker ? FilterQueue _queue; JavaThread* _handshakee; Mutex _lock; Thread* _active_handshaker; Isn't this nicer? (it didn't keep the formatting in the comment) ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Mon Sep 21 23:00:31 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 21 Sep 2020 23:00:31 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 11:02:10 GMT, Robbin Ehn wrote: >> src/hotspot/share/runtime/handshake.cpp line 313: >> >>> 311: } >>> 312: >>> 313: int executed_by_driver = 0; >> >> Again why driver?? 
Isn't it either the current thread or the target that will execute the op? > > The thread executing this handshake operation, what the current thread is doesn't matter. > You can't use current threads resources or be dependent otherwise on it. > > Exception being locking issues in JVM TI, where we are dependent that requester have locked JVM TI state lock for us, > but we are not dependent that the current thread is the owner. So checking that the lock is held by requester doesn't > matter for how is the 'driver'. The "driver" concept is odd. Should it really be caller? Like the thread that called VMHandshake? ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Mon Sep 21 23:03:45 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 21 Sep 2020 23:03:45 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 12:19:09 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. 
In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operations to be done out of order. Since the filter determines who executes the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both all JavaThreads do and Handshake >> all threads. With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much >> worse than this. >> >> - I added an is_handshake_safe_for for checking if the current thread is operating on itself or is the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to a mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. >> >> - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs.
>> Been through some pre-reviewing. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev > excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since > the last revision: > - Update after Dan and David > - Merge branch 'master' into 8238761-asynchrounous-handshakes > - Removed double check, fix comment, removed not needed function, updated logs > - Fixed double checks > Added NSV > ProcessResult to enum > Fixed logging > Moved _active_handshaker to private > - Rebase version 1.0 test/hotspot/jtreg/runtime/handshake/HandshakeDirectTest.java line 42: > 40: public class HandshakeDirectTest implements Runnable { > 41: static final int WORKING_THREADS = 32; > 42: static final int DIRECT_HANDSHAKES_MARK = 500000; Could this time out? ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From minqi at openjdk.java.net Mon Sep 21 23:22:39 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 21 Sep 2020 23:22:39 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 18:53:55 GMT, Ioi Lam wrote: > We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` > too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been > removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call > `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to > do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other > subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in > dynamicArchive.cpp and moved it to archiveBuilder.cpp.
So this PR is one step forward for [JDK-8234693 Consolidate CDS > static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case > where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. > ---- > > We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, > but we might have to do more fixes in the future. Looks good to me. ------------- Marked as reviewed by minqi (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/227 From vromero at openjdk.java.net Mon Sep 21 23:24:57 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Mon, 21 Sep 2020 23:24:57 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v2] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: <_6qw2pA7HDrTWsZPIpQOAjSE14-9iULneDMN_o9ixNw=.0a72cfba-bb8c-454e-bfc7-44553f11bbe1@github.com> > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: removing the javax.lang.model related code to be moved to a separate bug ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/290/files - new: https://git.openjdk.java.net/jdk/pull/290/files/9eedb3ab..543e5054 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=00-01 Stats: 134 lines in 12 files changed: 130 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch 
https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Mon Sep 21 23:24:58 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Mon, 21 Sep 2020 23:24:58 GMT Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Mon, 21 Sep 2020 21:53:05 GMT, Joe Darcy wrote: >> Please review the fix for [1]. The intention of this patch is to make records final removing the need to >> use --enable-preview in order to be able to include a record declaration in a source or for the VM to execute code >> compiled from record classes, Thanks >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8246774 > > Hi Vicente, > Please file a separate subtask for the javax.lang.model changes. This helps with the JSR 269 MR paperwork. > Thanks, > -Joe note: I have removed from the original patch the code related to javax.lang.model, I will publish them in a separate PR ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From ccheung at openjdk.java.net Mon Sep 21 23:58:21 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 21 Sep 2020 23:58:21 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes In-Reply-To: References: Message-ID: On Thu, 17 Sep 2020 18:53:55 GMT, Ioi Lam wrote: > We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` > too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been > removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call > `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to > do so because these copies are visible only to the CDS dumping code. 
They aren't accessible by the GC or any other > subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in > dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS > static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case > where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. > ---- > > We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, > but we might have to do more fixes in the future. src/hotspot/share/memory/heapShared.cpp line 401: > 399: } > 400: > 401: assert(relocated_k->is_shared(), "must be a shared class"); Why is this assert removed? ------------- PR: https://git.openjdk.java.net/jdk/pull/227 From jcm at openjdk.java.net Tue Sep 22 01:56:31 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 01:56:31 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 Message-ID: if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != Unpack_uncommon_trap) { assert(thread->has_pending_exception(), "should have thrown OOME/Async"); introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. Request for review. 
------------- Commit messages: - fixing buggy code introduced in 8249451 Changes: https://git.openjdk.java.net/jdk/pull/292/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=292&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253447 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/292.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/292/head:pull/292 PR: https://git.openjdk.java.net/jdk/pull/292 From jcm at openjdk.java.net Tue Sep 22 02:05:01 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 02:05:01 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 01:48:53 GMT, Jamsheed Mohammed C M wrote: > if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != > Unpack_uncommon_trap) { > assert(thread->has_pending_exception(), "should have thrown OOME/Async"); > > introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. > > This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced > in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. > Request for review. @dholmes-ora @veresov @fisk could you please have a look. 
------------- PR: https://git.openjdk.java.net/jdk/pull/292 From jcm at openjdk.java.net Tue Sep 22 02:05:01 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 02:05:01 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: References: Message-ID: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> > if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != > Unpack_uncommon_trap) { > assert(thread->has_pending_exception(), "should have thrown OOME/Async"); > > introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. > > This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced > in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. > Request for review. Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: fixing the assert message too ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/292/files - new: https://git.openjdk.java.net/jdk/pull/292/files/d81ce188..4eea9a95 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=292&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=292&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/292.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/292/head:pull/292 PR: https://git.openjdk.java.net/jdk/pull/292 From iveresov at openjdk.java.net Tue Sep 22 02:25:21 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 22 Sep 2020 02:25:21 GMT Subject: RFR: 8249451: Unconditional exceptions clearing logic in compiler code should honor Async Exceptions. 
[v4] In-Reply-To: References: <2zjS36Nz0zH4AorRbppunfKPFkciaMD865WyBdMzOFI=.fc7a6fd1-96b4-4769-ab0b-b71e7f5bdc9b@github.com> Message-ID: <2cAu7aYUryynOP5VQhG5B7OR3LCDqjjOfyXvCmEK1cE=.66a98839-c47f-446e-acfc-e65ec1f70407@github.com> On Wed, 16 Sep 2020 09:36:28 GMT, Jamsheed Mohammed C M wrote: >> Hi >> >> Moving the review that is based on mercurial repo to github. >> The history of conversation is >> [here](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-September/039861.html) >> Issue:[ JDK-8249451 ](https://bugs.openjdk.java.net/browse/JDK-8249451) >> >> @dholmes-ora could you please have a look. > > Jamsheed Mohammed C M has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. The pull request contains one new > commit since the last revision: > comment modified wrt review feedback Looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/169 From iveresov at openjdk.java.net Tue Sep 22 02:26:07 2020 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 22 Sep 2020 02:26:07 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> Message-ID: On Tue, 22 Sep 2020 02:05:01 GMT, Jamsheed Mohammed C M wrote: >> if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != >> Unpack_uncommon_trap) { >> assert(thread->has_pending_exception(), "should have thrown OOME/Async"); >> >> introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. >> >> This can have consequences as the deopt entries may have additional logic depending on bci's. 
and the change introduced >> in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. >> Request for review. > > Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: > > fixing the assert message too Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/292 From dholmes at openjdk.java.net Tue Sep 22 02:37:56 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Sep 2020 02:37:56 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> Message-ID: On Tue, 22 Sep 2020 02:05:01 GMT, Jamsheed Mohammed C M wrote: >> if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != >> Unpack_uncommon_trap) { >> assert(thread->has_pending_exception(), "should have thrown OOME/Async"); >> >> introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. >> >> This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced >> in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. >> Request for review. > > Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: > > fixing the assert message too src/hotspot/share/runtime/deoptimization.cpp line 531: > 529: #endif > 530: > 531: if (thread->frames_to_pop_failed_realloc() > 0 && exec_mode != Unpack_uncommon_trap) { I'm not at all clear on whether an async-exception could be pending at this point. The original change indicated it could be, but now you are saying it can't. 
How is that known? ------------- PR: https://git.openjdk.java.net/jdk/pull/292 From minqi at openjdk.java.net Tue Sep 22 02:46:23 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 22 Sep 2020 02:46:23 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 23:54:39 GMT, Calvin Cheung wrote: >> We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` >> too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been >> removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call >> `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to >> do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other >> subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in >> dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS >> static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case >> where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. >> ---- >> >> We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, >> but we might have to do more fixes in the future. > > src/hotspot/share/memory/heapShared.cpp line 401: > >> 399: } >> 400: >> 401: assert(relocated_k->is_shared(), "must be a shared class"); > > Why is this assert removed? The delete is reasonable since line 293 already assert that relocated_k is a relocated klass in shared region which must be shared class I think. 
------------- PR: https://git.openjdk.java.net/jdk/pull/227 From iklam at openjdk.java.net Tue Sep 22 02:58:01 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 02:58:01 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes [v2] In-Reply-To: References: Message-ID: > We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` > too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been > removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call > `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to > do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other > subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in > dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS > static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case > where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. > ---- > > We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, > but we might have to do more fixes in the future. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - restore assert for shared class - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes - 8251261: CDS dumping should not clear states in live classes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/227/files - new: https://git.openjdk.java.net/jdk/pull/227/files/964ae342..03f74855 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=227&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=227&range=00-01 Stats: 7713 lines in 258 files changed: 4247 ins; 2785 del; 681 mod Patch: https://git.openjdk.java.net/jdk/pull/227.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/227/head:pull/227 PR: https://git.openjdk.java.net/jdk/pull/227 From iklam at openjdk.java.net Tue Sep 22 03:04:06 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 03:04:06 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 02:43:52 GMT, Yumin Qi wrote: >> src/hotspot/share/memory/heapShared.cpp line 401: >> >>> 399: } >>> 400: >>> 401: assert(relocated_k->is_shared(), "must be a shared class"); >> >> Why is this assert removed? > > The delete is reasonable since line 293 already assert that relocated_k is a relocated klass in shared region which > must be shared class I think. `Klass::set_is_shared()` is called in `Klass::remove_unshareable_info()`. This PR has moved the calls to `remove_unshareable_info()` to a later stage. So when this assert is executed, `relocated_k->is_shared()` is false. 
In the new commit [03f7485](https://github.com/openjdk/jdk/pull/227/commits/03f748555f304d4cc4d180ab989617bcebe508fb) I've restored the assert and changed it to the following: assert(ArchiveBuilder::singleton()->is_in_buffer_space(relocated_k), "must be a shared class"); ------------- PR: https://git.openjdk.java.net/jdk/pull/227 From dholmes at openjdk.java.net Tue Sep 22 03:14:32 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Sep 2020 03:14:32 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 22:57:55 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since >> the last revision: >> - Update after Dan and David >> - Merge branch 'master' into 8238761-asynchrounous-handshakes >> - Removed double check, fix comment, removed not needed function, updated logs >> - Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private >> - Rebase version 1.0 > > Looks mostly good to me! Hi Robbin, I've gone back to refresh myself on the previous discussions and (internal) design walk-throughs to get a better sense of these changes. Really the "asynchronous handshake" aspect is only a small part of this. The fundamental change here is that handshakes are now maintained via a per-thread queue, and those handshake operations can, in the general case, be executed by any of the target thread, the requestor (active_handshaker) thread or the VMThread. Hence the removal of the various "JavaThread::current()" assumptions. Unless constrained otherwise, any handshake operation may be executed by the VMThread so we have to take extra care to ensure the code is written to allow this. 
I'm a little concerned that our detour into direct-handshakes actually lulled us into a false sense of security knowing that an operation would always execute in a JavaThread, and we have now reverted that and allowed the VMThread back in. I understand why, but the change in direction here caught me by surprise (as I had forgotten the bigger picture). It may not always be obvious that the transitive closure of the code from an operation can be safely executed by a non-JavaThread. Then on top of this generalized queuing mechanism there is a filter which allows some control over which thread may perform a given operation - at the moment the only filter isolates "async" operations which only the target thread can execute. In addition another nuance is that when processing a given thread's handshake operation queue, different threads have different criteria for when to stop processing the queue: - the target thread will drain the queue completely - the VMThread will drain the queue of all "non-async" operations** - the initiator for a "direct" operation will drain the queue up to, and including, the synchronous operation they submitted - the initiator for an "async" operation will not process any operation themselves and will simply add to the queue and then continue on their way (hence the "asynchronous") ** I do have some concerns about latency impact on the VMThread if it is used to execute operations that didn't need to be executed by the VMThread! I remain concerned about the terminology conflation that happens around "async handshakes". There are two aspects that need to be separated: - the behaviour of the thread initiating the handshake operation; and - which thread can execute the handshake operation When a thread initiates a handshake operation and waits until that operation is complete (regardless of which thread performed it, or whether the initiator processed any other operations) that is a synchronous handshake operation. 
When a thread initiates a handshake operation and does not wait for the operation to complete (it does the target->queue()->add(op); and continues on its way) that is an asynchronous handshake operation. The question of whether the operation must be executed by the target thread is orthogonal to whether the operation was submitted as a synchronous or asynchronous operation. So I have problem when you say that an asynchronous handshake operation is one that must be executed by the target thread, as this is not the right characterisation at all. It is okay to constrain things such that an async operation is always executed by the target, but that is not what makes it an async operation. In the general case there is no reason why an async operation might not be executed by the VMThread, or some other JavaThread performing a synchronous operation on the same target. I will go back through the actual handshake code to see if there are specific things I would like to see changed, but that will have to wait until tomorrow. 
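The queue-draining criteria described above can be illustrated with a minimal sketch. This is a hypothetical model, not the HotSpot code: the names `Processor`, `HandshakeOp` and `drain` are invented for illustration, and it assumes that the "async" filter also applies to a direct initiator (it skips async operations rather than executing them).

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model of a per-thread handshake queue; none of these
// names come from HotSpot.
enum class Processor { Target, VMThread, DirectInitiator };

struct HandshakeOp {
  std::string name;
  bool is_async;  // assumption: async operations may only run on the target thread
};

// Executes (here: records and removes) the operations that `who` is allowed
// to process, in queue order, and returns their names.
// `own_op` names the synchronous operation a direct initiator submitted.
std::vector<std::string> drain(std::deque<HandshakeOp>& queue,
                               Processor who,
                               const std::string& own_op = "") {
  // A direct initiator must say which operation it is waiting for.
  assert(who != Processor::DirectInitiator || !own_op.empty());

  std::vector<std::string> executed;
  for (auto it = queue.begin(); it != queue.end();) {
    // Filter: the target may run anything; everyone else skips async operations.
    bool may_run = (who == Processor::Target) || !it->is_async;
    if (!may_run) {
      ++it;
      continue;
    }
    std::string name = it->name;
    executed.push_back(name);
    it = queue.erase(it);  // erase returns the iterator to the next element

    // A direct initiator stops once its own operation has been executed.
    if (who == Processor::DirectInitiator && name == own_op) {
      break;
    }
  }
  return executed;
}
```

With a queue holding a, b (async), c, d, a direct initiator waiting for c would execute a and c in this model, the VMThread would then execute d, and only the target thread would execute b. The initiator of an async operation does not call `drain` at all; it only appends to the queue and continues.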
Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From jcm at openjdk.java.net Tue Sep 22 03:15:59 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 03:15:59 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> Message-ID: <4UXzkGK0txAGnug89GEPNmT58axvNWBQ1aNwtb85uBI=.f08c488e-79e3-4c73-adc6-54788999eff7@github.com> On Tue, 22 Sep 2020 02:35:17 GMT, David Holmes wrote: >> Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: >> >> fixing the assert message too > > src/hotspot/share/runtime/deoptimization.cpp line 531: > >> 529: #endif >> 530: >> 531: if (thread->frames_to_pop_failed_realloc() > 0 && exec_mode != Unpack_uncommon_trap) { > > I'm not at all clear on whether an async-exception could be pending at this point. The original change indicated it > could be, but now you are saying it can't. How is that known? An async exception can be pending for Deoptimization::load_class_by_index (the case where Java code is executed). This can happen for C2, and probably for JVMCI too. https://github.com/openjdk/jdk/blob/d1f9b8a8b54843f06a93078c4a058af86fcc2aac/src/hotspot/share/runtime/deoptimization.cpp#L1964 In all cases the deopt entries are equipped to handle pending exceptions. In my previous code I incorrectly tried to handle it via the Unpack_exception route. That can have the implication that I am at method entry and don't handle locks properly. Now I simply leave it to the deopt entries to handle pending exceptions.
------------- PR: https://git.openjdk.java.net/jdk/pull/292 From dholmes at openjdk.java.net Tue Sep 22 04:16:30 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Sep 2020 04:16:30 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: <4UXzkGK0txAGnug89GEPNmT58axvNWBQ1aNwtb85uBI=.f08c488e-79e3-4c73-adc6-54788999eff7@github.com> References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> <4UXzkGK0txAGnug89GEPNmT58axvNWBQ1aNwtb85uBI=.f08c488e-79e3-4c73-adc6-54788999eff7@github.com> Message-ID: <4AAQXW3Vy9cWsNdT55xzsW51g2voP0hktq_dGTwR6y0=.cab032ef-a8e3-441c-ad9c-ebf2a2fb8075@github.com> On Tue, 22 Sep 2020 03:13:04 GMT, Jamsheed Mohammed C M wrote: >> src/hotspot/share/runtime/deoptimization.cpp line 531: >> >>> 529: #endif >>> 530: >>> 531: if (thread->frames_to_pop_failed_realloc() > 0 && exec_mode != Unpack_uncommon_trap) { >> >> I'm not at all clear on whether an async-exception could be pending at this point. The original change indicated it >> could be, but now you are saying it can't. How is that known? > > Async-exception can be pending for Deoptimization::load_class_by_index(Java code executed case). This can happen for > C2/ and probably for JVMCI too. > https://github.com/openjdk/jdk/blob/d1f9b8a8b54843f06a93078c4a058af86fcc2aac/src/hotspot/share/runtime/deoptimization.cpp#L1964 > In all cases deopt entries are equipped to handle pending exceptions. > In my previous code I incorrectly tried to handle it using Unpack_exception route. this can have implication that I am > at method entry and I don't handle locks properly. > now i simply leave it to the deopt entries to handle pending exceptions. Okay. I'm not familiar with that code at all so will leave this for compiler folk. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/292 From ccheung at openjdk.java.net Tue Sep 22 04:42:14 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Tue, 22 Sep 2020 04:42:14 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 02:58:01 GMT, Ioi Lam wrote: >> We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` >> too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been >> removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call >> `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to >> do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other >> subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in >> dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS >> static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case >> where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. >> ---- >> >> We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, >> but we might have to do more fixes in the future. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes > the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last > revision: > - restore assert for shared class > - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes > - 8251261: CDS dumping should not clear states in live classes Looks good. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/227 From eosterlund at openjdk.java.net Tue Sep 22 06:23:34 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 22 Sep 2020 06:23:34 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: <4AAQXW3Vy9cWsNdT55xzsW51g2voP0hktq_dGTwR6y0=.cab032ef-a8e3-441c-ad9c-ebf2a2fb8075@github.com> References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> <4UXzkGK0txAGnug89GEPNmT58axvNWBQ1aNwtb85uBI=.f08c488e-79e3-4c73-adc6-54788999eff7@github.com> <4AAQXW3Vy9cWsNdT55xzsW51g2voP0hktq_dGTwR6y0=.cab032ef-a8e3-441c-ad9c-ebf2a2fb8075@github.com> Message-ID: On Tue, 22 Sep 2020 04:13:26 GMT, David Holmes wrote: >> Async-exception can be pending for Deoptimization::load_class_by_index(Java code executed case). This can happen for >> C2/ and probably for JVMCI too. >> https://github.com/openjdk/jdk/blob/d1f9b8a8b54843f06a93078c4a058af86fcc2aac/src/hotspot/share/runtime/deoptimization.cpp#L1964 >> In all cases deopt entries are equipped to handle pending exceptions. >> In my previous code I incorrectly tried to handle it using Unpack_exception route. this can have implication that I am >> at method entry and I don't handle locks properly. >> now i simply leave it to the deopt entries to handle pending exceptions. > > Okay. I'm not familiar with that code at all so will leave this for compiler folk. Thanks. So basically when you get here through the uncommon trap path, you have just called a JRT_ENTRY function and returned from it by now. That means you can have an async exception installed as a pending exception. 
If you just leave it there, the deopt entry of the interpreter will check for it and throw it. So we were already equipped to deal with this, and that is what should happen. The code that clears the pending exception and sets the exception oop instead is for when you are unwinding due to exception throwing into a deoptimized frame. The deopt handler is returned as exception handler PC for such frames, and hence needs to quack like an exception handler. But that is not at all the scenario we are in when we go through the uncommon trap; we are not in the middle of throwing an exception. Conversely, we are just about to throw it - that difference is the crucial thing. So clearing the pending exception is likely to just make it disappear (or crash later). ------------- PR: https://git.openjdk.java.net/jdk/pull/292 From eosterlund at openjdk.java.net Tue Sep 22 06:23:34 2020 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 22 Sep 2020 06:23:34 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> Message-ID: On Tue, 22 Sep 2020 02:05:01 GMT, Jamsheed Mohammed C M wrote: >> if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != >> Unpack_uncommon_trap) { >> assert(thread->has_pending_exception(), "should have thrown OOME/Async"); >> >> introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. >> >> This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced >> in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. >> Request for review. 
> > Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: > > fixing the assert message too Looks good, thanks for fixing this. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/292 From jcm at openjdk.java.net Tue Sep 22 06:29:38 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 06:29:38 GMT Subject: RFR: 8253447: Remove buggy code introduced by 8249451 [v2] In-Reply-To: References: <-1narCtDccARvrE8LbvdKMuaDtBU9f_08MJjysQ7xB0=.19e1536d-5d9d-4e4b-a1fc-9249e735fbd1@github.com> Message-ID: On Tue, 22 Sep 2020 06:20:30 GMT, Erik Österlund wrote: >> Jamsheed Mohammed C M has updated the pull request incrementally with one additional commit since the last revision: >> >> fixing the assert message too > > Looks good, thanks for fixing this. Thank you @dholmes-ora @veresov @fisk ------------- PR: https://git.openjdk.java.net/jdk/pull/292 From jcm at openjdk.java.net Tue Sep 22 06:29:39 2020 From: jcm at openjdk.java.net (Jamsheed Mohammed C M) Date: Tue, 22 Sep 2020 06:29:39 GMT Subject: Integrated: 8253447: Remove buggy code introduced by 8249451 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 01:48:53 GMT, Jamsheed Mohammed C M wrote: > if ((thread->has_pending_exception() || thread->frames_to_pop_failed_realloc() > 0) && exec_mode != > Unpack_uncommon_trap) { > assert(thread->has_pending_exception(), "should have thrown OOME/Async"); > > introduced a buggy code checking, clearing pending exception and taking Unpack_exception route. > > This can have consequences as the deopt entries may have additional logic depending on bci's. and the change introduced > in 8249451 doesn't honor deopt exception checking and forward logic. Thank you @fisk for pointing the bug in the code. > Request for review. This pull request has now been integrated.
Changeset: f7b1ce45 Author: Jamsheed Mohammed C M URL: https://git.openjdk.java.net/jdk/commit/f7b1ce45 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8253447: Remove buggy code introduced by 8249451 Reviewed-by: iveresov, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/292 From rehn at openjdk.java.net Tue Sep 22 07:02:30 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 07:02:30 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: <_DibVw4wgsoLZ-zT_Shdu0JJ877r-6SRDlP_vBsrHe8=.81d41756-e490-4451-a24d-8fc89938e27e@github.com> On Tue, 22 Sep 2020 03:12:01 GMT, David Holmes wrote: >> Looks mostly good to me! > > Hi Robbin, > > I've gone back to refresh myself on the previous discussions and (internal) design walk-throughs to get a better sense > of these changes. Really the "asynchronous handshake" aspect is only a small part of this. The fundamental change here > is that handshakes are now maintained via a per-thread queue, and those handshake operations can, in the general case, > be executed by any of the target thread, the requestor (active_handshaker) thread or the VMThread. Hence the removal of > the various "JavaThread::current()" assumptions. Unless constrained otherwise, any handshake operation may be executed > by the VMThread so we have to take extra care to ensure the code is written to allow this. I'm a little concerned that > our detour into direct-handshakes actually lulled us into a false sense of security knowing that an operation would > always execute in a JavaThread, and we have now reverted that and allowed the VMThread back in. I understand why, but > the change in direction here caught me by surprise (as I had forgotten the bigger picture). It may not always be > obvious that the transitive closure of the code from an operation can be safely executed by a non-JavaThread. 
Then on > top of this generalized queuing mechanism there is a filter which allows some control over which thread may perform a > given operation - at the moment the only filter isolates "async" operations which only the target thread can execute. > In addition another nuance is that when processing a given thread's handshake operation queue, different threads have > different criteria for when to stop processing the queue: > - the target thread will drain the queue completely > - the VMThread will drain the queue of all "non-async" operations** > - the initiator for a "direct" operation will drain the queue up to, and including, the synchronous operation they > submitted > - the initiator for an "async" operation will not process any operation themselves and will simply add to the queue and > then continue on their way (hence the "asynchronous") > > ** I do have some concerns about latency impact on the VMThread if it is used to execute operations that didn't need to be > executed by the VMThread! > > I remain concerned about the terminology conflation that happens around "async handshakes". There are two aspects that > need to be separated: > - the behaviour of the thread initiating the handshake operation; and > - which thread can execute the handshake operation > > When a thread initiates a handshake operation and waits until that operation is complete (regardless of which thread > performed it, or whether the initiator processed any other operations) that is a synchronous handshake operation. When > a thread initiates a handshake operation and does not wait for the operation to complete (it does the > target->queue()->add(op); and continues on its way) that is an asynchronous handshake operation. The question of > whether the operation must be executed by the target thread is orthogonal to whether the operation was submitted as a > synchronous or asynchronous operation. 
So I have problem when you say that an asynchronous handshake operation is one > that must be executed by the target thread, as this is not the right characterisation at all. It is okay to constrain > things such that an async operation is always executed by the target, but that is not what makes it an async operation. > In the general case there is no reason why an async operation might not be executed by the VMThread, or some other > JavaThread performing a synchronous operation on the same target. I will go back through the actual handshake code to > see if there are specific things I would like to see changed, but that will have to wait until tomorrow. Thanks, David Hi David, you are correct and you did a fine job summarizing this, thanks! > I will go back through the actual handshake code to see if there are specific things I would like to see changed, but > that will have to wait until tomorrow. Great thanks! > > Thanks, > David # ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From shade at openjdk.java.net Tue Sep 22 07:19:21 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Sep 2020 07:19:21 GMT Subject: RFR: 8253079: DeterministicDump.java fails due to garbage in structure padding [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 22:19:04 GMT, Ioi Lam wrote: >> (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. As a >> result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values >> that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic >> result. The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == >> true`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > also reset trace_id Marked as reviewed by shade (Reviewer). 
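The padding problem described in the 8253079 review above can be illustrated with a minimal sketch. This is not the HotSpot change itself: `Entry`, `new_entry` and `same_bytes` are hypothetical names, and the sketch assumes (as compilers do in practice, although the language does not guarantee it) that storing to individual members leaves padding bytes alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a hashtable entry: on typical 64-bit ABIs the
// compiler inserts 4 padding bytes between _hash and _next.
struct Entry {
  uint32_t _hash;
  Entry*   _next;
};

// Number of padding bytes between _hash and _next (0 on ABIs where
// pointers only need 4-byte alignment).
constexpr std::size_t padding_after_hash() {
  return offsetof(Entry, _next) - (offsetof(Entry, _hash) + sizeof(uint32_t));
}

// Allocates an Entry from memory pre-filled with `fill`, which simulates
// whatever the allocator happened to leave behind. If `zero_first` is true
// the block is cleared before the fields are set, mirroring the idea of
// zeroing newly allocated entries when dumping.
Entry* new_entry(unsigned char fill, bool zero_first, uint32_t hash) {
  void* raw = std::malloc(sizeof(Entry));
  if (raw == nullptr) return nullptr;
  std::memset(raw, fill, sizeof(Entry));
  if (zero_first) std::memset(raw, 0, sizeof(Entry));
  Entry* e = static_cast<Entry*>(raw);
  e->_hash = hash;
  e->_next = nullptr;
  return e;
}

// Byte-wise comparison, which is effectively what comparing two archive
// dumps does: padding bytes take part in it.
bool same_bytes(const Entry* a, const Entry* b) {
  return std::memcmp(a, b, sizeof(Entry)) == 0;
}
```

Two entries built with `zero_first == true` compare equal byte for byte regardless of the initial fill, whereas without zeroing any padding reported by `padding_after_hash()` would keep whatever the fill left there.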
------------- PR: https://git.openjdk.java.net/jdk/pull/267 From rehn at openjdk.java.net Tue Sep 22 07:34:16 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 07:34:16 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 21:18:20 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since >> the last revision: >> - Update after Dan and David >> - Merge branch 'master' into 8238761-asynchrounous-handshakes >> - Removed double check, fix comment, removed not needed function, updated logs >> - Fixed double checks >> Added NSV >> ProcessResult to enum >> Fixed logging >> Moved _active_handshaker to private >> - Rebase version 1.0 > > src/hotspot/share/runtime/handshake.hpp line 55: > >> 53: }; >> 54: >> 55: class AsynchHandshakeClosure : public HandshakeClosure { > > Can you make this minor change? Asynch to English speakers looks like a-cinch and if left as Async is a-sink. Can you > remove the 'h's ? I see David above left off the extra h, which is what one expects this to be named. Fixed > src/hotspot/share/runtime/handshake.cpp line 394: > >> 392: { >> 393: NoSafepointVerifier nsv; >> 394: process_self_inner(); > > Can you remove process_self_inner and just inline it here since this is its only caller and both are short functions? > If you don't want to, that's fine. I found myself searching for any other callers of this, that's all. We might remove ThreadInVMForHandshake and NoSafepointVerifier in the future. ThreadInVMForHandshake is needed while we have the _suspend_flag, and we have two different requests to safepoint inside handshakes, due to Java heap allocation inside handshakes. Having them outside, as now, makes it clearer if we make such changes, so I'll keep it this way.
> test/hotspot/jtreg/runtime/handshake/HandshakeDirectTest.java line 42: > >> 40: public class HandshakeDirectTest implements Runnable { >> 41: static final int WORKING_THREADS = 32; >> 42: static final int DIRECT_HANDSHAKES_MARK = 500000; > > Could this time out? I have seen no signs of it and have been through several mach5 runs with no issues. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Tue Sep 22 07:34:16 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 07:34:16 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> References: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> Message-ID: On Mon, 21 Sep 2020 21:26:08 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/handshake.hpp line 78: >> >>> 76: FilterQueue _queue; >>> 77: Mutex _lock; >>> 78: Thread* _active_handshaker; >> >> Nit: can you line up the data member names for lock and _active_handshaker ? > > FilterQueue _queue; > JavaThread* _handshakee; > Mutex _lock; > Thread* _active_handshaker; > > Isn't this nicer? (it didn't keep the formatting in the comment) The order of members matters since C++ initializes them in declaration order. My opinion when changing this was that it was easier to read when passing the only argument to the first member being initialized, thus _handshakee must be the first member. But I should init _active_handshaker in the constructor, so I added that and lined them up. So before I make any such change, please reflect on what the constructor will look like.
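The point about member order can be shown with a small, self-contained sketch. It only borrows the member names from the review thread; `Tracked`, `StateSketch` and `construction_order` are hypothetical and are not the HotSpot types. The initializer list is deliberately written in the opposite order to show that construction still follows the declaration order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which members are constructed.
static std::vector<std::string> g_init_log;

struct Tracked {
  explicit Tracked(const char* name) { g_init_log.push_back(name); }
};

// GCC and Clang warn (-Wreorder) when the initializer list order differs
// from the declaration order; silenced here because the mismatch is the
// whole point of the sketch.
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreorder"
#endif

// Hypothetical stand-in for a handshake state class.
struct StateSketch {
  Tracked _handshakee;         // declared first, so constructed first
  Tracked _queue;
  Tracked _lock;
  Tracked _active_handshaker;  // declared last, so constructed last

  // Written in reverse order on purpose: the compiler ignores this order
  // and initializes members in the order they are declared above.
  StateSketch()
    : _active_handshaker("_active_handshaker"),
      _lock("_lock"),
      _queue("_queue"),
      _handshakee("_handshakee") {}
};

#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif

// Constructs one StateSketch and returns the observed construction order.
std::vector<std::string> construction_order() {
  g_init_log.clear();
  StateSketch s;
  (void)s;
  return g_init_log;
}
```

Because of this rule, reordering the declarations changes which member is initialized first even if the constructor text stays the same, which is why the constructor and the declaration list are best read together.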
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Tue Sep 22 07:48:36 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 07:48:36 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: <_DibVw4wgsoLZ-zT_Shdu0JJ877r-6SRDlP_vBsrHe8=.81d41756-e490-4451-a24d-8fc89938e27e@github.com> References: <_DibVw4wgsoLZ-zT_Shdu0JJ877r-6SRDlP_vBsrHe8=.81d41756-e490-4451-a24d-8fc89938e27e@github.com> Message-ID: On Tue, 22 Sep 2020 07:00:06 GMT, Robbin Ehn wrote: >> Hi Robbin, >> >> I've gone back to refresh myself on the previous discussions and (internal) design walk-throughs to get a better sense >> of these changes. Really the "asynchronous handshake" aspect is only a small part of this. The fundamental change here >> is that handshakes are now maintained via a per-thread queue, and those handshake operations can, in the general case, >> be executed by any of the target thread, the requestor (active_handshaker) thread or the VMThread. Hence the removal of >> the various "JavaThread::current()" assumptions. Unless constrained otherwise, any handshake operation may be executed >> by the VMThread so we have to take extra care to ensure the code is written to allow this. I'm a little concerned that >> our detour into direct-handshakes actually lulled us into a false sense of security knowing that an operation would >> always execute in a JavaThread, and we have now reverted that and allowed the VMThread back in. I understand why, but >> the change in direction here caught me by surprise (as I had forgotten the bigger picture). It may not always be >> obvious that the transitive closure of the code from an operation can be safely executed by a non-JavaThread. Then on >> top of this generalized queuing mechanism there is a filter which allows some control over which thread may perform a >> given operation - at the moment the only filter isolates "async" operations which only the target thread can execute. 
>> In addition another nuance is that when processing a given thread's handshake operation queue, different threads have >> different criteria for when to stop processing the queue: >> - the target thread will drain the queue completely >> - the VMThread will drain the queue of all "non-async" operations** >> - the initiator for a "direct" operation will drain the queue up to, and including, the synchronous operation they >> submitted >> - the initiator for an "async" operation will not process any operation themselves and will simply add to the queue and >> then continue on their way (hence the "asynchronous") >> >> ** I do have some concerns about latency impact on the VMThread if it is used to execute operations that didn't need to be >> executed by the VMThread! >> >> I remain concerned about the terminology conflation that happens around "async handshakes". There are two aspects that >> need to be separated: >> - the behaviour of the thread initiating the handshake operation; and >> - which thread can execute the handshake operation >> >> When a thread initiates a handshake operation and waits until that operation is complete (regardless of which thread >> performed it, or whether the initiator processed any other operations) that is a synchronous handshake operation. When >> a thread initiates a handshake operation and does not wait for the operation to complete (it does the >> target->queue()->add(op); and continues on its way) that is an asynchronous handshake operation. The question of >> whether the operation must be executed by the target thread is orthogonal to whether the operation was submitted as a >> synchronous or asynchronous operation. So I have problem when you say that an asynchronous handshake operation is one >> that must be executed by the target thread, as this is not the right characterisation at all. 
It is okay to constrain >> things such that an async operation is always executed by the target, but that is not what makes it an async operation. >> In the general case there is no reason why an async operation might not be executed by the VMThread, or some other >> JavaThread performing a synchronous operation on the same target. I will go back through the actual handshake code to >> see if there are specific things I would like to see changed, but that will have to wait until tomorrow. Thanks, David > > Hi David, you are correct and you did a fine job summarizing this, thanks! > >> I will go back through the actual handshake code to see if there are specific things I would like to see changed, but >> that will have to wait until tomorrow. > > Great thanks! > >> >> Thanks, >> David > > # @coleenp I think you placed your comment: "The "driver" concept is odd. Should it really be caller? Like the thread that called VMHandshake?" on the commit or somewhere else, not as a review comment, so I can't reply to it there. Anyhow, I can reply to it here: the answer is that "caller" makes me think of the requester, but we don't know if it's the requester executing this. Therefore I referred to the thread executing handshakes on the target/handshakee as the driver. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Tue Sep 22 07:54:57 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 07:54:57 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: > This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes > are executed only by the target, which means they may never be executed (the target may block on a socket for the rest of the VM lifetime). > Since we have several use-cases for them we can have many handshakes pending (this should be very rare). To be able to handle an > arbitrary number of handshakes, this patch adds a per-JavaThread queue and heap-allocated HandshakeOperations. It's a
It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first > pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first > pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake > all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much > worse then this. > > - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a > handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. > > - Added WB testing method. > > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. > > - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. > > - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. 
> > - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. > > - Added NoSafepointVerifier; we are thinking about supporting safepoints inside handshakes, but it's not needed at the > moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviewing. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Update after Coleen ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/badefa47..cd784a75 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=03-04 Stats: 11 lines in 3 files changed: 1 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From iklam at openjdk.java.net Tue Sep 22 08:07:36 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 08:07:36 GMT Subject: Integrated: 8253079: DeterministicDump.java fails due to garbage in structure padding In-Reply-To: References: Message-ID: On Sun, 20 Sep 2020 05:37:33 GMT, Ioi Lam wrote: > (EDITED) In product builds, when `PackageEntry` and `ModuleEntry` objects are allocated, the memory is not zeroed. As a > result, the structure padding slots (such as the 32-bits after `BasicHashtableEntry::_hash`) may contain garbage values > that are different on every run of `java -Xshare:dump`. As a result, `java -Xshare:dump` cannot reproduce deterministic > result.
The fix is to clear the memory for the newly allocated `HashtableEntry` objects when `DumpSharedSpaces == > true`. This pull request has now been integrated. Changeset: 284bbf02 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/284bbf02 Stats: 7 lines in 3 files changed: 0 ins; 6 del; 1 mod 8253079: DeterministicDump.java fails due to garbage in structure padding Reviewed-by: minqi, jiefu, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/267 From shade at openjdk.java.net Tue Sep 22 08:37:01 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Sep 2020 08:37:01 GMT Subject: Integrated: 8253284: Zero OrderAccess barrier mappings are incorrect In-Reply-To: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> References: <2QTBQVs8OMrQ922hIOfb28qoiv7jF79stWpqFT4BqVg=.c2e64749-2050-46ed-9659-7d9034907577@github.com> Message-ID: On Thu, 17 Sep 2020 11:35:50 GMT, Aleksey Shipilev wrote: > There are some jcstress failures with AArch64 Zero. It seems to happen because `orderAccess_linux_zero.hpp` > defaults to compiler-only barriers for most OrderAccess::* calls. We need to defer to the strongest barriers by > default. The code also needs some rearrangement to make the mappings clear. > > jcstress seems to capture the bug, and seems to pass when the bug is fixed. Since this is `zero`, we want only the > interpreter (compiler configs would be excessive). Release builds capture more samples.
> $ wget https://builds.shipilev.net/jcstress/jcstress-tests-all-20200917.jar > $ build/linux-x86_64-server-release/images/jdk/bin/java -jar jcstress-tests-all-20200917.jar --jvmArgs "-Xint" > > Testing: > - [x] x86_64 Linux zero release jcstress run > - [x] x86_64 MacOS zero release jcstress run > - [x] AArch64 Linux zero release jcstress run > - [x] PPC Linux zero release jcstress run > - [x] SKIPPED: ARM32 Linux zero release jcstress run (it would seem ARM32 is broken already: JDK-8253464) This pull request has now been integrated. Changeset: b9729cb4 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/b9729cb4 Stats: 37 lines in 2 files changed: 11 ins; 11 del; 15 mod 8253284: Zero OrderAccess barrier mappings are incorrect Reviewed-by: dholmes, aph, andrew ------------- PR: https://git.openjdk.java.net/jdk/pull/224 From dholmes at openjdk.java.net Tue Sep 22 09:07:45 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Sep 2020 09:07:45 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() [v2] In-Reply-To: References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: <0ZWYHd4cpXpMc2NQRTzTCFW8JCTYj1DplBRcMWixnpY=.d0a0549b-e5ce-42bb-86f7-c09c4feeb9c2@github.com> On Mon, 21 Sep 2020 10:01:42 GMT, Richard Reingruber wrote: >> After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a >> thread is safepoint/handshake safe. This change updates the comment on >> JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs >> lingo) for a clearer diff. >> Also I could inline the (*) footnote. > > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Apply version proposed by dholmes LGTM! :) Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/225 From rrich at openjdk.java.net Tue Sep 22 09:15:05 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Tue, 22 Sep 2020 09:15:05 GMT Subject: RFR: 8253241: Update comment on java_suspend_self_with_safepoint_check() [v2] In-Reply-To: <0ZWYHd4cpXpMc2NQRTzTCFW8JCTYj1DplBRcMWixnpY=.d0a0549b-e5ce-42bb-86f7-c09c4feeb9c2@github.com> References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> <0ZWYHd4cpXpMc2NQRTzTCFW8JCTYj1DplBRcMWixnpY=.d0a0549b-e5ce-42bb-86f7-c09c4feeb9c2@github.com> Message-ID: <2wFloy-KIEO5Q_rFPDKfgho_-qHeehUx89EqVK7rO_I=.42c97d78-953f-45f7-86a6-f4d1738aa561@github.com> On Tue, 22 Sep 2020 09:05:27 GMT, David Holmes wrote: >> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: >> >> Apply version proposed by dholmes > > LGTM! :) Thanks Thanks :) I'll integrate this after another 24h. ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From dholmes at openjdk.java.net Tue Sep 22 09:16:09 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 22 Sep 2020 09:16:09 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Mon, 21 Sep 2020 14:19:44 GMT, Zhengyu Gu wrote: >> Thread stack is currently unregistered with NMT in Thread's destructor. Apparently, only Java thread invokes destructor >> before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues >> alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the >> thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to >> post_run() method. 
> > Zhengyu Gu has updated the pull request incrementally with two additional commits since the last revision: > > - Fix indents > - Back out thread stack cleaning, to be addressed by JDK-8253429 Thanks for restoring those bits. Dealing with the way non-JavaThreads terminate is a known issue (there may already be an open JBS issue for that) and that is best dealt with directly in that context rather than as part of this change. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/185 From chegar at openjdk.java.net Tue Sep 22 09:51:59 2020 From: chegar at openjdk.java.net (Chris Hegarty) Date: Tue, 22 Sep 2020 09:51:59 GMT Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Mon, 21 Sep 2020 23:21:18 GMT, Vicente Romero wrote: >> Hi Vicente, >> Please file a separate subtask for the javax.lang.model changes. This helps with the JSR 269 MR paperwork. >> Thanks, >> -Joe > > note: I have removed from the original patch the code related to javax.lang.model, I will publish them in a separate PR @vicente-romero-oracle I noticed that we can also remove the preview args from the record serialization tests and ObjectMethodsTest. I opened a PR against the branch in your fork. You should be able to just merge in the changes. See https://github.com/vicente-romero-oracle/jdk/pull/1 ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From simonis at openjdk.java.net Tue Sep 22 11:18:57 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Tue, 22 Sep 2020 11:18:57 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist Message-ID: Hi, can I please have a review (or an idea for a better fix) for this PR? 
If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html), the cgroups detection will be confused and will crash in a debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be interpreted as if it belonged to a Cgroup controller:

$ grep cgroup /proc/self/mountinfo
36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory
50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma
...
43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset
121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset

The current fix solves this problem for manually created cpusets which don't have a "mount source", but this is yet another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup.
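To make the heuristic concrete, here is a rough sketch of how a mountinfo line could be split and the "mount source" inspected. It is not the code from the PR: `MountInfo`, `parse_mountinfo_line` and `looks_like_controller_mount` are hypothetical names, and treating a source of `none` as "no mount source" is an assumption based only on the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// The fields of a /proc/self/mountinfo line that matter for this sketch.
struct MountInfo {
  std::string mount_point;
  std::string fs_type;
  std::string mount_source;
  std::string super_options;
};

// Splits one mountinfo line on whitespace. The layout is: six fixed fields,
// zero or more optional fields, a lone "-" separator, then the filesystem
// type, the mount source and the super options. Returns false if the line
// does not have that shape.
bool parse_mountinfo_line(const std::string& line, MountInfo* out) {
  std::istringstream in(line);
  std::vector<std::string> tok;
  for (std::string t; in >> t; ) tok.push_back(t);

  // Skip the six fixed fields, then look for the "-" separator.
  std::size_t sep = 6;
  while (sep < tok.size() && tok[sep] != "-") sep++;
  if (sep + 3 >= tok.size()) return false;

  out->mount_point   = tok[4];
  out->fs_type       = tok[sep + 1];
  out->mount_source  = tok[sep + 2];
  out->super_options = tok[sep + 3];
  return true;
}

// Assumed heuristic: a cgroup mount whose source is "none" was probably
// mounted by hand and should not be treated as a controller mount.
bool looks_like_controller_mount(const MountInfo& m) {
  return m.fs_type == "cgroup" && m.mount_source != "none";
}
```

With the two cpuset lines from the listing, the `/sys/fs/cgroup/cpuset` entry is accepted and the `/cpusets` entry is skipped. As the message itself notes, this remains a heuristic: a manual mount could just as well specify a source.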
Thanks, Volker ------------- Commit messages: - 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist Changes: https://git.openjdk.java.net/jdk/pull/295/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253435 Stats: 8 lines in 1 file changed: 7 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/295.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295 PR: https://git.openjdk.java.net/jdk/pull/295 From zgu at openjdk.java.net Tue Sep 22 12:00:44 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 22 Sep 2020 12:00:44 GMT Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Tue, 22 Sep 2020 09:11:58 GMT, David Holmes wrote: > Thanks for restoring those bits. Dealing with the way non-JavaThreads terminate is a known issue (there may already be > an open JBS issue for that) and that is best dealt with directly in that context rather than as part of this change. > Thanks. Good to know. Is there a CR for tracking non-JavaThread termination? Thanks, David. ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From zgu at openjdk.java.net Tue Sep 22 12:00:45 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 22 Sep 2020 12:00:45 GMT Subject: Integrated: 8252921: NMT overwrite memory type for region assert when building dynamic archive In-Reply-To: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On Tue, 15 Sep 2020 14:37:44 GMT, Zhengyu Gu wrote: > Thread stack is currently unregistered with NMT in Thread's destructor. 
Apparently, only Java thread invokes destructor > before thread exits. For NonJavaThread, e.g. ConcurrentGCThread, thread may exit while its "Thread" object continues > alive, therefore, its thread stack is still "alive" from NMT perspective. Once thread exits, the virtual memory for the > thread stack can be reserved again, that confused NMT. The solution is to move thread stack unregistration code to > post_run() method. This pull request has now been integrated. Changeset: 8c02bdbf Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/8c02bdbf Stats: 25 lines in 3 files changed: 13 ins; 11 del; 1 mod 8252921: NMT overwrite memory type for region assert when building dynamic archive Reviewed-by: minqi, iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/185 From coleenp at openjdk.java.net Tue Sep 22 12:24:11 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 12:24:11 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 07:54:57 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. 
The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. 
>> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update after Coleen Just one comment that needs an answer if you can find it :) src/hotspot/share/prims/whitebox.cpp line 2050: > 2048: JavaThread* target = java_lang_Thread::thread(thread_oop); > 2049: TraceSelfClosure* tsc = new TraceSelfClosure(target); > 2050: Handshake::execute(tsc, target); I know it's a whitebox test, but should this delete TraceSelfClosure sometime? ------------- Changes requested by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Tue Sep 22 12:24:11 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 12:24:11 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> Message-ID: On Tue, 22 Sep 2020 07:20:30 GMT, Robbin Ehn wrote: >> FilterQueue _queue; >> JavaThread* _handshakee; >> Mutex _lock; >> Thread* _active_handshaker; >> >> Isn't this nicer? (it didn't keep the formatting in the comment) > > The order of members matter since C++ initialize them in declared order. > My opinion when changing this was that it was easier to read when passing the only argument to the first member being > initialized, thus _handshakee must be first member. > But I should init _active_handshaker in constructor, so added that and lined-up. > > So before I do any such change please reflect over how the constructor will look like. I don't understand, you'd have to rearrange the initializers in the constructor too, but I don't see any order dependance. Moving over _lock helps, so this is fine. 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From shade at openjdk.java.net Tue Sep 22 12:47:00 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Sep 2020 12:47:00 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence Message-ID: In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). However, `orderAccess_*_zero.hpp` calls the kernel helper, because: /* * ARM Kernel helper for memory barrier. * Using __asm __volatile ("":::"memory") does not work reliable on ARM * and gcc __sync_synchronize(); implementation does not use the kernel * helper for all gcc versions so it is unreliable to use as well. */ We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. Attention @bulasevich. Testing: - [ ] ARM32 Zero jcstress - [ ] Mac OS x86_64 Zero jcstress ------------- Commit messages: - 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence Changes: https://git.openjdk.java.net/jdk/pull/298/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=298&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253469 Stats: 6 lines in 2 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/298.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/298/head:pull/298 PR: https://git.openjdk.java.net/jdk/pull/298 From shade at openjdk.java.net Tue Sep 22 13:57:53 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Sep 2020 13:57:53 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence [v2] In-Reply-To: References: Message-ID: <2J6XEMYsziY47BD46NKAwGhtoxJKDk0-v5q2ve_Vo08=.99c37ea6-f65b-4259-8d70-9497da76b284@github.com> > In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). 
However, > `orderAccess_*_zero.hpp` calls the kernel helper, because: > /* > * ARM Kernel helper for memory barrier. > * Using __asm __volatile ("":::"memory") does not work reliable on ARM > * and gcc __sync_synchronize(); implementation does not use the kernel > * helper for all gcc versions so it is unreliable to use as well. > */ > > We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. > > This depends on JDK-8253464 being fixed first. > > Attention @bulasevich. > > Testing: > - [ ] ARM32 Zero jcstress > - [ ] Mac OS x86_64 Zero jcstress Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Add comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/298/files - new: https://git.openjdk.java.net/jdk/pull/298/files/74324d18..041f9d78 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=298&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=298&range=00-01 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/298.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/298/head:pull/298 PR: https://git.openjdk.java.net/jdk/pull/298 From rehn at openjdk.java.net Tue Sep 22 14:09:50 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 14:09:50 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:14:19 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/prims/whitebox.cpp line 2050: > >> 2048: JavaThread* target = java_lang_Thread::thread(thread_oop); >> 2049: TraceSelfClosure* tsc = new TraceSelfClosure(target); >> 2050: Handshake::execute(tsc, target); > > I know it's a whitebox test, but should this delete TraceSelfClosure sometime? 
Please have a look here, we delete it if we fail to install it: https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L352 Here we delete it after we processed it: https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L419 Since this as async op, the requester can't safely delete it. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From sgehwolf at openjdk.java.net Tue Sep 22 14:15:42 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 22 Sep 2020 14:15:42 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 11:11:36 GMT, Volker Simonis wrote: > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be > interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ... > 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. 
I'm open to better solutions for detecting cpusets which don't don't belong to a Cgroup. > Thanks, > Volker Did you run container tests with this? ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From rehn at openjdk.java.net Tue Sep 22 14:25:26 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 22 Sep 2020 14:25:26 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> Message-ID: On Tue, 22 Sep 2020 12:20:36 GMT, Coleen Phillimore wrote: >> The order of members matter since C++ initialize them in declared order. >> My opinion when changing this was that it was easier to read when passing the only argument to the first member being >> initialized, thus _handshakee must be first member. >> But I should init _active_handshaker in constructor, so added that and lined-up. >> >> So before I do any such change please reflect over how the constructor will look like. > > I don't understand, you'd have to rearrange the initializers in the constructor too, but I don't see any order > dependance. Moving over _lock helps, so this is fine. You want a cosmetic change in the member declaration. I'm saying the constructor will look worse. Im asking if you want to trade a worse constructor for that? (all here is extremely subjective :) ) ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From shade at openjdk.java.net Tue Sep 22 14:28:27 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 22 Sep 2020 14:28:27 GMT Subject: RFR: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores Message-ID: There is a regression introduced by addition of ARMv7-specific block by JDK-8211387. It readily manifests as crash during jcstress initialization, and investigation points at broken volatile stores. Reverting JDK-8211387 from head JDK makes ARM32 start and run jcstress. 
The underlying reason seems to be the half-done `atomic_copy64`: it does the load with exclusive load, but then defers to the C++ store. Somewhere during handing over the value from the asm load to C++ store and/or C++ store itself, we garble the value. The way out is to implement the whole thing in asm. Also see `StubGenerator::generate_atomic_load_long` and `StubGenerator::generate_atomic_store_long` in `stubGenerator_arm.cpp`, that do roughly the same thing and were the basis for this implementation. Attention @theRealAph, @bulasevich. Testing: - [ ] ARM32 Linux zero release jcstress run ------------- Commit messages: - Whitespace fixes - Typo tmp_r -> tmp_w - 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores Changes: https://git.openjdk.java.net/jdk/pull/299/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=299&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253464 Stats: 17 lines in 1 file changed: 8 ins; 0 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/299.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/299/head:pull/299 PR: https://git.openjdk.java.net/jdk/pull/299 From coleenp at openjdk.java.net Tue Sep 22 14:46:06 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 14:46:06 GMT Subject: RFR: 8238761: Asynchronous handshakes [v4] In-Reply-To: References: <0buWd7Y3md8DLJOL6eIBdMu6vs8SVG9DzQy56OJkwsc=.6ce33107-0039-4d2c-885c-ee0961063d2a@github.com> Message-ID: On Tue, 22 Sep 2020 14:22:58 GMT, Robbin Ehn wrote: >> I don't understand, you'd have to rearrange the initializers in the constructor too, but I don't see any order >> dependance. Moving over _lock helps, so this is fine. > > You want a cosmetic change in the member declaration. > I'm saying the constructor will look worse. > > Im asking if you want to trade a worse constructor for that? > (all here is extremely subjective :) ) It is subjective. What you have is fine. 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Tue Sep 22 14:51:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 14:51:12 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 07:54:57 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. 
I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update after Coleen Ok, all questions addressed. Looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Tue Sep 22 14:51:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 14:51:12 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: <-BS_ADrq7PRAtT5SNtB32bPlmGf6RF3IIXZjEkybsQc=.75ab871d-9c34-48be-a421-b4606c851a2c@github.com> On Tue, 22 Sep 2020 14:07:11 GMT, Robbin Ehn wrote: >> src/hotspot/share/prims/whitebox.cpp line 2050: >> >>> 2048: JavaThread* target = java_lang_Thread::thread(thread_oop); >>> 2049: TraceSelfClosure* tsc = new TraceSelfClosure(target); >>> 2050: Handshake::execute(tsc, target); >> >> I know it's a whitebox test, but should this delete TraceSelfClosure sometime? > > Please have a look here, we delete it if we fail to install it: > https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L352 > Here we delete it after we processed it: > https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L419 > > Since this as async op, the requester can't safely delete it. Ok, thanks for pointing it out. ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From bob.vandette at oracle.com Tue Sep 22 14:59:16 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 22 Sep 2020 10:59:16 -0400 Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: <6043D8E3-F75D-4011-AE9A-C4D2EA1A739B@oracle.com> Yuk. I just fixed a bug which caused us to use the mount source for the cgroup type. Not fixing that bug would have hidden your problem. Are there any hints in /proc/self/cgroup or /proc/self/mounts that we could use to eliminate this manual mount? I'd be tempted to eliminate mountinfo entries that are 1) duplicate controllers and 2) not in "/sys/fs/cgroup" mount point. Bob.
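The two criteria suggested above could be sketched roughly as follows. This is a standalone illustration, not the HotSpot cgroup detection code: `CgroupMount`, `parse_cgroup_line` and `filter_cgroup_mounts` are hypothetical names, the controller key is simply the super options with a leading `rw,`/`ro,` stripped (a simplification), and the sketch assumes one reading of the suggestion, namely that an entry is dropped only when it lies outside `/sys/fs/cgroup/` and the same controller is also mounted inside it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one cgroup v1 line of /proc/self/mountinfo.
struct CgroupMount {
  std::string mount_point;
  std::string controllers;  // super options minus a leading "rw," / "ro,"
};

// Parses one mountinfo line; returns false for non-cgroup or malformed lines.
static bool parse_cgroup_line(const std::string& line, CgroupMount* out) {
  assert(out != nullptr);
  std::istringstream in(line);
  std::vector<std::string> tok;
  for (std::string t; in >> t; ) tok.push_back(t);
  // Fields 0..5 are fixed; optional fields follow until the "-" separator.
  std::size_t sep = 0;
  for (std::size_t i = 6; i < tok.size(); i++) {
    if (tok[i] == "-") { sep = i; break; }
  }
  if (sep == 0 || sep + 3 >= tok.size()) return false;
  if (tok[sep + 1] != "cgroup") return false;  // filesystem type
  std::string opts = tok[sep + 3];             // super options
  if (opts.compare(0, 3, "rw,") == 0 || opts.compare(0, 3, "ro,") == 0) {
    opts.erase(0, 3);
  }
  out->mount_point = tok[4];
  out->controllers = opts;
  return true;
}

static bool is_under_sys_fs_cgroup(const CgroupMount& m) {
  return m.mount_point.rfind("/sys/fs/cgroup/", 0) == 0;
}

// Keeps every cgroup mount except those that sit outside /sys/fs/cgroup/
// while duplicating a controller that is also mounted inside it.
std::vector<CgroupMount> filter_cgroup_mounts(const std::vector<std::string>& lines) {
  std::vector<CgroupMount> parsed;
  std::map<std::string, bool> has_standard_mount;
  for (const std::string& line : lines) {
    CgroupMount m;
    if (!parse_cgroup_line(line, &m)) continue;
    if (is_under_sys_fs_cgroup(m)) has_standard_mount[m.controllers] = true;
    parsed.push_back(m);
  }
  std::vector<CgroupMount> kept;
  for (const CgroupMount& m : parsed) {
    bool duplicate_outside =
        !is_under_sys_fs_cgroup(m) && has_standard_mount.count(m.controllers) > 0;
    if (!duplicate_outside) kept.push_back(m);
  }
  return kept;
}
```

Fed the mountinfo lines quoted in this thread, the sketch keeps the `memory`, `rdma` and `cpuset` mounts under `/sys/fs/cgroup/` and drops the manually mounted `/cpusets` entry. Whether such a rule is robust on systems that mount controllers elsewhere is exactly the open question in the discussion.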
> On Sep 22, 2020, at 7:18 AM, Volker Simonis wrote: > > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. > > The problem is that the additionally mounted cpuset will be interpreted as if it was belonging to Cgroup controller: > $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ... > 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. I'm open to better solutions for detecting cpusets which don't don't belong to a Cgroup. 
> > Thanks, > Volker > > ------------- > > Commit messages: > - 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist > > Changes: https://git.openjdk.java.net/jdk/pull/295/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8253435 > Stats: 8 lines in 1 file changed: 7 ins; 0 del; 1 mod > Patch: https://git.openjdk.java.net/jdk/pull/295.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295 > > PR: https://git.openjdk.java.net/jdk/pull/295 From coleenp at openjdk.java.net Tue Sep 22 15:02:22 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 15:02:22 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions Message-ID: Please review removed functions left over from Itanium. Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. Thanks, Coleen ------------- Commit messages: - 8253457: Remove unimplemented register stack functions Changes: https://git.openjdk.java.net/jdk/pull/300/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=300&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253457 Stats: 179 lines in 12 files changed: 0 ins; 169 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/300.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/300/head:pull/300 PR: https://git.openjdk.java.net/jdk/pull/300 From leo.korinth at oracle.com Tue Sep 22 15:11:33 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Tue, 22 Sep 2020 17:11:33 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: Message-ID: <34f0a5ab-4691-d438-e672-4a1b7829e467@oracle.com> Hi! I have a question regarding ChunkManager::purge(). In part: 1) purge virtual space list We try to purge empty VirtualSpaceNodes. 
We also remove the free chunks from the free list (_chunks) if we are successful. I can not see that we ever uncommit these chunks. Later, in part: 2) uncommit free chunks We iterate the free list and uncommits the free chunks, the problem is that we might have removed chunks from the free list in part 1 even though we did no uncommit. To me, it seems we are missing to uncommit memory here, I am probably missing something, but what? Thanks, Leo From iklam at openjdk.java.net Tue Sep 22 16:51:16 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 16:51:16 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:55:39 GMT, Coleen Phillimore wrote: > Please review removed functions left over from Itanium. > > Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on > linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. > Thanks, > Coleen Looks good overall to me. I can see all these functions with no callers. Should they also be removed? I saw you removed only for thread_linux_arm.hpp. 
./os_cpu/linux_aarch64/thread_linux_aarch64.hpp: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ./os_cpu/linux_x86/thread_linux_x86.hpp: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ./os_cpu/windows_x86/thread_windows_x86.hpp: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ./os_cpu/linux_arm/thread_linux_arm.hpp: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ./os_cpu/bsd_x86/thread_bsd_x86.hpp: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ./cpu/aarch64/javaFrameAnchor_aarch64.hpp: void set_last_Java_fp(intptr_t* fp) { OrderAccess::release(); _last_Java_fp = fp; } ./cpu/arm/javaFrameAnchor_arm.hpp: void set_last_Java_fp(intptr_t* fp) { _last_Java_fp = fp; } ./cpu/x86/javaFrameAnchor_x86.hpp: void set_last_Java_fp(intptr_t* fp) { _last_Java_fp = fp; } ./share/runtime/javaFrameAnchor.hpp: void set_last_Java_pc(address pc) { _last_Java_pc = pc; } ./os_cpu/linux_arm/thread_linux_arm.hpp: void set_last_Java_pc(address pc) { _anchor.set_last_Java_pc(pc); } I am not sure about `last_Java_fp()` as I didn't check thoroughly. The other 5 removed function seem OK to me -- they don't do anything. The only exception is the *_zero.hpp versions that had some asserts, but I guess these asserts never caught anything since they don't exist on other ports anyway. src/hotspot/os_cpu/linux_arm/thread_linux_arm.hpp line 46: > 44: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } > 45: void set_last_Java_pc(address pc) { _anchor.set_last_Java_pc(pc); } > 46: Why is this remove from _arm, but not for thread_linux_aarch64.hpp ------------- Changes requested by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/300 From thomas.stuefe at gmail.com Tue Sep 22 17:20:11 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 22 Sep 2020 19:20:11 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: <34f0a5ab-4691-d438-e672-4a1b7829e467@oracle.com> References: <34f0a5ab-4691-d438-e672-4a1b7829e467@oracle.com> Message-ID: Hi Leo, On Tue, Sep 22, 2020 at 5:13 PM Leo Korinth wrote: > Hi! > > I have a question regarding ChunkManager::purge(). > > In part: 1) purge virtual space list > > We try to purge empty VirtualSpaceNodes. We also remove the free chunks > from the free list (_chunks) if we are successful. I can not see that > we ever uncommit these chunks. > > Later, in part: 2) uncommit free chunks > > We iterate the free list and uncommits the free chunks, the problem is > that we might have removed chunks from the free list in part 1 even though > we did no uncommit. > > To me, it seems we are missing to uncommit memory here, I am probably > missing something, but what? > > Good spotting. Yes, you do miss something, but it is obscure and I should comment this better. When we purge the vslist, we remove and delete all nodes which are purgeable (all chunks are free, and the node owns the underlying space). Deleting a node includes unmapping the underlying memory (ReservedSpace::release(), see ~VirtualSpaceNode()). So the underlying memory is gone, no need to individually uncommit each chunk. Note that (1) and (2) are completely independent from each other. (1) takes care of the rare chance of completely unmapping whole nodes. (2) takes care of uncommitting free chunks in nodes which had been unpurgeable. Arguably, (2) may be redundant since we uncommit chunks >= commit granule size already before, when returning them to the ChunkManager (see ChunkManager::return_chunk_locked()). I left it in as a safeguard.
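The two independent steps described above can be modelled in a toy form. The sketch below is not the Metaspace implementation: `Chunk`, `Node`, `PurgeStats` and `purge` are hypothetical names, "unmapping" is modelled as simply erasing a node, and the "node owns the underlying space" condition is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (assumption): a chunk is either in use or free, and its memory
// is either committed or not.
struct Chunk {
  bool free;
  bool committed;
};

// Toy stand-in for a virtual space node: just the chunks carved from it.
struct Node {
  std::vector<Chunk> chunks;
};

struct PurgeStats {
  std::size_t nodes_unmapped;
  std::size_t chunks_uncommitted;
};

PurgeStats purge(std::vector<Node>& nodes) {
  PurgeStats stats = {0, 0};

  // A node is purgeable when every chunk in it is free.
  auto purgeable = [](const Node& n) {
    return std::all_of(n.chunks.begin(), n.chunks.end(),
                       [](const Chunk& c) { return c.free; });
  };

  // (1) Drop whole purgeable nodes. Releasing the node's mapping returns all
  //     of its memory at once, so its chunks need no individual uncommit.
  const std::size_t before = nodes.size();
  nodes.erase(std::remove_if(nodes.begin(), nodes.end(), purgeable), nodes.end());
  stats.nodes_unmapped = before - nodes.size();
  assert(std::none_of(nodes.begin(), nodes.end(), purgeable));

  // (2) In the surviving (unpurgeable) nodes, uncommit free chunks one by one.
  for (Node& n : nodes) {
    for (Chunk& c : n.chunks) {
      if (c.free && c.committed) {
        c.committed = false;
        stats.chunks_uncommitted++;
      }
    }
  }
  return stats;
}
```

In this model a fully free node disappears in step (1) without any per-chunk uncommit, which mirrors the point made in the reply: the chunks removed from the free list in part 1 belong to memory that has already been released as a whole.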
Side node, (1) we did before in the old Metaspace, that purge mechanism is mostly unchanged. The problem with that approach is that we only rarely have the chance of purging a whole node, since all chunks in the node have to be free together. Thanks, > Leo > Thanks, Thomas From pchilanomate at openjdk.java.net Tue Sep 22 17:50:27 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Tue, 22 Sep 2020 17:50:27 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 07:54:57 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. 
Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update after Coleen Marked as reviewed by pchilanomate (Committer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From coleenp at openjdk.java.net Tue Sep 22 18:32:52 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 18:32:52 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions In-Reply-To: References: Message-ID: <5hYrI1cwpm-YcmQtvjXcoEPpTEh8vhqJndJvih44IEs=.dac17133-6759-447c-a6a1-e503f09c7de2@github.com> On Tue, 22 Sep 2020 16:39:45 GMT, Ioi Lam wrote: >> Please review removed functions left over from Itanium. >> >> Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on >> linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. >> Thanks, >> Coleen > > src/hotspot/os_cpu/linux_arm/thread_linux_arm.hpp line 46: > >> 44: void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } >> 45: void set_last_Java_pc(address pc) { _anchor.set_last_Java_pc(pc); } >> 46: > > Why is this remove from _arm, but not for thread_linux_aarch64.hpp Oops, I didn't mean to remove these ones. I was focusing on the stack related functions. but I can file an RFE to remove these next. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/300 From coleenp at openjdk.java.net Tue Sep 22 19:00:12 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 22 Sep 2020 19:00:12 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions [v2] In-Reply-To: References: Message-ID: > Please review removed functions left over from Itanium. > > Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on > linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. 
> Thanks, > Coleen Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: added back unintentionally deleted functions ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/300/files - new: https://git.openjdk.java.net/jdk/pull/300/files/ab9514cc..50d74884 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=300&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=300&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/300.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/300/head:pull/300 PR: https://git.openjdk.java.net/jdk/pull/300 From iklam at openjdk.java.net Tue Sep 22 19:46:01 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 19:46:01 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 19:00:12 GMT, Coleen Phillimore wrote: >> Please review removed functions left over from Itanium. >> >> Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on >> linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. >> Thanks, >> Coleen > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > added back unintentionally deleted functions LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/300 From iklam at openjdk.java.net Tue Sep 22 21:12:33 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 21:12:33 GMT Subject: RFR: 8253499: Problem list runtime/cds/DeterministicDump.java Message-ID: Please review this trivial fix to Problem list runtime/cds/DeterministicDump.java until [JDK-8253495](https://bugs.openjdk.java.net/browse/JDK-8253495) is fixed properly. 
------------- Commit messages: - 8253499: Problem list runtime/cds/DeterministicDump.java Changes: https://git.openjdk.java.net/jdk/pull/310/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=310&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253499 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/310.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/310/head:pull/310 PR: https://git.openjdk.java.net/jdk/pull/310 From iklam at openjdk.java.net Tue Sep 22 21:28:45 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 21:28:45 GMT Subject: RFR: 8253499: Problem list runtime/cds/DeterministicDump.java [v2] In-Reply-To: References: Message-ID: > Please review this trivial fix to Problem list runtime/cds/DeterministicDump.java until > [JDK-8253495](https://bugs.openjdk.java.net/browse/JDK-8253495) is fixed properly. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Use correct bug ID (8253495) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/310/files - new: https://git.openjdk.java.net/jdk/pull/310/files/7a2d4985..0d0af39e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=310&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=310&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/310.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/310/head:pull/310 PR: https://git.openjdk.java.net/jdk/pull/310 From dcubed at openjdk.java.net Tue Sep 22 21:28:45 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 22 Sep 2020 21:28:45 GMT Subject: RFR: 8253499: Problem list runtime/cds/DeterministicDump.java [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 21:25:31 GMT, Ioi Lam wrote: >> Please review this trivial fix to Problem list runtime/cds/DeterministicDump.java until >> 
[JDK-8253495](https://bugs.openjdk.java.net/browse/JDK-8253495) is fixed properly. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Use correct bug ID (8253495) Changes requested by dcubed (Reviewer). Marked as reviewed by dcubed (Reviewer). test/hotspot/jtreg/ProblemList.txt line 85: > 83: # :hotspot_runtime > 84: > 85: runtime/cds/DeterministicDump.java 8253208 generic-all Did you want to use 8253208 or 8253495 for the bug ID? ------------- PR: https://git.openjdk.java.net/jdk/pull/310 From leo.korinth at oracle.com Tue Sep 22 21:40:41 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Tue, 22 Sep 2020 23:40:41 +0200 Subject: RFR: Implementation of JEP 387: Elastic Metaspace (round two) In-Reply-To: References: <34f0a5ab-4691-d438-e672-4a1b7829e467@oracle.com> Message-ID: On 22/09/2020 19:20, Thomas Stüfe wrote: > Hi Leo, > > On Tue, Sep 22, 2020 at 5:13 PM Leo Korinth > wrote: > > Hi! > > I have a question regarding ChunkManager::purge(). > > In part: 1) purge virtual space list > > We try to purge empty VirtualSpaceNodes. We also remove the free chunks > from the free list (_chunks) if we are successful. I can not see that > we ever uncommit these chunks. > > Later, in part: 2) uncommit free chunks > > We iterate the free list and uncommits the free chunks, the problem is > that we might have removed chunks from the free list in part 1 even though > we did no uncommit. > > To me, it seems we are missing to uncommit memory here, I am probably missing something, but what? > > > Good spotting. Yes, you do miss something, but it is obscure and I should comment this better. > > When we purge the vslist, we remove and delete all nodes which are purgeable (all chunks are free, and the node owns the underlying space). Deleting a node includes unmapping the underlying memory (ReservedSpace::release(), see ~VirtualSpaceNode()).
So the underlying memory is gone, no need to individually uncommit each chunk. > > Note that (1) and (2) are completely independent from each other. (1) takes care of the rare chance of completely unmapping whole nodes. (2) takes care of uncommitting free chunks in nodes which had been unpurgeable. Arguably, (2) may be redundant since we uncommit chunks >= commit granule size already before, when returning them to the ChunkManager (see Chunkmanager::return_chunk_locked()). I left it in as a safeguard. > > Side node, (1) we did before in the old Metaspace, that purge mechanism is mostly unchanged. The problem with that approach is that we only rarely have the chance of purging a whole node, since all chunks in the node have to be free together. > I see, thanks for the explanation! /Leo From iklam at openjdk.java.net Tue Sep 22 22:24:57 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 22:24:57 GMT Subject: RFR: 8253499: Problem list runtime/cds/DeterministicDump.java [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 21:24:52 GMT, Daniel D. Daugherty wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Use correct bug ID (8253495) > > Marked as reviewed by dcubed (Reviewer). I ran tier1 and confirmed that runtime/cds/DeterministicDump.java is no longer executed. ------------- PR: https://git.openjdk.java.net/jdk/pull/310 From iklam at openjdk.java.net Tue Sep 22 22:24:58 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 22 Sep 2020 22:24:58 GMT Subject: Integrated: 8253499: Problem list runtime/cds/DeterministicDump.java In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 20:51:48 GMT, Ioi Lam wrote: > Please review this trivial fix to Problem list runtime/cds/DeterministicDump.java until > [JDK-8253495](https://bugs.openjdk.java.net/browse/JDK-8253495) is fixed properly. This pull request has now been integrated. 
Changeset: c68a31dd Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/c68a31dd Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8253499: Problem list runtime/cds/DeterministicDump.java Reviewed-by: dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/310 From david.holmes at oracle.com Wed Sep 23 01:50:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Sep 2020 11:50:04 +1000 Subject: RFR: 8252921: NMT overwrite memory type for region assert when building dynamic archive [v6] In-Reply-To: References: <4D49ApoTrAOCqO6Km9aXF2ja_cg119ocjwEVa79I-lc=.36403a88-15f1-40e2-8739-2e5f43c84ddc@github.com> Message-ID: On 22/09/2020 10:00 pm, Zhengyu Gu wrote: > On Tue, 22 Sep 2020 09:11:58 GMT, David Holmes wrote: > >> Thanks for restoring those bits. Dealing with the way non-JavaThreads terminate is a known issue (there may already be >> an open JBS issue for that) and that is best dealt with directly in that context rather than as part of this change. >> Thanks. > > Good to know. Is there a CR for tracking non-JavaThread termination? No. The last time it was referenced as JDK-8240312. David > Thanks, David. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/185 > From dholmes at openjdk.java.net Wed Sep 23 02:06:19 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 23 Sep 2020 02:06:19 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 19:00:12 GMT, Coleen Phillimore wrote: >> Please review removed functions left over from Itanium. >> >> Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on >> linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. >> Thanks, >> Coleen > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > added back unintentionally deleted functions Good cleanup! Thanks. 
------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/300 From vicente.romero at oracle.com Wed Sep 23 03:30:13 2020 From: vicente.romero at oracle.com (Vicente Romero) Date: Tue, 22 Sep 2020 23:30:13 -0400 Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: good catch Chris, thanks for the patch, Vicente On 9/22/20 5:51 AM, Chris Hegarty wrote: > On Mon, 21 Sep 2020 23:21:18 GMT, Vicente Romero wrote: > >>> Hi Vicente, >>> Please file a separate subtask for the javax.lang.model changes. This helps with the JSR 269 MR paperwork. >>> Thanks, >>> -Joe >> note: I have removed from the original patch the code related to javax.lang.model, I will publish them in a separate PR > @vicente-romero-oracle I noticed that we can also remove the preview args from the record serialization tests and > ObjectMethodsTest. I opened a PR against the branch in your fork. You should be able to just merge in the changes. 
See > https://github.com/vicente-romero-oracle/jdk/pull/1 > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Wed Sep 23 03:34:29 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Wed, 23 Sep 2020 03:34:29 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request incrementally with three additional commits since the last revision: - Merge pull request #1 from ChrisHegarty/record-serial-tests Remove preview args from JDK tests - Remove preview args from ObjectMethodsTest - Remove preview args from record serialization tests ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/290/files - new: https://git.openjdk.java.net/jdk/pull/290/files/543e5054..26b80775 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=01-02 Stats: 95 lines in 21 files changed: 0 ins; 35 del; 60 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From dholmes at openjdk.java.net Wed Sep 23 04:13:05 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 23 Sep 2020 04:13:05 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: <4l2vRUllquBWTsz1t0brYAsPoBzNIKo1A7M9GBAcKNw=.da0e1992-ebff-4c31-bdc9-e6c293189276@github.com> References: 
<4l2vRUllquBWTsz1t0brYAsPoBzNIKo1A7M9GBAcKNw=.da0e1992-ebff-4c31-bdc9-e6c293189276@github.com> Message-ID: On Mon, 21 Sep 2020 11:07:14 GMT, Robbin Ehn wrote: >> src/hotspot/share/runtime/handshake.hpp line 44: >> >>> 42: // by the target JavaThread itself or, depending on whether the operation is >>> 43: // a single target/direct handshake or not, by the JavaThread that requested the >>> 44: // handshake or the VMThread respectively. >> >> This comment now indicates that all single target handshakes are executed as direct-handshakes and never by the >> VMThread - is that correct? > > The concept of direct handshake do not exist in that way. (but can easily be implemented using the filter) > You have operation that you need to be executed on a JavaThread, you add that to that JavaThread. > Any thread ("driver") that succeed to claim that JavaThreads handshake state (lock and that JavaThread is safe) procced > to execute from that handshake queue until empty (empty according to applied filter on queue). I think the entire comment block above can now be simplified: // A handshake closure is a callback that is executed for a JavaThread // while it is in a safepoint/handshake-safe state. Depending on the // nature of the closure, the callback may be executed by the initiating // thread, the target thread, or the VMThread. If the callback is not executed // by the target thread it will remain in a blocked state until the callback completes. >> src/hotspot/share/runtime/handshake.cpp line 230: >> >>> 228: log_trace(handshake)("Threads signaled, begin processing blocked threads by VMThread"); >>> 229: HandshakeSpinYield hsy(start_time_ns); >>> 230: int executed_by_driver = 0; >> >> driver?? Isn't this still the VMThread? > > The driver is VM thread or a JavaThread. But this is VM_HandshakeAllThreads (a VM_Operation), it can't be executed by anything other than the VMThread! 
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From dholmes at openjdk.java.net Wed Sep 23 04:13:04 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 23 Sep 2020 04:13:04 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 07:54:57 GMT, Robbin Ehn wrote: >> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes >> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime) >> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an >> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a >> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first >> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first >> pointer also CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake >> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything >> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed >> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake >> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there >> is now only one method to call when handshaking one thread. Some comments about the changes: >> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake >> all threads. With heap allocating it cannot extends StackObj. 
I tested several ways to fix this, but those very much >> worse then this. >> >> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that >> thread. >> >> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a >> handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation. >> >> - Added WB testing method. >> >> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. >> >> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking. >> >> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a >> safepoint and half of them after, in many handshake all operations. >> >> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe. >> >> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the >> moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows. >> >> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. >> >> - Added filtered queue and gtest for it. >> >> Passes multiple t1-8 runs. >> Been through some pre-reviwing. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Update after Coleen No significant comments. All my concerns relate to naming and terminology, where I think there is scope for quite a bit of tidying up. Thanks. src/hotspot/share/runtime/handshake.hpp line 45: > 43: // a single target/direct handshake or not, by the JavaThread that requested the > 44: // handshake or the VMThread respectively. > 45: class HandshakeClosure : public ThreadClosure, public CHeapObj { Is use of multiple inheritance allowed within hotspot code? 
src/hotspot/share/runtime/handshake.hpp line 51: > 49: virtual ~HandshakeClosure() {} > 50: const char* name() const { return _name; } > 51: virtual bool is_asynch() { return false; }; I thought "asynch" has already been renamed to drop the 'h' everywhere? src/hotspot/share/runtime/handshake.hpp line 78: > 76: FilterQueue _queue; > 77: Mutex _lock; > 78: Thread* _active_handshaker; To be clear, the _handshakee is the always the target JavaThread, while the _active_handshaker, is the thread that is actually executing the handshake operation (ie do_thread). If so can you add comments on these declarations to clarify that. Thanks. src/hotspot/share/runtime/handshake.hpp line 90: > 88: void add_operation(HandshakeOperation* op); > 89: HandshakeOperation* pop_for_self(); > 90: HandshakeOperation* pop_for_processor(); What is "processor" in this context - the active handshaker? Can we not introduce yet another piece of terminology here. We should have consistency of naming when it comes to "self" and others. ie. we have pop_for_self() but has_operation() rather than has_operation_for_self() If we made the "self" case explicit then we could leave the not-self case implicit e.g. pop_for_self(); // Called by handshakee only pop(); // Called by handshaker or VMThread has_operation_for_self(); // Is there an operation that can be executed by the handshakee itself has_operation(); // Is there an operation that can be executed by the handshaker or VMThread We can then stop using "processor" in other places as well. src/hotspot/share/runtime/handshake.hpp line 96: > 94: return !_queue.is_empty(); > 95: } > 96: bool block_for_operation() { should_block_for_operation() ? Though looking at the loop that uses this the name doesn't seem right as we are not blocking but processing the operation. ?? 
src/hotspot/share/runtime/handshake.hpp line 97: > 95: } > 96: bool block_for_operation() { > 97: return !_queue.is_empty() || _lock.is_locked(); I really don't understand the is_locked() check in this condition. ?? And the check for !empty is racy, so how do we avoid missing an in-progress addition? src/hotspot/share/runtime/handshake.cpp line 44: > 42: protected: > 43: HandshakeClosure* _handshake_cl; > 44: int32_t _pending_threads; Not new but the meaning of _pending_threads is unclear - please add a descriptive comment. src/hotspot/share/runtime/handshake.cpp line 63: > 61: }; > 62: > 63: class AsyncHandshakeOperation : public HandshakeOperation { This doesn't quite make sense. If you have an AsyncHandshakeOperation as a distinct subclass then it should not be possible for is_async() on a HandshakeOperation to return true - but it can because it can be passed an AsyncHandshakeClosure when constructed. If you want async and non-async operations to be distinct types then you will need to restrict how the base class is constructed, and provide a protected constructor that just takes an AsyncHandShakeClosure. src/hotspot/share/runtime/handshake.cpp line 195: > 193: } > 194: > 195: static void log_handshake_info(jlong start_time_ns, const char* name, int targets, int requester_executed, const > char* extra = NULL) { It is not clear what "requester_executed" actually means here - why is this an int? what does it represent? Again we have new terminology "requester" - is that the handshakee or ??? src/hotspot/share/runtime/handshake.cpp line 244: > 242: // A new thread on the ThreadsList will not have an operation, > 243: // hence it is skipped in handshake_try_process. > 244: HandshakeState::ProcessResult pr = thr->handshake_state()->try_process(_op); To be clear on what can be happening here ... 
as the VMThread has to loop through all threads first to initiate the handshake, by the time it starts trying to process the op, the target threads may have already done it themselves. Additionally while looping through all threads, a thread that has not yet been handshaked could try to handshake with a thread the VMThread has already processed, and so it could also execute the operation before the VMThread gets to it. src/hotspot/share/runtime/handshake.cpp line 356: > 354: } > 355: > 356: HandshakeState::HandshakeState(JavaThread* thread) : s/thread/target/ for clarity src/hotspot/share/runtime/handshake.cpp line 412: > 410: if (op != NULL) { > 411: assert(op->_target == NULL || op->_target == Thread::current(), "Wrong thread"); > 412: assert(_handshakee == Thread::current(), "Wrong thread"); You already asserted this at line 400. src/hotspot/share/runtime/thread.hpp line 1358: > 1356: HandshakeState* handshake_state() { return &_handshake; } > 1357: > 1358: // A JavaThread can always safely operate on it self and other threads s/it self/itself/ src/hotspot/share/runtime/thread.hpp line 1359: > 1357: > 1358: // A JavaThread can always safely operate on it self and other threads > 1359: // can do it safely it if they are the active handshaker. s/it if/if/ src/hotspot/share/utilities/filterQueue.hpp line 32: > 30: > 31: template > 32: class FilterQueue { A brief description of the class would be good. It is basically a FIFO queue but with the ability to skip nodes that match a given "filter" criteria. src/hotspot/share/utilities/filterQueue.hpp line 34: > 32: class FilterQueue { > 33: private: > 34: class FilterQueueNode : public CHeapObj { The Filter in FilterQueueNode is redundant given this is a nested type. Node would suffice. src/hotspot/share/utilities/filterQueue.hpp line 56: > 54: } > 55: > 56: // MT-safe Not sure where our previous discussion is on this but these posix-style MT-safe labels don't really halp for this kind of abstract data type API. 
Please briefly explain the thread-safety properties of add and pop. src/hotspot/share/utilities/filterQueue.hpp line 57: > 55: > 56: // MT-safe > 57: void add(E data); It would be more regular naming to use add/remove or push/pop rather than add/pop. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/151 From dholmes at openjdk.java.net Wed Sep 23 04:13:05 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 23 Sep 2020 04:13:05 GMT Subject: RFR: 8238761: Asynchronous handshakes [v3] In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 21:59:40 GMT, Coleen Phillimore wrote: >> The thread executing this handshake operation, what the current thread is doesn't matter. >> You can't use current threads resources or be dependent otherwise on it. >> >> Exception being locking issues in JVM TI, where we are dependent that requester have locked JVM TI state lock for us, >> but we are not dependent that the current thread is the owner. So checking that the lock is held by requester doesn't >> matter for how is the 'driver'. > > The "driver" concept is odd. Should it really be caller? Like the thread that called VMHandshake? In this context "driver" is just the current thread, that called execute. 
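The FilterQueue discussed in this review is described as a FIFO queue whose pop can skip entries based on a filter. The following is a minimal sketch of that idea only. It assumes a mutex around every operation instead of the CAS-based insert described in the pull request, and the names `SimpleFilterQueue`, `push` and `pop` are hypothetical, chosen for this example (with the push/pop naming suggested in the review); it is not the HotSpot implementation.

```cpp
#include <functional>
#include <list>
#include <mutex>

// Simplified stand-in for a filtering FIFO queue.
// Assumption: one mutex guards all operations; the patch under review
// uses a lock-free insert instead, which is not shown here.
template <typename E>
class SimpleFilterQueue {
  std::list<E> _items;       // front = oldest element
  mutable std::mutex _lock;  // protects _items

 public:
  // Append at the tail, preserving FIFO order.
  void push(const E& data) {
    std::lock_guard<std::mutex> guard(_lock);
    _items.push_back(data);
  }

  // Remove the oldest element accepted by 'match' and store it in *out.
  // Elements rejected by 'match' are skipped and stay queued.
  // Returns false if no element was accepted.
  bool pop(const std::function<bool(const E&)>& match, E* out) {
    std::lock_guard<std::mutex> guard(_lock);
    for (auto it = _items.begin(); it != _items.end(); ++it) {
      if (match(*it)) {
        *out = *it;
        _items.erase(it);
        return true;
      }
    }
    return false;
  }

  bool is_empty() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _items.empty();
  }
};
```

Because a filtered pop can take an element from the middle, operations may complete out of insertion order, which is the trade-off the pull request description mentions for filtering.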
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From kim.barrett at oracle.com Wed Sep 23 04:51:21 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 23 Sep 2020 00:51:21 -0400 Subject: RFR: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: > On Sep 21, 2020, at 6:35 AM, Kim Barrett wrote: > >> On Sep 21, 2020, at 1:30 AM, Claes Redestad wrote: >> >> - Sort LogTag type enum alphabetically >> - Assert that the tags are sorted instead of sorting >> >> ------------- >> >> Commit messages: >> - Ensure LogTag types are sorted >> >> Changes: https://git.openjdk.java.net/jdk/pull/274/files >> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=274&range=00 >> Issue: https://bugs.openjdk.java.net/browse/JDK-8253397 >> Stats: 58 lines in 2 files changed: 24 ins; 22 del; 12 mod >> Patch: https://git.openjdk.java.net/jdk/pull/274.diff >> Fetch: git fetch https://git.openjdk.java.net/jdk pull/274/head:pull/274 >> >> PR: https://git.openjdk.java.net/jdk/pull/274 > > I think having the LOG_TAG_LIST in sorted order is good. > > But I think the checking can be improved. I think it can be done at > compile-time using constexpr, completely eliminating the runtime execution > and data. I'm currently prototyping; I'll let you know what I come up with. It's not hard to make the is_sorted check constexpr, but doesn't provide a real benefit compared to the proposed conditional compilation approach; neither has any additional code or computation in a product build. A pre-existing thing that should be fixed though: src/hotspot/share/logging/logTag.hpp 231 static const char* _name[]; That should be 231 static const char* const _name[]; And similarly in the .cpp file. Looks good with the above change. 
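As an illustration of the compile-time check mentioned above, a constexpr sortedness test over an array of tag names might look like the following C++14 sketch. It is only a sketch: the message itself concludes that this offers no real benefit over the conditional-compilation approach in the patch. The array `tag_names` and the functions `compare_names` and `names_are_sorted` are hypothetical names made up for this example, not HotSpot code; the array uses the `const char* const` element type suggested for `_name[]`.

```cpp
#include <cstddef>

// Hypothetical tag names; the real list is generated from LOG_TAG_LIST.
// The second 'const' makes the pointers themselves immutable.
static constexpr const char* const tag_names[] = {
  "age", "alloc", "gc", "heap", "safepoint"
};

// constexpr string comparison (std::strcmp is not usable in constant expressions).
constexpr int compare_names(const char* a, const char* b) {
  while (*a != '\0' && *a == *b) {
    ++a;
    ++b;
  }
  return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// True if every name is less than or equal to its successor.
template <std::size_t N>
constexpr bool names_are_sorted(const char* const (&names)[N]) {
  for (std::size_t i = 1; i < N; ++i) {
    if (compare_names(names[i - 1], names[i]) > 0) {
      return false;
    }
  }
  return true;
}

// Evaluated by the compiler; no code or data is needed at run time.
static_assert(names_are_sorted(tag_names), "tag names must be sorted alphabetically");
```

Swapping two entries in `tag_names` would turn the `static_assert` into a compile error.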
From shade at openjdk.java.net Wed Sep 23 05:30:24 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Sep 2020 05:30:24 GMT Subject: RFR: 8253457: Remove unimplemented register stack functions [v2] In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 19:00:12 GMT, Coleen Phillimore wrote: >> Please review removed functions left over from Itanium. >> >> Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on >> linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. >> Thanks, >> Coleen > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > added back unintentionally deleted functions Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/300 From rrich at openjdk.java.net Wed Sep 23 07:21:09 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Wed, 23 Sep 2020 07:21:09 GMT Subject: Integrated: 8253241: Update comment on java_suspend_self_with_safepoint_check() In-Reply-To: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> References: <6_dkEAXMQ0jLwWxWcjsToVSUV6PD_Jv0lTFMLHvYSgo=.8dd71e6f-da1d-4544-84b4-78c2265e78b6@github.com> Message-ID: On Thu, 17 Sep 2020 13:57:04 GMT, Richard Reingruber wrote: > After JDK-8252414 the safepoint/handshake code does not take _suspend_flags into accout anymore in its assessment if a > thread is safepoint/handshake safe. This change updates the comment on > JavaThread::java_suspend_self_with_safepoint_check(). I have (not yet) fixed the line breaks (fill-paragraph in emacs > lingo) for a clearer diff. > Also I could inline the (*) footnote. This pull request has now been integrated. 
Changeset: 226faa55 Author: Richard Reingruber URL: https://git.openjdk.java.net/jdk/commit/226faa55 Stats: 8 lines in 1 file changed: 0 ins; 2 del; 6 mod 8253241: Update comment on java_suspend_self_with_safepoint_check() Reviewed-by: dcubed, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/225 From rehn at openjdk.java.net Wed Sep 23 08:54:05 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 08:54:05 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 02:40:31 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/runtime/handshake.hpp line 45: > >> 43: // a single target/direct handshake or not, by the JavaThread that requested the >> 44: // handshake or the VMThread respectively. >> 45: class HandshakeClosure : public ThreadClosure, public CHeapObj { > > Is use of multiple inheritance allowed within hotspot code? There is no other 'sane' way to 'implement an interface' while also being heap allocated. We have a few like: class DefaultICProtectionBehaviour: public CompiledICProtectionBehaviour, public CHeapObj class AOTCompiledMethod : public CompiledMethod, public CHeapObj template class TaskQueueSetSuperImpl: public CHeapObj, public TaskQueueSetSuper So the 'rule' about multiple inheritance is in conflict with the rule not to use the new/delete operators without CHeapObj.
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Wed Sep 23 08:57:13 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 08:57:13 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 02:41:37 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/runtime/handshake.hpp line 51: > >> 49: virtual ~HandshakeClosure() {} >> 50: const char* name() const { return _name; } >> 51: virtual bool is_asynch() { return false; }; > > I thought "asynch" has already been renamed to drop the 'h' everywhere? Missed this one, fixing! ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From simonis at openjdk.java.net Wed Sep 23 09:02:11 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Wed, 23 Sep 2020 09:02:11 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:13:02 GMT, Severin Gehwolf wrote: > Did you run container tests with this? Yes. It took me some time to find out that I have to set `-Djdk.test.docker.image.name=ubuntu` and `-Djdk.test.docker.image.version=18.04` on Ubuntu in order to run them :) For release builds they all pass with and without the change (except `TestJFRWithJMX.java` which always fails, but I don' t think that's related to this issue). Fast- and slowdebug builds don't even start without the fix and pass all the tests with it (again except `TestJFRWithJMX.java`). 
------------- PR: https://git.openjdk.java.net/jdk/pull/295 From rehn at openjdk.java.net Wed Sep 23 09:13:43 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 09:13:43 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 02:45:27 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/runtime/handshake.hpp line 78: > >> 76: FilterQueue _queue; >> 77: Mutex _lock; >> 78: Thread* _active_handshaker; > > To be clear, the _handshakee is the always the target JavaThread, while the _active_handshaker, is the thread that is > actually executing the handshake operation (ie do_thread). If so can you add comments on these declarations to clarify > that. Thanks. Fixed > src/hotspot/share/runtime/handshake.hpp line 90: > >> 88: void add_operation(HandshakeOperation* op); >> 89: HandshakeOperation* pop_for_self(); >> 90: HandshakeOperation* pop_for_processor(); > > What is "processor" in this context - the active handshaker? Can we not introduce yet another piece of terminology > here. We should have consistency of naming when it comes to "self" and others. ie. we have > pop_for_self() > > but > > has_operation() > > rather than > > has_operation_for_self() > > If we made the "self" case explicit then we could leave the not-self case implicit e.g. > > pop_for_self(); // Called by handshakee only > pop(); // Called by handshaker or VMThread > has_operation_for_self(); // Is there an operation that can be executed by the handshakee itself > has_operation(); // Is there an operation that can be executed by the handshaker or VMThread > We can then stop using "processor" in other places as well. 
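The naming scheme proposed in the quoted comment can be sketched as follows, combined with the filtering rule from the pull request description (the handshakee applies no filter, any other thread skips asynchronous operations). This is a single-threaded illustration under those assumptions: `Op` and `HandshakeQueueSketch` are hypothetical stand-ins, and the locking and CAS logic of the real HandshakeState is left out entirely.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a handshake operation; only the async flag matters here.
struct Op {
  int  id;
  bool is_async;
};

// Queue-facing part of a handshake state using the proposed names:
// the "self" case is explicit, the handshaker/VMThread case is implicit.
class HandshakeQueueSketch {
  std::deque<Op> _queue;  // front = oldest operation

  // Remove the oldest operation, optionally skipping asynchronous ones.
  bool pop_impl(bool skip_async, Op* out) {
    for (std::size_t i = 0; i < _queue.size(); ++i) {
      if (skip_async && _queue[i].is_async) {
        continue;
      }
      *out = _queue[i];
      _queue.erase(_queue.begin() + static_cast<std::ptrdiff_t>(i));
      return true;
    }
    return false;
  }

  bool has_impl(bool skip_async) const {
    for (const Op& op : _queue) {
      if (!(skip_async && op.is_async)) {
        return true;
      }
    }
    return false;
  }

 public:
  void add_operation(const Op& op) { _queue.push_back(op); }

  // Called by the handshakee only: no filter, async operations included.
  bool pop_for_self(Op* out) { return pop_impl(false, out); }
  // Called by the handshaker or VMThread: async operations are skipped.
  bool pop(Op* out) { return pop_impl(true, out); }

  bool has_operation_for_self() const { return has_impl(false); }
  bool has_operation() const { return has_impl(true); }
};
```

With one asynchronous and one synchronous operation queued, `pop()` returns only the synchronous one, and the asynchronous one stays visible through `has_operation_for_self()` until the handshakee pops it.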
Fixing ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From simonis at openjdk.java.net Wed Sep 23 09:18:47 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Wed, 23 Sep 2020 09:18:47 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 08:59:37 GMT, Volker Simonis wrote: >> Did you run container tests with this? > >> Did you run container tests with this? > > Yes. It took me some time to find out that I have to set `-Djdk.test.docker.image.name=ubuntu` and > `-Djdk.test.docker.image.version=18.04` on Ubuntu in order to run them :) > For release builds they all pass with and without the change (except `TestJFRWithJMX.java` which always fails, but I > don't think that's related to this issue). Fast- and slowdebug builds don't even start without the fix and pass all > the tests with it (again except `TestJFRWithJMX.java`). > _Mailing list message from [Bob Vandette](mailto:bob.vandette at oracle.com) on > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > Yuk. I just fixed a bug which caused us to use the mount source for the cgroup type. Not fixing > that bug would have hidden your problem. > Sorry, but I don't understand. which bug are you speaking of and has it been fixed in the jdk already? > Are there any hints in /proc/self/cgroup or /proc/self/mounts that we could use to eliminate this manual mount? > Not that I'm aware of. I couldn't find any. > I'd be tempted to eliminate mountinfo entries that are 1) duplicate controllers and 2) not in "/sys/fs/cgroup" mount > point. It's not easy to remove the right duplicate :) Checking for `/sys/fs/cgroup` as mount point is probably safer although that path is also just a convention as far as I know.
So what about the following solution: - record the mount point for `memory`, `cpu` and `cpuacct` - when hitting `cpuset` and its mountpoint is different from the recorded one ignore it. If there's no recorded mount prefix ignore the entry if its mount prefix is not `/sys/fs/cgroup`. I think this is the best we can do if we don't want to parse `mountinfo` two times. What do you think? > Bob. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From rehn at openjdk.java.net Wed Sep 23 09:20:31 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 09:20:31 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 02:54:09 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/runtime/handshake.hpp line 96: > >> 94: return !_queue.is_empty(); >> 95: } >> 96: bool block_for_operation() { > > should_block_for_operation() ? Though looking at the loop that uses this the name doesn't seem right as we are not > blocking but processing the operation. ?? Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From sgehwolf at openjdk.java.net Wed Sep 23 09:29:38 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 23 Sep 2020 09:29:38 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 08:59:37 GMT, Volker Simonis wrote: > > Did you run container tests with this? > > Yes. It took me some time to find out that I have to set `-Djdk.test.docker.image.name=ubuntu` and > `-Djdk.test.docker.image.version=18.04` on Ubuntu in order to run them :) Yes, this is a bit painful. I should have said that. 
> For release builds they all pass with and without the change (except `TestJFRWithJMX.java` which always fails, but I
> don't think that's related to this issue). Fast- and slowdebug builds don't even start without the fix and pass all
> the tests with it (again except `TestJFRWithJMX.java`).

Hmm, which ones did you run? It seems odd that they fail to run in fastdebug config.

FWIW, I've crafted a regression test for this issue. Please include something like that if you can:
https://github.com/simonis/jdk/pull/1

A fix like this should make it pass (uses the `/sys/fs/cgroup` convention) - on top of your change:

diff --git a/src/hotspot/os/linux/cgroupSubsystem_linux.cpp b/src/hotspot/os/linux/cgroupSubsystem_linux.cpp
index 21be7a8260c..4c6ae541929 100644
--- a/src/hotspot/os/linux/cgroupSubsystem_linux.cpp
+++ b/src/hotspot/os/linux/cgroupSubsystem_linux.cpp
@@ -295,10 +295,10 @@ bool CgroupSubsystemFactory::determine_type(CgroupInfo* cg_infos,
       // Skip cgroup2 fs lines on hybrid or unified hierarchy.
       continue;
     }
-    if (strcmp("none", tmpsource) == 0) {
-      // Skip cpusets created manually or by cset/cpuset (https://github.com/lpechacek/cpuset)
-      // The "mount source" for these mounts is usually "none" while the source of "true" Cgroup
-      // controllers is usually "cgroup". But this is just another heuristic...
+    if (strcmp(tmpmount, "/sys/fs/cgroup") < 0) {
+      // Skip potentially duplicate, manually mounted cgroup controllers
+      // not on /sys/fs/cgroup
+      log_info(os, container)("%s not mounted at /sys/fs/cgroup, skipping!", tmpmount);

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From sgehwolf at openjdk.java.net Wed Sep 23 09:34:26 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Wed, 23 Sep 2020 09:34:26 GMT
Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist
In-Reply-To:
References:
Message-ID:

On Wed, 23 Sep 2020 09:15:38 GMT, Volker Simonis wrote:

> So what about the following solution:
>
> * record the mount point for `memory`, `cpu` and `cpuacct`
> * when hitting `cpuset` and its mountpoint is different from the recorded one ignore it. If there's no recorded mount
>   prefix ignore the entry if its mount prefix is not `/sys/fs/cgroup`.

Sounds sensible to me.

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From rehn at openjdk.java.net Wed Sep 23 09:37:56 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Wed, 23 Sep 2020 09:37:56 GMT
Subject: RFR: 8238761: Asynchronous handshakes [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 23 Sep 2020 02:56:00 GMT, David Holmes wrote:

>> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Update after Coleen
>
> src/hotspot/share/runtime/handshake.hpp line 97:
>
>> 95:   }
>> 96:   bool block_for_operation() {
>> 97:     return !_queue.is_empty() || _lock.is_locked();
>
> I really don't understand the is_locked() check in this condition. ??
> And the check for !empty is racy, so how do we avoid missing an in-progress addition?

A JavaThread is blocked. A second thread has just executed a handshake operation for this JavaThread and is on line:
https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L510
And the queue is empty.
The JavaThread wakes up and changes state from blocked to blocked_trans. It now checks whether it is allowed to complete the transition to, e.g., vm. If a third thread adds to the queue before the second thread leaves the loop, its operation can be executed, but the JavaThread could see the queue as empty (racy, as you say).

The executor takes the lock and then checks whether the JavaThread is safe for processing. The JavaThread becomes unsafe and then checks whether the lock is locked. If the lock is locked, we must take the slow path to avoid this. We should also take the slow path if there is something on the queue to process.

We are unsafe when we check the queue. If the lock is not held and we 'miss' that something is on the queue, that is fine: no other thread can have seen us as safe and also seen the item on the queue (since the lock is not held), so no thread is allowed to process the operation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/151

From martin.doerr at sap.com Wed Sep 23 09:39:37 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 23 Sep 2020 09:39:37 +0000
Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms
In-Reply-To:
References:
Message-ID:

Hi Gerard,

sorry for the long delay. It took time to get our nightly tests working again on AIX.
I have seen an issue, but it may be unrelated to your change. We'll retest it.

Note that there's still unused code left in os_aix.cpp (see below).
Thanks again for taking care of AIX. I appreciate having more shared POSIX code.

Best regards,
Martin


diff --git a/src/hotspot/os/aix/os_aix.cpp b/src/hotspot/os/aix/os_aix.cpp
index 02568cf..eb84217 100644
--- a/src/hotspot/os/aix/os_aix.cpp
+++ b/src/hotspot/os/aix/os_aix.cpp
@@ -1550,134 +1550,6 @@ void os::print_jni_name_suffix_on(outputStream* st, int args_size) {
   // no suffix required
 }

-////////////////////////////////////////////////////////////////////////////////
-// sun.misc.Signal support
-
-static void
-UserHandler(int sig, void *siginfo, void *context) {
-  // Ctrl-C is pressed during error reporting, likely because the error
-  // handler fails to abort. Let VM die immediately.
-  if (sig == SIGINT && VMError::is_error_reported()) {
-    os::die();
-  }
-
-  os::signal_notify(sig);
-}
-
-extern "C" {
-  typedef void (*sa_handler_t)(int);
-  typedef void (*sa_sigaction_t)(int, siginfo_t *, void *);
-}
-
-//
-// The following code is moved from os.cpp for making this
-// code platform specific, which it is by its very nature.
-//
-
-// a counter for each possible signal value
-static volatile jint pending_signals[NSIG+1] = { 0 };
-
-// Wrapper functions for: sem_init(), sem_post(), sem_wait()
-// On AIX, we use sem_init(), sem_post(), sem_wait()
-// On Pase, we need to use msem_lock() and msem_unlock(), because Posix Semaphores
-// do not seem to work at all on PASE (unimplemented, will cause SIGILL).
-// Note that just using msem_.. APIs for both PASE and AIX is not an option either, as
-// on AIX, msem_..() calls are suspected of causing problems.
-static sem_t sig_sem;
-static msemaphore* p_sig_msem = 0;
-
-static void local_sem_init() {
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_init(&sig_sem, 0, 0);
-    guarantee(rc != -1, "sem_init failed");
-  } else {
-    // Memory semaphores must live in shared mem.
-    guarantee0(p_sig_msem == NULL);
-    p_sig_msem = (msemaphore*)os::reserve_memory(sizeof(msemaphore), NULL);
-    guarantee(p_sig_msem, "Cannot allocate memory for memory semaphore");
-    guarantee(::msem_init(p_sig_msem, 0) == p_sig_msem, "msem_init failed");
-  }
-}
-
-static void local_sem_post() {
-  static bool warn_only_once = false;
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_post(&sig_sem);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("sem_post failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  } else {
-    guarantee0(p_sig_msem != NULL);
-    int rc = ::msem_unlock(p_sig_msem, 0);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("msem_unlock failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  }
-}
-
-static void local_sem_wait() {
-  static bool warn_only_once = false;
-  if (os::Aix::on_aix()) {
-    int rc = ::sem_wait(&sig_sem);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("sem_wait failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  } else {
-    guarantee0(p_sig_msem != NULL); // must init before use
-    int rc = ::msem_lock(p_sig_msem, 0);
-    if (rc == -1 && !warn_only_once) {
-      trcVerbose("msem_lock failed (errno = %d, %s)", errno, os::errno_name(errno));
-      warn_only_once = true;
-    }
-  }
-}
-
-static void jdk_misc_signal_init() {
-  // Initialize signal structures
-  ::memset((void*)pending_signals, 0, sizeof(pending_signals));
-
-  // Initialize signal semaphore
-  local_sem_init();
-}
-
-static int check_pending_signals() {
-  for (;;) {
-    for (int i = 0; i < NSIG + 1; i++) {
-      jint n = pending_signals[i];
-      if (n > 0 && n == Atomic::cmpxchg(&pending_signals[i], n, n - 1)) {
-        return i;
-      }
-    }
-    JavaThread *thread = JavaThread::current();
-    ThreadBlockInVM tbivm(thread);
-
-    bool threadIsSuspended;
-    do {
-      thread->set_suspend_equivalent();
-      // cleared by handle_special_suspend_equivalent_condition() or java_suspend_self()
-
-      local_sem_wait();
-
-      // were we externally suspended while we were waiting?
-      threadIsSuspended = thread->handle_special_suspend_equivalent_condition();
-      if (threadIsSuspended) {
-        //
-        // The semaphore has been incremented, but while we were waiting
-        // another thread suspended us. We don't want to continue running
-        // while suspended because that would surprise the thread that
-        // suspended us.
-        //
-
-        local_sem_post();
-
-        thread->java_suspend_self();
-      }
-    } while (threadIsSuspended);
-  }
-}

 ////////////////////////////////////////////////////////////////////////////////
 // Virtual Memory


> -----Original Message-----
> From: hotspot-runtime-dev
> On Behalf Of Gerard Ziemski
> Sent: Wednesday, 16 September 2020 18:00
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: RFR: 8252324: Signal related code should be shared among POSIX
> platforms
>
> hi all,
>
> Please review this change that refactors common POSIX code into a separate
> file.
>
> Currently there appears to be quite a bit of duplicated code among POSIX
> platforms, which makes it difficult to apply a single fix to the signal code.
> With this fix, we will only need to touch a single file for common POSIX
> code fixes from now on.
>
> ----------------------------------------------------------------------------
> The APIs which moved from os/bsd/os_bsd.cpp to
> os/posix/PosixSignals.cpp:
>
> ////////////////////////////////////////////////////////////////////////////////
> // signal support
> void os::Bsd::signal_sets_init()
> sigset_t* os::Bsd::unblocked_signals()
> sigset_t* os::Bsd::vm_signals()
> void os::Bsd::hotspot_sigmask(Thread* thread)
> ////////////////////////////////////////////////////////////////////////////////
> // sun.misc.Signal support
> static void UserHandler(int sig, void *siginfo, void *context)
> void* os::user_handler()
> void* os::signal(int signal_number, void* handler)
> void os::signal_raise(int signal_number)
> int os::sigexitnum_pd()
> static void jdk_misc_signal_init()
> void os::signal_notify(int sig)
> static int check_pending_signals()
> int os::signal_wait()
> ////////////////////////////////////////////////////////////////////////////////
> // suspend/resume support
> static void resume_clear_context(OSThread *osthread)
> static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo,
> ucontext_t* context)
> static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context)
> static int SR_initialize()
> static int sr_notify(OSThread* osthread)
> static bool do_suspend(OSThread* osthread)
> static void do_resume(OSThread* osthread)
> ///////////////////////////////////////////////////////////////////////////////////
> // signal handling (except suspend/resume)
> static void signalHandler(int sig, siginfo_t* info, void* uc)
> struct sigaction* os::Bsd::get_chained_signal_action(int sig)
> static bool call_chained_handler(struct sigaction *actp, int sig,
> siginfo_t *siginfo, void *context)
> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context)
> int os::Bsd::get_our_sigflags(int sig)
> void os::Bsd::set_our_sigflags(int sig, int flags)
> void os::Bsd::set_signal_handler(int sig, bool set_installed)
> void os::Bsd::install_signal_handlers()
> static const char* get_signal_handler_name(address handler,
> char* buf, int buflen)
> static void print_signal_handler(outputStream* st, int sig,
> char* buf, size_t buflen)
> void os::run_periodic_checks()
> void os::Bsd::check_signal_handler(int sig)
>
> -----------------------------------------------------------------------------
> The APIs which moved from os/posix/os_posix.cpp to
> os/posix/PosixSignals.cpp:
>
> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen)
> int os::Posix::get_signal_number(const char* signal_name)
> int os::get_signal_number(const char* signal_name)
> bool os::Posix::is_valid_signal(int sig)
> bool os::Posix::is_sig_ignored(int sig)
> const char* os::exception_name(int sig, char* buf, size_t size)
> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char*
> buffer, size_t buf_size)
> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set)
> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size)
> void os::Posix::print_sa_flags(outputStream* st, int flags)
> static bool get_signal_code_description(const siginfo_t* si,
> enum_sigcode_desc_t* out)
> void os::print_siginfo(outputStream* os, const void* si0)
> bool os::signal_thread(Thread* thread, int sig, const char* reason)
> int os::Posix::unblock_thread_signal_mask(const sigset_t *set)
> address os::Posix::ucontext_get_pc(const ucontext_t* ctx)
> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc)
> struct sigaction* os::Posix::get_preinstalled_handler(int sig)
> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct)
>
>
> --------------------------------------------------------
> --------------------------------------------------------
>
> DETAILS:
>
> --------------------------------------------------------
> Public APIs which are now internal static PosixSignals::
>
> sigset_t* os::Bsd::vm_signals()
> struct sigaction* os::Bsd::get_chained_signal_action(int sig)
> int os::Bsd::get_our_sigflags(int sig)
> void os::Bsd::set_our_sigflags(int sig, int flags)
> void os::Bsd::set_signal_handler(int sig, bool set_installed)
> void os::Bsd::check_signal_handler(int sig)
> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen)
> bool os::Posix::is_valid_signal(int sig)
> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char*
> buffer, size_t buf_size)
> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set)
> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size)
> void os::Posix::print_sa_flags(outputStream* st, int flags)
> static bool get_signal_code_description(const siginfo_t* si,
> enum_sigcode_desc_t* out)
> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct)
>
> ------------------------------------------------
> Public APIs which moved to public PosixSignals::
>
> void os::Bsd::signal_sets_init()
> void os::Bsd::hotspot_sigmask(Thread* thread)
> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context)
> void os::Bsd::install_signal_handlers()
> bool os::Posix::is_sig_ignored(int sig)
> int os::Posix::unblock_thread_signal_mask(const sigset_t *set)
> address os::Posix::ucontext_get_pc(const ucontext_t* ctx)
> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc)
>
> ----------------------------------------------------
> Internal APIs which are now public in PosixSignals::
>
> static void jdk_misc_signal_init()
> static int SR_initialize()
> static bool do_suspend(OSThread* osthread)
> static void do_resume(OSThread* osthread)
> static void print_signal_handler(outputStream* st, int sig, char* buf, size_t
> buflen)
>
> --------------------------
> New APIs in PosixSignals::
>
> static bool are_signal_handlers_installed();
>
> -------------
>
> Commit messages:
>  - removed white spaces
>  - Refactored common POSIX signal code into seperate file
>
> Changes: https://git.openjdk.java.net/jdk/pull/157/files
> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=00
> Issue: https://bugs.openjdk.java.net/browse/JDK-8252324
> Stats: 5247 lines in 21 files changed: 1740 ins; 3400 del; 107 mod
> Patch: https://git.openjdk.java.net/jdk/pull/157.diff
> Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157
>
> PR: https://git.openjdk.java.net/jdk/pull/157

From sgehwolf at openjdk.java.net Wed Sep 23 09:41:17 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Wed, 23 Sep 2020 09:41:17 GMT
Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist
In-Reply-To:
References:
Message-ID:

On Wed, 23 Sep 2020 09:15:38 GMT, Volker Simonis wrote:

> > _Mailing list message from [Bob Vandette](mailto:bob.vandette at oracle.com) on
> > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ Yuk. I just fixed a bug which caused us to use the
> > mount source for the cgroup type. Not fixing that bug would have hidden your problem.
>
> Sorry, but I don't understand. Which bug are you speaking of and has it been fixed in the jdk already?

@bobvandette probably meant https://bugs.openjdk.java.net/browse/JDK-8252359.

A little correction, though. We didn't use the mount source as the cgroup type before JDK-8252359, but we relied on the mount source to be `cgroup` or `cgroup2` before JDK-8252359, which wasn't the case on those affected systems. They had the controller name as the mount source. Bob is right, though: prior to JDK-8252359, you wouldn't have hit the assert because of what I just said above. Your extra cpuset entries have `none` as mount source.
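
To make the "mount source" discussion concrete, here is a minimal sketch of how the mount point, filesystem type and mount source can be pulled out of one `/proc/self/mountinfo` line. It is not the HotSpot parser: `MountInfoEntry` and `parse_mountinfo_line` are hypothetical names, and the sketch assumes the usual mountinfo layout (six fixed fields, optional fields, a `-` separator, then filesystem type, mount source and super options).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the fields of interest (illustration only).
struct MountInfoEntry {
  std::string mount_point;    // 5th field, e.g. /sys/fs/cgroup/cpuset
  std::string fs_type;        // first field after the "-" separator
  std::string mount_source;   // second field after the separator, e.g. cgroup or none
  std::string super_options;  // third field after the separator, e.g. rw,cpuset
};

// Parses one mountinfo line. Returns false if the line does not have the
// assumed layout. Paths containing spaces are escaped in mountinfo, so
// splitting on whitespace is sufficient for this sketch.
bool parse_mountinfo_line(const std::string& line, MountInfoEntry* out) {
  assert(out != nullptr);
  std::istringstream in(line);
  std::vector<std::string> tokens;
  for (std::string tok; in >> tok; ) {
    tokens.push_back(tok);
  }
  // Skip the six fixed fields, then any optional fields up to the "-".
  std::size_t sep = 6;
  while (sep < tokens.size() && tokens[sep] != "-") {
    ++sep;
  }
  if (sep + 3 >= tokens.size()) {
    return false;  // separator missing or trailing fields incomplete
  }
  out->mount_point   = tokens[4];
  out->fs_type       = tokens[sep + 1];
  out->mount_source  = tokens[sep + 2];
  out->super_options = tokens[sep + 3];
  return true;
}
```

With a made-up line such as `121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset`, the sketch reports `/cpusets` as mount point and `none` as mount source, which is the kind of entry described above.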

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From martin.doerr at sap.com Wed Sep 23 09:51:15 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 23 Sep 2020 09:51:15 +0000
Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms
In-Reply-To:
References:
Message-ID:

Sorry, Gerard's email address was wrong.

> -----Original Message-----
> From: Doerr, Martin
> Sent: Wednesday, 23 September 2020 11:40
> To: Gerard Ziemski ; hotspot-runtime-dev at openjdk.java.net
> Cc: Stuefe, Thomas
> Subject: RE: RFR: 8252324: Signal related code should be shared among POSIX
> platforms

From sgehwolf at openjdk.java.net Wed Sep 23 10:08:01 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Wed, 23 Sep 2020 10:08:01 GMT
Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist
In-Reply-To:
References:
Message-ID:

On Wed, 23 Sep 2020 09:26:13 GMT, Severin Gehwolf wrote:

> > > Did you run container tests with this?

For the record, the important one to run is this one:
test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java

It's independent of your host's cgroup files. I believe that test broke with your proposed v1 fix because after your patch any `none` entries would be skipped. All of them are `none` for this test after JDK-8252359.
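
The cpuset heuristic proposed earlier in this thread (record where `memory`, `cpu` and `cpuacct` are mounted, then ignore a `cpuset` entry mounted elsewhere, falling back to `/sys/fs/cgroup` when nothing was recorded) could look roughly like the sketch below. This is an illustration, not the patch under review: `MountFilter`, `parent_dir` and `should_skip` are hypothetical names, and comparing parent directories of the mount points is an assumption about what "mount prefix" means.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not HotSpot code) that decides whether a cgroup v1
// mountinfo entry should be ignored.
struct MountFilter {
  // Parent directory of the first memory/cpu/cpuacct mount seen so far.
  std::string recorded_prefix;

  // Returns the directory part of an absolute path, e.g.
  // /sys/fs/cgroup/memory -> /sys/fs/cgroup and /cpusets -> /.
  static std::string parent_dir(const std::string& path) {
    assert(!path.empty() && path[0] == '/');
    const std::string::size_type pos = path.rfind('/');
    return pos == 0 ? std::string("/") : path.substr(0, pos);
  }

  // Returns true if the entry should be skipped.
  bool should_skip(const std::string& controller, const std::string& mount_point) {
    const std::string prefix = parent_dir(mount_point);
    if (controller == "memory" || controller == "cpu" || controller == "cpuacct") {
      if (recorded_prefix.empty()) {
        recorded_prefix = prefix;  // remember where the "real" controllers live
      }
      return false;
    }
    if (controller == "cpuset") {
      // Fall back to the conventional location if nothing was recorded yet.
      const std::string expected =
          recorded_prefix.empty() ? std::string("/sys/fs/cgroup") : recorded_prefix;
      return prefix != expected;
    }
    return false;
  }
};
```

Unlike a check on the mount source, this does not depend on entries being `none`, so a test whose entries all use `none` would not have every line skipped. Whether a prefix comparison is robust enough for unusual container setups is a separate question that the sketch does not answer.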

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From rehn at openjdk.java.net Wed Sep 23 10:11:40 2020
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Wed, 23 Sep 2020 10:11:40 GMT
Subject: RFR: 8238761: Asynchronous handshakes [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 23 Sep 2020 03:04:39 GMT, David Holmes wrote:

>> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Update after Coleen
>
> src/hotspot/share/runtime/handshake.cpp line 63:
>
>> 61: };
>> 62:
>> 63: class AsyncHandshakeOperation : public HandshakeOperation {
>
> This doesn't quite make sense. If you have an AsyncHandshakeOperation as a distinct subclass then it should not be
> possible for is_async() on a HandshakeOperation to return true - but it can because it can be passed an
> AsyncHandshakeClosure when constructed. If you want async and non-async operations to be distinct types then you will
> need to restrict how the base class is constructed, and provide a protected constructor that just takes an
> AsyncHandshakeClosure.

This is implementation code, not part of the interface. By casting the AsyncHandshakeClosure to a HandshakeClosure before instantiating the HandshakeOperation you can still get is_async() to return true. And there are loads of other user errors that can be made, like stack allocating an AsyncHandshakeOperation. Protecting against all those kinds of errors requires a lot more code.

> src/hotspot/share/runtime/handshake.cpp line 44:
>
>> 42: protected:
>> 43:   HandshakeClosure* _handshake_cl;
>> 44:   int32_t _pending_threads;
>
> Not new but the meaning of _pending_threads is unclear - please add a descriptive comment.
Fixed > src/hotspot/share/runtime/handshake.cpp line 195: > >> 193: } >> 194: >> 195: static void log_handshake_info(jlong start_time_ns, const char* name, int targets, int requester_executed, const >> char* extra = NULL) { > > It is not clear what "requester_executed" actually means here - why is this an int? what does it represent? > Again we have new terminology "requester" - is that the handshakee or ??? Fixed > src/hotspot/share/runtime/handshake.cpp line 356: > >> 354: } >> 355: >> 356: HandshakeState::HandshakeState(JavaThread* thread) : > > s/thread/target/ for clarity Fixed > src/hotspot/share/runtime/handshake.cpp line 412: > >> 410: if (op != NULL) { >> 411: assert(op->_target == NULL || op->_target == Thread::current(), "Wrong thread"); >> 412: assert(_handshakee == Thread::current(), "Wrong thread"); > > You already asserted this at line 400. Fixed > src/hotspot/share/runtime/thread.hpp line 1358: > >> 1356: HandshakeState* handshake_state() { return &_handshake; } >> 1357: >> 1358: // A JavaThread can always safely operate on it self and other threads > > s/it self/itself/ Fixed > src/hotspot/share/runtime/thread.hpp line 1359: > >> 1357: >> 1358: // A JavaThread can always safely operate on it self and other threads >> 1359: // can do it safely it if they are the active handshaker. > > s/it if/if/ Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Wed Sep 23 10:25:56 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 10:25:56 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 04:03:30 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/utilities/filterQueue.hpp line 32: > >> 30: >> 31: template >> 32: class FilterQueue { > > A brief description of the class would be good. 
It is basically a FIFO queue but with the ability to skip nodes that > match a given "filter" criterion. Added > src/hotspot/share/utilities/filterQueue.hpp line 34: > >> 32: class FilterQueue { >> 33: private: >> 34: class FilterQueueNode : public CHeapObj { > > The Filter in FilterQueueNode is redundant given this is a nested type. Node would suffice. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Wed Sep 23 10:52:27 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 10:52:27 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 04:06:46 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/utilities/filterQueue.hpp line 56: > >> 54: } >> 55: >> 56: // MT-safe > > Not sure where our previous discussion is on this but these posix-style MT-safe labels don't really help for this kind > of abstract data type API. Please briefly explain the thread-safety properties of add and pop. Added comment > src/hotspot/share/utilities/filterQueue.hpp line 57: > >> 55: >> 56: // MT-safe >> 57: void add(E data); > > It would be more regular naming to use add/remove or push/pop rather than add/pop. Changed to push.
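For readers following the FilterQueue comments, here is a minimal sketch of the queue shape described in this thread: push is a lock-free CAS on the first pointer, pop runs under a lock and returns the oldest element that matches a filter. Everything named here (FilterQueueSketch, _pop_lock, the use of std::atomic and std::mutex) is an assumption made for illustration; it is not the filterQueue.hpp code under review.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the FilterQueue discussed above; not the HotSpot code.
template <typename E>
class FilterQueueSketch {
  struct Node {
    E     data;
    Node* next;
  };

  std::atomic<Node*> _first{nullptr};  // newest node; older nodes follow via next
  std::mutex         _pop_lock;        // assumed stand-in for the handshake state lock

 public:
  ~FilterQueueSketch() {
    Node* cur = _first.load();
    while (cur != nullptr) {
      Node* next = cur->next;
      delete cur;
      cur = next;
    }
  }

  // Lock-free: may run concurrently with other push() calls and with pop().
  void push(E data) {
    Node* node = new Node{data, _first.load()};
    // On failure, compare_exchange_weak stores the current head into node->next.
    while (!_first.compare_exchange_weak(node->next, node)) {
    }
  }

  // Removes the oldest element accepted by match. Poppers are serialized by _pop_lock.
  template <typename Match>
  bool pop(Match match, E* out) {
    std::lock_guard<std::mutex> guard(_pop_lock);
    for (;;) {
      Node* found = nullptr;
      Node* found_prev = nullptr;
      Node* prev = nullptr;
      for (Node* cur = _first.load(); cur != nullptr; prev = cur, cur = cur->next) {
        if (match(cur->data)) {  // keep the last match seen: it is the oldest one
          found = cur;
          found_prev = prev;
        }
      }
      if (found == nullptr) {
        return false;
      }
      if (found_prev == nullptr) {
        // found is the head, which push() may be replacing concurrently: use CAS.
        Node* expected = found;
        if (!_first.compare_exchange_strong(expected, found->next)) {
          continue;  // a push won the race; rescan from the new head
        }
      } else {
        found_prev->next = found->next;  // interior links change only under the lock
      }
      *out = found->data;
      delete found;
      return true;
    }
  }
};
```

With elements 1 to 4 pushed in that order, a pop with an "is even" filter skips 1 and yields 2, after which unfiltered pops yield 1, 3 and 4. That skipping is the out-of-order execution the PR description warns about when filtering is used.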
------------- PR: https://git.openjdk.java.net/jdk/pull/151 From rehn at openjdk.java.net Wed Sep 23 10:55:20 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 10:55:20 GMT Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 03:20:02 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after Coleen > > src/hotspot/share/runtime/handshake.cpp line 244: > >> 242: // A new thread on the ThreadsList will not have an operation, >> 243: // hence it is skipped in handshake_try_process. >> 244: HandshakeState::ProcessResult pr = thr->handshake_state()->try_process(_op); > > To be clear on what can be happening here ... as the VMThread has to loop through all threads first to initiate the > handshake, by the time it starts trying to process the op, the target threads may have already done it themselves. > Additionally while looping through all threads, a thread that has not yet been handshaked could try to handshake with a > thread the VMThread has already processed, and so it could also execute the operation before the VMThread gets to it. Yes ------------- PR: https://git.openjdk.java.net/jdk/pull/151 From kbarrett at openjdk.java.net Wed Sep 23 11:12:11 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 23 Sep 2020 11:12:11 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:11:32 GMT, Claes Redestad wrote: > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting Marked as reviewed by kbarrett (Reviewer). 
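The "assert that the tags are sorted instead of sorting" idea from 8253397 can be sketched roughly as below. The tag names and the helper names_are_sorted are made up for illustration; this is not the LogTag code from the patch, only the general shape of checking a hand-maintained table at startup rather than sorting it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical name table, kept in alphabetical order by hand.
static const char* const sketch_tag_names[] = {"age", "alloc", "gc", "heap", "safepoint"};

// Returns true if names[0..count) is in non-decreasing strcmp order.
inline bool names_are_sorted(const char* const* names, size_t count) {
  for (size_t i = 1; i < count; i++) {
    if (strcmp(names[i - 1], names[i]) > 0) {
      return false;
    }
  }
  return true;
}
```

A debug build could call assert(names_are_sorted(sketch_tag_names, 5)) once during initialization, so an out-of-order addition fails fast instead of being silently re-sorted.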
------------- PR: https://git.openjdk.java.net/jdk/pull/274 From rehn at openjdk.java.net Wed Sep 23 11:20:32 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 11:20:32 GMT Subject: RFR: 8238761: Asynchronous handshakes [v6] In-Reply-To: References: Message-ID: > This patch implements asynchronous handshakes, which changes how handshakes work by default. Asynchronous handshakes > are executed only by the target, which means they may never be executed. (The target may block on a socket for the rest of the VM lifetime.) > Since we have several use-cases for them we can have many handshakes pending. (This should be very rare.) To be able to handle an > arbitrary number of handshakes this patch adds a per-JavaThread queue and heap-allocated HandshakeOperations. It's a > singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on the first > pointer, no lock needed. Pops are done while holding the per-handshake-state lock, and when working on the first > pointer they also use CAS. The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake > operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything > except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed, > filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake > operations to be done out of order. Since the filter determines who executes the operation, and not the invoked method, there > is now only one method to call when handshaking one thread. Some comments about the changes: > - HandshakeClosure uses ThreadClosure, since it is neat to use the same closure for both "all JavaThreads do" and "handshake > all threads". With heap allocation it cannot extend StackObj. I tested several ways to fix this, but those were very much > worse than this.
> > - I added an is_handshake_safe_for for checking if the current thread is operating on itself or is the handshaker of that > thread. > > - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not need a JavaThread when executing as a > handshaker on a JavaThread, e.g. the VM Thread can execute the handshake operation. > > - Added WB testing method. > > - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did. > > - Changed the handshake semaphores to mutexes to be able to handle deadlocks with lock ranking. > > - VM_HandshakeAllThreads is still a VM operation, since we do support half of the threads being handshaked before a > safepoint and half of them after, in many handshake all operations. > > - ThreadInVMForHandshake does not need to do a fenced transition since this is always a transition from unsafe to unsafe. > > - Added NoSafepointVerifier, we are thinking about supporting safepoints inside handshakes, but it's not needed at the > moment. To make sure that gets well tested if added, the NoSafepointVerifier will raise eyebrows. > > - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank. > > - Added filtered queue and gtest for it. > > Passes multiple t1-8 runs. > Been through some pre-reviewing.
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Update after David ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/cd784a75..31421807 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=04-05 Stats: 105 lines in 7 files changed: 27 ins; 5 del; 73 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From tschatzl at openjdk.java.net Wed Sep 23 11:21:20 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 23 Sep 2020 11:21:20 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:11:32 GMT, Claes Redestad wrote: > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting Marked as reviewed by tschatzl (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/274 From coleenp at openjdk.java.net Wed Sep 23 11:26:40 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 11:26:40 GMT Subject: Withdrawn: 8253457: Remove unimplemented register stack functions In-Reply-To: References: Message-ID: <49W0yrwzlGwI3kmMfzNPMCk0gWeKq9acvFxexAsAIZQ=.4c8110d8-a26f-4c7b-b17f-901f3e0c3b53@github.com> On Tue, 22 Sep 2020 14:55:39 GMT, Coleen Phillimore wrote: > Please review removed functions left over from Itanium. > > Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on > linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. > Thanks, > Coleen This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/300 From coleenp at openjdk.java.net Wed Sep 23 11:34:36 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 11:34:36 GMT Subject: Integrated: 8253457: Remove unimplemented register stack functions In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:55:39 GMT, Coleen Phillimore wrote: > Please review removed functions left over from Itanium. > > Ran tier1 testing on Oracle platforms (linux-x64, macos-x64, windows-x64 and linux-aarch64) and built on > linux-arm32,linux-ppc64le-debug,linux-s390x-debug,linux-x64-zero. > Thanks, > Coleen This pull request has now been integrated. Changeset: b8ea80af Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/b8ea80af Stats: 175 lines in 12 files changed: 165 ins; 0 del; 10 mod 8253457: Remove unimplemented register stack functions Reviewed-by: iklam, dholmes, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/300 From rehn at openjdk.java.net Wed Sep 23 11:39:34 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 11:39:34 GMT Subject: RFR: 8238761: Asynchronous handshakes [v7] In-Reply-To: References: Message-ID:
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Fixed trailing whitespace ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/151/files - new: https://git.openjdk.java.net/jdk/pull/151/files/31421807..94daf2c7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=151&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/151/head:pull/151 PR: https://git.openjdk.java.net/jdk/pull/151 From bob.vandette at oracle.com Wed Sep 23 11:57:14 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 23 Sep 2020 07:57:14 -0400 Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: <27183C10-0820-4B0C-A52B-0F9158BADD91@oracle.com> > On Sep 23, 2020, at 5:41 AM, Severin Gehwolf wrote: > > On Wed, 23 Sep 2020 09:15:38 GMT, Volker Simonis wrote: > >>> _Mailing list message from [Bob Vandette](mailto:bob.vandette at oracle.com) on >>> [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ Yuk. I just fixed a bug which caused us to use the >>> mount source for the cgroup type. Not fixing that bug would have hidden your problem. >> >> Sorry, but I don't understand. which bug are you speaking of and has it been fixed in the jdk already? > > @bobvandette probably meant https://bugs.openjdk.java.net/browse/JDK-8252359. A little correction, though.
We didn't > use the mount source as the cgroup type before JDK-8252359, but we relied on the mount source to be `cgroup` or > `cgroup2` before JDK-8252359, which wasn't the case on those affected systems. They had the controller name as the > mount source. That's the bug I was referring to. Bob. > > Bob is right, though, prior to JDK-8252359, you wouldn't have hit the assert because of what I just said above. Your extra > cpuset entries have `none` as mount source. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/295 From bob.vandette at oracle.com Wed Sep 23 12:05:11 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 23 Sep 2020 08:05:11 -0400 Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: > On Sep 23, 2020, at 5:34 AM, Severin Gehwolf wrote: > > On Wed, 23 Sep 2020 09:15:38 GMT, Volker Simonis wrote: > >> So what about the following solution: >> >> * record the mount point for `memory`, `cpu` and `cpuacct` >> * when hitting `cpuset` and its mountpoint is different from the recorded one ignore it. If there's no recorded mount >> prefix ignore the entry if its mount prefix is not `/sys/fs/cgroup`. > > Sounds sensible to me. I'm ok with that approach as well. Bob. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/295 From gziemski at openjdk.java.net Wed Sep 23 13:14:45 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 23 Sep 2020 13:14:45 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: On Mon, 14 Sep 2020 20:59:57 GMT, Gerard Ziemski wrote: > hi all, > > Please review this change that refactors common POSIX code into a separate > file. > > Currently there appears to be quite a bit of duplicated code among POSIX > platforms, which makes it difficult to apply a single fix to the signal code.
> With this fix, we will only need to touch a single file for common POSIX > code fixes from now on. > > ---------------------------------------------------------------------------- > The APIs which moved from os/bsd/os_bsd.cpp to os/posix/PosixSignals.cpp: > > //////////////////////////////////////////////////////////////////////////////// > // signal support > void os::Bsd::signal_sets_init() > sigset_t* os::Bsd::unblocked_signals() > sigset_t* os::Bsd::vm_signals() > void os::Bsd::hotspot_sigmask(Thread* thread) > //////////////////////////////////////////////////////////////////////////////// > // sun.misc.Signal support > static void UserHandler(int sig, void *siginfo, void *context) > void* os::user_handler() > void* os::signal(int signal_number, void* handler) > void os::signal_raise(int signal_number) > int os::sigexitnum_pd() > static void jdk_misc_signal_init() > void os::signal_notify(int sig) > static int check_pending_signals() > int os::signal_wait() > //////////////////////////////////////////////////////////////////////////////// > // suspend/resume support > static void resume_clear_context(OSThread *osthread) > static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) > static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) > static int SR_initialize() > static int sr_notify(OSThread* osthread) > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > /////////////////////////////////////////////////////////////////////////////////// > // signal handling (except suspend/resume) > static void signalHandler(int sig, siginfo_t* info, void* uc) > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > static bool call_chained_handler(struct sigaction *actp, int sig, > siginfo_t *siginfo, void *context) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int
sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::install_signal_handlers() > static const char* get_signal_handler_name(address handler, > char* buf, int buflen) > static void print_signal_handler(outputStream* st, int sig, > char* buf, size_t buflen) > void os::run_periodic_checks() > void os::Bsd::check_signal_handler(int sig) > > ----------------------------------------------------------------------------- > The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: > > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > int os::Posix::get_signal_number(const char* signal_name) > int os::get_signal_number(const char* signal_name) > bool os::Posix::is_valid_signal(int sig) > bool os::Posix::is_sig_ignored(int sig) > const char* os::exception_name(int sig, char* buf, size_t size) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::print_siginfo(outputStream* os, const void* si0) > bool os::signal_thread(Thread* thread, int sig, const char* reason) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > struct sigaction* os::Posix::get_preinstalled_handler(int sig) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > > -------------------------------------------------------- > -------------------------------------------------------- > > DETAILS: > > -------------------------------------------------------- > Public APIs which are now internal static
PosixSignals:: > > sigset_t* os::Bsd::vm_signals() > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::check_signal_handler(int sig) > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > bool os::Posix::is_valid_signal(int sig) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > ------------------------------------------------ > Public APIs which moved to public PosixSignals:: > > void os::Bsd::signal_sets_init() > void os::Bsd::hotspot_sigmask(Thread* thread) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > void os::Bsd::install_signal_handlers() > bool os::Posix::is_sig_ignored(int sig) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > > ---------------------------------------------------- > Internal APIs which are now public in PosixSignals:: > > static void jdk_misc_signal_init() > static int SR_initialize() > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) > > -------------------------- > New APIs in PosixSignals:: > > static bool are_signal_handlers_installed(); > _Mailing list message from [Doerr, Martin](mailto:martin.doerr
at sap.com) on > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > Hi Gerard, > > sorry for the long delay. It took time to get our nightly tests working again on AIX. > I have seen an issue, but it may be unrelated to your change. We'll retest it. I will take another look at AIX changes to see if I missed something. > Note that there's still unused code left in os_aix.cpp (see below). > Thanks again for taking care of AIX. I appreciate having more shared POSIX code. Thank you for the diff, I'll use it and update the webrev. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From redestad at openjdk.java.net Wed Sep 23 13:56:41 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 23 Sep 2020 13:56:41 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted [v2] In-Reply-To: References: Message-ID: > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - Make LogTag::_name const - Ensure LogTag types are sorted ------------- Changes: https://git.openjdk.java.net/jdk/pull/274/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=274&range=01 Stats: 60 lines in 2 files changed: 24 ins; 22 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/274.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/274/head:pull/274 PR: https://git.openjdk.java.net/jdk/pull/274 From redestad at openjdk.java.net Wed Sep 23 14:17:53 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Wed, 23 Sep 2020 14:17:53 GMT Subject: Integrated: 8253397: Ensure LogTag types are sorted In-Reply-To: References: Message-ID: On Mon, 21 Sep 2020 05:11:32 GMT, Claes Redestad wrote: > - Sort LogTag type enum alphabetically > - Assert that the tags are sorted instead of sorting This pull request has now been integrated. 
Changeset: 5f1d6120 Author: Claes Redestad URL: https://git.openjdk.java.net/jdk/commit/5f1d6120 Stats: 61 lines in 2 files changed: 23 ins; 25 del; 13 mod 8253397: Ensure LogTag types are sorted Reviewed-by: dholmes, kbarrett, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/274 From coleenp at openjdk.java.net Wed Sep 23 14:39:25 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 14:39:25 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function Message-ID: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and unintentionally install a async exception. Tested with tier1-6. ------------- Commit messages: - 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function Changes: https://git.openjdk.java.net/jdk/pull/320/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=320&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253540 Stats: 11 lines in 1 file changed: 0 ins; 4 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/320.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/320/head:pull/320 PR: https://git.openjdk.java.net/jdk/pull/320 From gziemski at openjdk.java.net Wed Sep 23 15:30:27 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 23 Sep 2020 15:30:27 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v2] In-Reply-To: References: Message-ID:
Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: Remove leftover AIX signal code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/157/files - new: https://git.openjdk.java.net/jdk/pull/157/files/4bc34bc1..f3e3dd85 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=00-01 Stats: 129 lines in 1 file changed: 0 ins; 129 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157 PR: https://git.openjdk.java.net/jdk/pull/157 From rehn at openjdk.java.net Wed Sep 23 15:30:51 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 23 Sep 2020 15:30:51 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: <4ku2THydFU3O7EPgI9EYZkGD6aOJrUy-qlq4LJAPxQw=.a1ce42b6-8229-42ee-ae3d-e6a64071c21c@github.com> On Wed, 23 Sep 2020 14:32:21 GMT, Coleen Phillimore wrote: > That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so > the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. > Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and > unintentionally install a async exception. > Tested with tier1-6. Marked as reviewed by rehn (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From bulasevich at openjdk.java.net Wed Sep 23 15:46:09 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Wed, 23 Sep 2020 15:46:09 GMT Subject: RFR: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:12:42 GMT, Aleksey Shipilev wrote: > There is a regression introduced by addition of ARMv7-specific block by JDK-8211387. It readily manifests as crash > during jcstress initialization, and investigation points at broken volatile stores. Reverting JDK-8211387 from head JDK > makes ARM32 start and run jcstress. The underlying reason seems to be the half-done `atomic_copy64`: it does the load > with exclusive load, but then defers to the C++ store. Somewhere during handing over the value from the asm load to C++ > store and/or C++ store itself, we garble the value. The way out is to implement the whole thing in asm. Also see > `StubGenerator::generate_atomic_load_long` and `StubGenerator::generate_atomic_store_long` in `stubGenerator_arm.cpp`, > that do roughly the same thing and were the basis for this implementation. Attention @theRealAph, @bulasevich. > > Testing: > - [x] ARM32 Linux zero release jcstress run src/hotspot/os_cpu/linux_zero/os_linux_zero.hpp line 79: > 77: asm volatile ("ldrexd %[tmp_r], [%[src]]\n" > 78: "clrex\n" > 79: "1:\n" The change is good. Minor remarks: I don't see reason of tmp_r(_w) naming and I'd prefer meaningful label name. 
------------- PR: https://git.openjdk.java.net/jdk/pull/299 From shade at openjdk.java.net Wed Sep 23 16:05:39 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Sep 2020 16:05:39 GMT Subject: RFR: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 15:43:18 GMT, Boris Ulasevich wrote: >> There is a regression introduced by addition of ARMv7-specific block by JDK-8211387. It readily manifests as crash >> during jcstress initialization, and investigation points at broken volatile stores. Reverting JDK-8211387 from head JDK >> makes ARM32 start and run jcstress. The underlying reason seems to be the half-done `atomic_copy64`: it does the load >> with exclusive load, but then defers to the C++ store. Somewhere during handing over the value from the asm load to C++ >> store and/or C++ store itself, we garble the value. The way out is to implement the whole thing in asm. Also see >> `StubGenerator::generate_atomic_load_long` and `StubGenerator::generate_atomic_store_long` in `stubGenerator_arm.cpp`, >> that do roughly the same thing and were the basis for this implementation. Attention @theRealAph, @bulasevich. >> >> Testing: >> - [x] ARM32 Linux zero release jcstress run > > src/hotspot/os_cpu/linux_zero/os_linux_zero.hpp line 79: > >> 77: asm volatile ("ldrexd %[tmp_r], [%[src]]\n" >> 78: "clrex\n" >> 79: "1:\n" > > The change is good. > Minor remarks: I don't see reason of tmp_r(_w) naming and I'd prefer meaningful label name. There are five operands, I'd prefer to use symbolic names to avoid confusion. "1:" is the local _numeric_ label, it does not show up in (and potentially conflict with) symbol table. Meaningful names would necessarily be _symbolic_ labels. Does that resolve your concerns? 
------------- PR: https://git.openjdk.java.net/jdk/pull/299 From mdoerr at openjdk.java.net Wed Sep 23 16:07:17 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 23 Sep 2020 16:07:17 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: <12XVpCUkBwseij2afllGJ_aA9kU16-KWEM8d_E_zVho=.38084ba5-cbcb-4c47-aeab-65e96fc31517@github.com> On Wed, 23 Sep 2020 14:32:21 GMT, Coleen Phillimore wrote: > That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so > the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. > Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and > unintentionally install a async exception. > Tested with tier1-6. Hi Coleen, looks like a nice cleanup. Shouldn't we use call_VM_leaf at the call sites? src/hotspot/share/interpreter/interpreterRuntime.cpp line 742: > 740: oop obj = elem->obj(); > 741: assert(!obj->is_unlocked(), "caller checked these conditions"); > 742: assert(Universe::heap()->is_in_or_null(obj), "must be NULL or an object"); obj can't be null at this point ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From bulasevich at openjdk.java.net Wed Sep 23 16:09:53 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Wed, 23 Sep 2020 16:09:53 GMT Subject: RFR: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 16:02:17 GMT, Aleksey Shipilev wrote: >> src/hotspot/os_cpu/linux_zero/os_linux_zero.hpp line 79: >> >>> 77: asm volatile ("ldrexd %[tmp_r], [%[src]]\n" >>> 78: "clrex\n" >>> 79: "1:\n" >> >> The change is good. 
>> Minor remarks: I don't see reason of tmp_r(_w) naming and I'd prefer meaningful label name. > > There are five operands, I'd prefer to use symbolic names to avoid confusion. > > "1:" is the local _numeric_ label, it does not show up in (and potentially conflict with) symbol table. Meaningful > names would necessarily be _symbolic_ labels. > Does that resolve your concerns? good ------------- PR: https://git.openjdk.java.net/jdk/pull/299 From aph at openjdk.java.net Wed Sep 23 16:49:09 2020 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 23 Sep 2020 16:49:09 GMT Subject: RFR: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:12:42 GMT, Aleksey Shipilev wrote: > There is a regression introduced by addition of ARMv7-specific block by JDK-8211387. It readily manifests as crash > during jcstress initialization, and investigation points at broken volatile stores. Reverting JDK-8211387 from head JDK > makes ARM32 start and run jcstress. The underlying reason seems to be the half-done `atomic_copy64`: it does the load > with exclusive load, but then defers to the C++ store. Somewhere during handing over the value from the asm load to C++ > store and/or C++ store itself, we garble the value. The way out is to implement the whole thing in asm. Also see > `StubGenerator::generate_atomic_load_long` and `StubGenerator::generate_atomic_store_long` in `stubGenerator_arm.cpp`, > that do roughly the same thing and were the basis for this implementation. Attention @theRealAph, @bulasevich. > > Testing: > - [x] ARM32 Linux zero release jcstress run OK. It's fugly, but as far as I know there really is no better way to do it. From what I remember (it's been a while) even LDREXD wasn't guaranteed to be atomic unless accompanied by a corresponding STREXD at the same address. However, later versions of the Arm ARM do state that LDREXD is single-copy atomic, so we're good. 
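For readers following the thread, the property under discussion (the 64-bit copy must be one indivisible load followed by one indivisible store, with no plain C++ access in between) can be sketched portably. This is only an illustration and not the code from the patch: the name `atomic_copy64_sketch` is hypothetical, and how a given compiler lowers these operations on ARM32 (for example to LDREXD/STREXD sequences) is an assumption that is not verified here.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not the HotSpot atomic_copy64: copies one 64-bit
// value so that both the read and the write are single-copy atomic.
// Relaxed ordering is used because only indivisibility is illustrated,
// not any ordering with respect to other memory accesses.
inline void atomic_copy64_sketch(const std::atomic<int64_t>* src,
                                 std::atomic<int64_t>* dst) {
  // One indivisible 64-bit load; a concurrent writer can never be
  // observed half-way through its update.
  int64_t tmp = src->load(std::memory_order_relaxed);
  // One indivisible 64-bit store; a concurrent reader sees either the
  // old or the new value, never a mix of the two halves.
  dst->store(tmp, std::memory_order_relaxed);
}
```

The difference from the regression described above is that the store side is also expressed as an atomic operation rather than left to an ordinary C++ assignment.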
------------- Marked as reviewed by aph (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/299 From shade at openjdk.java.net Wed Sep 23 16:55:37 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 23 Sep 2020 16:55:37 GMT Subject: Integrated: 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 14:12:42 GMT, Aleksey Shipilev wrote: > There is a regression introduced by addition of ARMv7-specific block by JDK-8211387. It readily manifests as crash > during jcstress initialization, and investigation points at broken volatile stores. Reverting JDK-8211387 from head JDK > makes ARM32 start and run jcstress. The underlying reason seems to be the half-done `atomic_copy64`: it does the load > with exclusive load, but then defers to the C++ store. Somewhere during handing over the value from the asm load to C++ > store and/or C++ store itself, we garble the value. The way out is to implement the whole thing in asm. Also see > `StubGenerator::generate_atomic_load_long` and `StubGenerator::generate_atomic_store_long` in `stubGenerator_arm.cpp`, > that do roughly the same thing and were the basis for this implementation. Attention @theRealAph, @bulasevich. > > Testing: > - [x] ARM32 Linux zero release jcstress run This pull request has now been integrated. 
Changeset: c21690b5 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/c21690b5 Stats: 17 lines in 1 file changed: 8 ins; 0 del; 9 mod 8253464: ARM32 Zero: atomic_copy64 is incorrect, breaking volatile stores Reviewed-by: aph ------------- PR: https://git.openjdk.java.net/jdk/pull/299 From dcubed at openjdk.java.net Wed Sep 23 17:29:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 17:29:50 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 14:32:21 GMT, Coleen Phillimore wrote: > That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so > the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. > Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and > unintentionally install a async exception. > Tested with tier1-6. Changes requested by dcubed (Reviewer). src/hotspot/share/interpreter/interpreterRuntime.cpp line 746: > 744: if (elem == NULL || h_obj()->is_unlocked()) { > 745: THROW(vmSymbols::java_lang_IllegalMonitorStateException()); > 746: } In the case of an unbalanced monitorexit(), you are losing the throwing of the IllegalMonitorStateException here. At one point, we had at least one test that used JNI MonitorExit() to induce an unbalanced monitorexit() and it verified that the IllegalMonitorStateException was thrown. 
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 17:29:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 17:29:50 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 17:26:16 GMT, Daniel D. Daugherty wrote: >> That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so >> the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. >> Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and >> unintentionally install a async exception. >> Tested with tier1-6. > > Changes requested by dcubed (Reviewer). The review invite says this: > That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller so I also need to go look at the caller contexts. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 17:29:51 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 17:29:51 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <12XVpCUkBwseij2afllGJ_aA9kU16-KWEM8d_E_zVho=.38084ba5-cbcb-4c47-aeab-65e96fc31517@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <12XVpCUkBwseij2afllGJ_aA9kU16-KWEM8d_E_zVho=.38084ba5-cbcb-4c47-aeab-65e96fc31517@github.com> Message-ID: On Wed, 23 Sep 2020 16:03:14 GMT, Martin Doerr wrote: >> That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so >> the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. 
>> Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and >> unintentionally install a async exception. >> Tested with tier1-6. > > src/hotspot/share/interpreter/interpreterRuntime.cpp line 742: > >> 740: oop obj = elem->obj(); >> 741: assert(!obj->is_unlocked(), "caller checked these conditions"); >> 742: assert(Universe::heap()->is_in_or_null(obj), "must be NULL or an object"); > > obj can't be null at this point Agreed. If 'obj == NULL' on L740, the L741 would crash in ASSERT enabled bits. The same is true of the original code if 'obj == NULL': old L744: if (elem == NULL || h_obj()->is_unlocked()) { we would crash on is_unlocked() call. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From pchilanomate at openjdk.java.net Wed Sep 23 18:12:39 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Wed, 23 Sep 2020 18:12:39 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 17:19:25 GMT, Daniel D. Daugherty wrote: >> That the monitor has already been unlocked, or is a null stacklock monitor has been already checked in the caller, so >> the code that makes it a JRT_ENTRY_NO_ASYNC is unnecessary. >> Making it a JRT_LEAF like the compiled method entries makes it safer. We know it can never safepoint and >> unintentionally install a async exception. >> Tested with tier1-6. > > src/hotspot/share/interpreter/interpreterRuntime.cpp line 746: > >> 744: if (elem == NULL || h_obj()->is_unlocked()) { >> 745: THROW(vmSymbols::java_lang_IllegalMonitorStateException()); >> 746: } > > In the case of an unbalanced monitorexit(), you are losing > the throwing of the IllegalMonitorStateException here. 
> At one point, we had at least one test that used JNI MonitorExit() > to induce an unbalanced monitorexit() and it verified that the > IllegalMonitorStateException was thrown. But JNI MonitorExit() calls ObjectSynchronizer::jni_exit() instead and that calls check_owner() which will throw java_lang_IllegalMonitorStateException() if the thread is not the owner. I agree we might still need to throw java_lang_IllegalMonitorStateException() here for the release bits though. Maybe we still need to cover the case of the classfile having wrong bytecodes and there is an extra monitorexit? ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 18:43:39 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 18:43:39 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 18:08:50 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/interpreter/interpreterRuntime.cpp line 746: >> >>> 744: if (elem == NULL || h_obj()->is_unlocked()) { >>> 745: THROW(vmSymbols::java_lang_IllegalMonitorStateException()); >>> 746: } >> >> In the case of an unbalanced monitorexit(), you are losing >> the throwing of the IllegalMonitorStateException here. >> At one point, we had at least one test that used JNI MonitorExit() >> to induce an unbalanced monitorexit() and it verified that the >> IllegalMonitorStateException was thrown. > > But JNI MonitorExit() calls ObjectSynchronizer::jni_exit() instead and that calls check_owner() which will throw > java_lang_IllegalMonitorStateException() if the thread is not the owner. I agree we might still need to throw > java_lang_IllegalMonitorStateException() here for the release bits though. 
Maybe we still need to cover the case of the > classfile having wrong bytecodes and there is an extra monitorexit? The JNI MonitorExit() call will succeed because the calling thread is the owner of the monitor. The subsequent monitorexit byte code should fail because the calling thread is no longer the owner of the monitor. It's a way of simulating an errant situation. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From gziemski at openjdk.java.net Wed Sep 23 19:15:08 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 23 Sep 2020 19:15:08 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 13:11:42 GMT, Gerard Ziemski wrote: > > _Mailing list message from [Doerr, Martin](mailto:martin.doerr at sap.com) on > > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ Hi Gerard, > > sorry for the long delay. It took time to get our nightly tests working again on AIX. > > I have seen an issue, but it may be unrelated to your change. We'll retest it. > > I will take another look at AIX changes to see if I missed something. > > > Note that there's still unused code left in os_aix.cpp (see below). > > Thanks again for taking care of AIX. I appreciate having more shared POSIX code. > > Thank you for the diff, I'll use it and update the webrev. I took another look at the changes and I have concerns over the following methods: PosixSignals::SR_handler PosixSignals::do_suspend PosixSignals::do_resume They seem to differ substantially in parts from the other POSIX platforms. In my early webrevs I have accounted for those changes using "#if defined(AIX)", but during pre-reviews you asked me to revert to the common POSIX code. 
Can you please take a look again and tell me if you are OK with the following SA related changes: - using POSIX semaphores - using the common POSIX code I will upload what I think (based only on the code diffs between the platforms as I lack AIX platform understanding) the AIX platform needs here. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Wed Sep 23 19:20:45 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 23 Sep 2020 19:20:45 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v3] In-Reply-To: References: Message-ID: > hi all, > > Please review this change that refactors common POSIX code into a separate > file. > > Currently there appears to be quite a bit of duplicated code among POSIX > platforms, which makes it difficult to apply a single fix to the signal code. > With this fix, we will only need to touch a single file for common POSIX > code fixes from now on. > > ---------------------------------------------------------------------------- > The APIs which moved from os/bsd/os_bsd.cpp to os/posix/PosixSignals.cpp: > > //////////////////////////////////////////////////////////////////////////////// > // signal support > void os::Bsd::signal_sets_init() > sigset_t* os::Bsd::unblocked_signals() > sigset_t* os::Bsd::vm_signals() > void os::Bsd::hotspot_sigmask(Thread* thread) > //////////////////////////////////////////////////////////////////////////////// > // sun.misc.Signal support > static void UserHandler(int sig, void *siginfo, void *context) > void* os::user_handler() > void* os::signal(int signal_number, void* handler) > void os::signal_raise(int signal_number) > int os::sigexitnum_pd() > static void jdk_misc_signal_init() > void os::signal_notify(int sig) > static int check_pending_signals() > int os::signal_wait() > //////////////////////////////////////////////////////////////////////////////// > // suspend/resume support > static void
resume_clear_context(OSThread *osthread) > static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) > static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) > static int SR_initialize() > static int sr_notify(OSThread* osthread) > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > /////////////////////////////////////////////////////////////////////////////////// > // signal handling (except suspend/resume) > static void signalHandler(int sig, siginfo_t* info, void* uc) > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > static bool call_chained_handler(struct sigaction *actp, int sig, > siginfo_t *siginfo, void *context) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::install_signal_handlers() > static const char* get_signal_handler_name(address handler, > char* buf, int buflen) > static void print_signal_handler(outputStream* st, int sig, > char* buf, size_t buflen) > void os::run_periodic_checks() > void os::Bsd::check_signal_handler(int sig) > > ----------------------------------------------------------------------------- > The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: > > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > int os::Posix::get_signal_number(const char* signal_name) > int os::get_signal_number(const char* signal_name) > bool os::Posix::is_valid_signal(int sig) > bool os::Posix::is_sig_ignored(int sig) > const char* os::exception_name(int sig, char* buf, size_t size) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* 
os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::print_siginfo(outputStream* os, const void* si0) > bool os::signal_thread(Thread* thread, int sig, const char* reason) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > struct sigaction* os::Posix::get_preinstalled_handler(int sig) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > > -------------------------------------------------------- > -------------------------------------------------------- > > DETAILS: > > -------------------------------------------------------- > Public APIs which are now internal static PosixSignals:: > > sigset_t* os::Bsd::vm_signals() > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::check_signal_handler(int sig) > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > bool os::Posix::is_valid_signal(int sig) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > ------------------------------------------------ > Public APIs which moved to public PosixSignals:: > > void os::Bsd::signal_sets_init() > void
os::Bsd::hotspot_sigmask(Thread* thread) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > void os::Bsd::install_signal_handlers() > bool os::Posix::is_sig_ignored(int sig) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > > ---------------------------------------------------- > Internal APIs which are now public in PosixSignals:: > > static void jdk_misc_signal_init() > static int SR_initialize() > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) > > -------------------------- > New APIs in PosixSignals:: > > static bool are_signal_handlers_installed(); Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: Add AIX specific SA code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/157/files - new: https://git.openjdk.java.net/jdk/pull/157/files/f3e3dd85..cc13700d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=01-02 Stats: 70 lines in 1 file changed: 69 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157 PR: https://git.openjdk.java.net/jdk/pull/157 From coleenp at openjdk.java.net Wed Sep 23 19:22:50 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 19:22:50 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 17:27:08 GMT, Daniel D. Daugherty wrote: >> Changes requested by dcubed (Reviewer). 
> > The review invite says this: > >> That the monitor has already been unlocked, or is a null stacklock >> monitor has been already checked in the caller > > so I also need to go look at the caller contexts. > > Update: So I went and took a look and very quickly found this: > > src/hotspot/cpu/x86/interp_masm_x86.cpp: > > void InterpreterMacroAssembler::unlock_object(Register lock_reg) { > assert(lock_reg == LP64_ONLY(c_rarg1) NOT_LP64(rdx), > "The argument is only for looks. It must be c_rarg1"); > > if (UseHeavyMonitors) { > call_VM(noreg, > CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorexit), > lock_reg); > } else { > > So when -XX:+UseHeavyMonitors is used, we make a direct call > to InterpreterRuntime::monitorexit() without checking the lock_reg > parameter at all. Read the caller of this function. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 19:22:50 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 19:22:50 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: On Wed, 23 Sep 2020 19:16:45 GMT, Coleen Phillimore wrote: >> The review invite says this: >> >>> That the monitor has already been unlocked, or is a null stacklock >>> monitor has been already checked in the caller >> >> so I also need to go look at the caller contexts. >> >> Update: So I went and took a look and very quickly found this: >> >> src/hotspot/cpu/x86/interp_masm_x86.cpp: >> >> void InterpreterMacroAssembler::unlock_object(Register lock_reg) { >> assert(lock_reg == LP64_ONLY(c_rarg1) NOT_LP64(rdx), >> "The argument is only for looks. 
It must be c_rarg1"); >> >> if (UseHeavyMonitors) { >> call_VM(noreg, >> CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorexit), >> lock_reg); >> } else { >> >> So when -XX:+UseHeavyMonitors is used, we make a direct call >> to InterpreterRuntime::monitorexit() without checking the lock_reg >> parameter at all. > > Read the caller of this function. Here's the else branch (!UseHeavyMonitors) in InterpreterMacroAssembler::unlock_object: src/hotspot/cpu/x86/interp_masm_x86.cpp: 1287 } else { 1288 Label done; 1289 1290 const Register swap_reg = rax; // Must use rax for cmpxchg instruction 1291 const Register header_reg = LP64_ONLY(c_rarg2) NOT_LP64(rbx); // Will contain the old oopMark 1292 const Register obj_reg = LP64_ONLY(c_rarg3) NOT_LP64(rcx); // Will contain the oop 1293 1294 save_bcp(); // Save in case of exception 1295 1296 // Convert from BasicObjectLock structure to object and BasicLock 1297 // structure Store the BasicLock address into %rax 1298 lea(swap_reg, Address(lock_reg, BasicObjectLock::lock_offset_in_bytes())); 1299 1300 // Load oop into obj_reg(%c_rarg3) 1301 movptr(obj_reg, Address(lock_reg, BasicObjectLock::obj_offset_in_bytes())); 1302 1303 // Free entry 1304 movptr(Address(lock_reg, BasicObjectLock::obj_offset_in_bytes()), (int32_t)NULL_WORD); 1305 1306 if (UseBiasedLocking) { 1307 biased_locking_exit(obj_reg, header_reg, done); 1308 } 1309 1310 // Load the old header from BasicLock structure 1311 movptr(header_reg, Address(swap_reg, 1312 BasicLock::displaced_header_offset_in_bytes())); 1313 1314 // Test for recursion 1315 testptr(header_reg, header_reg); 1316 1317 // zero for recursive case 1318 jcc(Assembler::zero, done); 1319 1320 // Atomic swap back the old header 1321 lock(); 1322 cmpxchgptr(header_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes())); 1323 1324 // zero for simple unlock of a stack-lock case 1325 jcc(Assembler::zero, done); 1326 1327 // Call the runtime routine for slow case. 
1328 movptr(Address(lock_reg, BasicObjectLock::obj_offset_in_bytes()), 1329 obj_reg); // restore obj 1330 call_VM(noreg, 1331 CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorexit), 1332 lock_reg); 1333 1334 bind(done); 1335 1336 restore_bcp(); 1337 } If the BasicLock passed in via lock_reg is NULL, then this code will crash on L1298 (I think) before we get to InterpreterRuntime::monitorexit(). However, I haven't checked the other code paths yet. What I did notice is that this code path does not check for the monitor being owned by the calling thread so the deletion of this block from InterpreterRuntime::monitorexit(): if (elem == NULL || h_obj()->is_unlocked()) { THROW(vmSymbols::java_lang_IllegalMonitorStateException()); } is a problem. While it might be true that the "elem == NULL" part cannot happen, the "h_obj()->is_unlocked()" can definitely happen and this change results in the IllegalMonitorStateException not being thrown. I think this change isn't going to work. > Read the caller of this function. I have and I have even started quoting caller code. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From mdoerr at openjdk.java.net Wed Sep 23 19:30:36 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 23 Sep 2020 19:30:36 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> Message-ID: <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com> On Wed, 23 Sep 2020 19:19:24 GMT, Daniel D. Daugherty wrote: >> Read the caller of this function. > >> Read the caller of this function. > > I have and I have even started quoting caller code. Right, looks like the UseHeavyMonitors case in unlock_object misses the "Free entry" part. 
I believe this is how it's supposed to work: - TemplateTable::monitorexit only uses unlock_object if the obj is not null and it is found in the monitor section on stack. - unlock_object should clear the obj field on stack ("Free entry" on x86) - when calling monitorexit for the same unlocked obj, TemplateTable::monitorexit shouldn't find it on stack any more and throw the exception ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From coleenp at openjdk.java.net Wed Sep 23 19:43:41 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 19:43:41 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com> Message-ID: On Wed, 23 Sep 2020 19:27:51 GMT, Martin Doerr wrote: >>> Read the caller of this function. >> >> I have and I have even started quoting caller code. > > Right, looks like the UseHeavyMonitors case in unlock_object misses the "Free entry" part. > > I believe this is how it's supposed to work: > - TemplateTable::monitorexit only uses unlock_object if the obj is not null and it is found in the monitor section on > stack. > - unlock_object should clear the obj field on stack ("Free entry" on x86) > - when calling monitorexit for the same unlocked obj, TemplateTable::monitorexit shouldn't find it on stack any more and > throw the exception > > Ah, we have "elem->set_obj(NULL);" in InterpreterRuntime::monitorexit. So this should also cover the UseHeavyMonitors > case. The callers of unlock_object all check that the object is non-null and not already unlocked, otherwise they throw NSME. I haven't checked the non-x86 platforms recently to verify that, but that's how it should work. 
This should really be refactored so that it's easier to tell. But this shouldn't throw IllegalMonitorStateException inside of the InterpreterRuntime function.

-------------

PR: https://git.openjdk.java.net/jdk/pull/320

From dcubed at openjdk.java.net Wed Sep 23 20:07:55 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Wed, 23 Sep 2020 20:07:55 GMT
Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
In-Reply-To:
References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com>
Message-ID: <9hN1uVLXiL11dGtg6_Q3sCyU94DXg6s_Jx7iYeLwc2w=.593841f1-7a02-42c0-88a8-eb6fa916310a@github.com>

On Wed, 23 Sep 2020 19:40:02 GMT, Coleen Phillimore wrote:

>> Right, looks like the UseHeavyMonitors case in unlock_object misses the "Free entry" part.
>>
>> I believe this is how it's supposed to work:
>> - TemplateTable::monitorexit only uses unlock_object if the obj is not null and it is found in the monitor section on
>> stack.
>> - unlock_object should clear the obj field on stack ("Free entry" on x86)
>> - when calling monitorexit for the same unlocked obj, TemplateTable::monitorexit shouldn't find it on stack any more and
>> throw the exception
>>
>> Ah, we have "elem->set_obj(NULL);" in InterpreterRuntime::monitorexit. So this should also cover the UseHeavyMonitors
>> case.
>
> The callers of unlock_object all check that the object is non-null and not already unlocked, otherwise they throw
> IllegalMonitorStateException. I haven't checked the non-x86 platforms recently to verify that, but that's how it should work. This should
> really be refactored so that it's easier to tell. But this shouldn't throw IllegalMonitorStateException inside of the InterpreterRuntime
> function.
I went through all the callers of InterpreterMacroAssembler::unlock_object() and it does look like all caller paths are properly protected by an ownership check and a throwing of IllegalMonitorStateException. InterpreterMacroAssembler::unlock_object() isn't the only caller of InterpreterRuntime::monitorexit(). It looks like it is also used here: src/hotspot/cpu/zero/zeroInterpreter_zero.cpp src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp At least one of the code paths in bytecodeInterpreter.cpp has a (sort of) proper pre-check for IllegalMonitorStateException before the call to InterpreterRuntime::monitorexit(). But at least two of the code paths call InterpreterRuntime::monitorexit() and check for throwing IllegalMonitorStateException afterwards which totally boggles my mind. ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From coleenp at openjdk.java.net Wed Sep 23 20:19:36 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 20:19:36 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <86XUy7L61xxgpXo7p95CX02jc0gBIP5V_dZrcFQtD8g=.661acf40-a115-46d2-b02a-1b8363faa988@github.com> Message-ID: On Wed, 23 Sep 2020 19:27:51 GMT, Martin Doerr wrote: > Ah, we have "elem->set_obj(NULL);" in InterpreterRuntime::monitorexit. So this should also cover the UseHeavyMonitors > case. 
I found out the hard way that this line is needed :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/320

From pchilanomate at openjdk.java.net Wed Sep 23 20:19:36 2020
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Wed, 23 Sep 2020 20:19:36 GMT
Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
In-Reply-To:
References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com>
Message-ID: <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com>

On Wed, 23 Sep 2020 18:41:04 GMT, Daniel D. Daugherty wrote:

>> But JNI MonitorExit() calls ObjectSynchronizer::jni_exit() instead and that calls check_owner() which will throw
>> java_lang_IllegalMonitorStateException() if the thread is not the owner. I agree we might still need to throw
>> java_lang_IllegalMonitorStateException() here for the release bits though. Maybe we still need to cover the case of the
>> classfile having wrong bytecodes and there is an extra monitorexit?
>
> The JNI MonitorExit() call will succeed because the calling thread is
> the owner of the monitor. The subsequent monitorexit byte code
> should fail because the calling thread is no longer the owner of
> the monitor. It's a way of simulating an errant situation.

Ok, so the scenario would be:
1) Thread calls monitorenter bytecode
2) Thread calls jni_enter() * n and jni_exit() * (n+1) (extra unlock). Also with JNI we always inflate the lock.
3a) If no deflation happens and the thread calls monitorexit bytecode then we hit the assert in https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/objectMonitor.cpp#L1047
3b) If the monitor was deflated then on monitorexit is_unlocked() returns true and we throw java_lang_IllegalMonitorStateException. (We would change this case then).

Is that correct?
-------------

PR: https://git.openjdk.java.net/jdk/pull/320

From dcubed at openjdk.java.net Wed Sep 23 20:19:36 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Wed, 23 Sep 2020 20:19:36 GMT
Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
In-Reply-To: <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com>
References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com>
Message-ID:

On Wed, 23 Sep 2020 20:15:13 GMT, Patricio Chilano Mateo wrote:

>> The JNI MonitorExit() call will succeed because the calling thread is
>> the owner of the monitor. The subsequent monitorexit byte code
>> should fail because the calling thread is no longer the owner of
>> the monitor. It's a way of simulating an errant situation.
>
> Ok, so the scenario would be:
> 1) Thread calls monitorenter bytecode
> 2) Thread calls jni_enter() * n and jni_exit() * (n+1) (extra unlock). Also with JNI we always inflate the lock.
> 3a) If no deflation happens and the thread calls monitorexit bytecode then we hit the assert in
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/objectMonitor.cpp#L1047
> 3b) If the monitor was
> deflated then on monitorexit is_unlocked() returns true and we throw java_lang_IllegalMonitorStateException. (We would
> change this case then). Is that correct?

No jni_enter(). Just jni_exit(). We are trying to provoke an unbalanced monitorexit() here.
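
A toy model of that unbalanced sequence may help when reading the rest of the thread: one monitorenter bytecode, one JNI MonitorExit(), then the monitorexit bytecode. This is only a sketch under loud assumptions: ToyMonitor, the integer thread ids and replay_unbalanced_exit() are made up for illustration and are not HotSpot code, and a `false` return merely stands in for the IllegalMonitorStateException that an ownership check would be expected to raise.

```cpp
#include <cassert>

// Toy model only: none of these names exist in HotSpot.
struct ToyMonitor {
  int owner = 0;       // 0 means unowned; otherwise a made-up thread id
  int recursions = 0;  // enters by the owner that have not been exited yet

  void enter(int thread) {
    // The toy ignores contention: only the owner or a thread finding the
    // monitor free is allowed to enter.
    assert(owner == 0 || owner == thread);
    owner = thread;
    ++recursions;
  }

  // Returns false where an ownership check would be expected to fail,
  // i.e. the caller does not own the monitor.
  bool exit(int thread) {
    if (owner != thread) return false;
    if (--recursions == 0) owner = 0;
    return true;
  }
};

// Replays the sequence and returns the result of the final, unbalanced exit.
bool replay_unbalanced_exit() {
  const int t = 1;   // hypothetical thread id
  ToyMonitor m;
  m.enter(t);        // monitorenter bytecode
  m.exit(t);         // JNI MonitorExit(): succeeds, the caller is the owner
  return m.exit(t);  // monitorexit bytecode: the caller is no longer the owner
}
```

In this model the last call is the one that has nowhere to go once the ownership check is removed from the runtime function, which is the case being debated here.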
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From coleenp at openjdk.java.net Wed Sep 23 20:48:38 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 20:48:38 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <12XVpCUkBwseij2afllGJ_aA9kU16-KWEM8d_E_zVho=.38084ba5-cbcb-4c47-aeab-65e96fc31517@github.com> Message-ID: <_YKJ1FvH3daQoM-mP-_Qxw0sV0M6cPNyhxmgYUj7Vgg=.6ed78236-0c89-4a02-9065-20fed58e7b2c@github.com> On Wed, 23 Sep 2020 17:13:45 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/interpreter/interpreterRuntime.cpp line 742: >> >>> 740: oop obj = elem->obj(); >>> 741: assert(!obj->is_unlocked(), "caller checked these conditions"); >>> 742: assert(Universe::heap()->is_in_or_null(obj), "must be NULL or an object"); >> >> obj can't be null at this point > > Agreed. If 'obj == NULL' on L740, the L741 would crash in ASSERT enabled bits. > The same is true of the original code if 'obj == NULL': > old L744: if (elem == NULL || h_obj()->is_unlocked()) { > we would crash on is_unlocked() call. Yes should change assert to Universe::heap()->is_in(obj). ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From iklam at openjdk.java.net Wed Sep 23 21:03:51 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 23 Sep 2020 21:03:51 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes [v3] In-Reply-To: References: Message-ID: <9eJUbWO2MciKJ0ew3DM8dU6Jgzgr0BBBoRM_vIuwrzk=.5159bda7-776f-49d5-b779-f20b2cbf4058@github.com> > We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` > too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been > removed. 
The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call > `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to > do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other > subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in > dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS > static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case > where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. > ---- > > We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, > but we might have to do more fixes in the future. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes - restore assert for shared class - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes - 8251261: CDS dumping should not clear states in live classes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/227/files - new: https://git.openjdk.java.net/jdk/pull/227/files/03f74855..a6029d97 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=227&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=227&range=01-02 Stats: 4627 lines in 219 files changed: 1969 ins; 2132 del; 526 mod Patch: https://git.openjdk.java.net/jdk/pull/227.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/227/head:pull/227 PR: https://git.openjdk.java.net/jdk/pull/227 From mdoerr at openjdk.java.net Wed Sep 23 21:03:36 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 23 Sep 2020 21:03:36 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 19:12:46 GMT, Gerard Ziemski wrote: >>> _Mailing list message from [Doerr, Martin](mailto:martin.doerr at sap.com) on >>> [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ >>> Hi Gerard, >>> >>> sorry for the long delay. It took time to get our nightly tests working again on AIX. >>> I have seen an issue, but it may be unrelated to your change. We'll retest it. >> >> I will take another look at AIX changes to see if I missed something. >> >> >>> Note that there's still unused code left in os_aix.cpp (see below). >>> Thanks again for taking care of AIX. I appreciate having more shared POSIX code. >> >> Thank you for the diff, I'll use it and update the webrev. 
> >> > _Mailing list message from [Doerr, Martin](mailto:martin.doerr at sap.com) on >> > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ Hi Gerard, >> > sorry for the long delay. It took time to get our nightly tests working again on AIX. >> > I have seen an issue, but it may be unrelated to your change. We'll retest it. >> >> I will take another look at AIX changes to see if I missed something. > > I took another look at the changes and I have concerns over the following methods: > > PosixSignals::SR_handler > PosixSignals::do_suspend > PosixSignals::do_resume > > They seem to differ substantially in parts from the other POSIX platforms. In my early webrevs I have accounted for > those changes using "#if defined(AIX)", but during pre-reviews you asked me to revert to the common POSIX code. > Can you please take a look again and tell me if you are OK with the following SA related changes: > > - using POSIX semaphores > - using the common POSIX code > > I will upload, what I think (based only on the code diffs between the platforms as I lack AIX platform understanding) > the AIX platform needs here. I believe the reason for not using POSIX semaphores was that it is not supported on as400 PASE which we don't support in OpenJDK. I'm not aware of any problems when using this common POSIX code on AIX 7.2. Is SA supported on AIX? That would be new to me. But I'm not an expert for these topics. I hope that Thomas can find some time to take a look. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/157

From coleenp at openjdk.java.net Wed Sep 23 21:25:20 2020
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Sep 2020 21:25:20 GMT
Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function
In-Reply-To:
References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com>
Message-ID:

On Wed, 23 Sep 2020 20:16:53 GMT, Daniel D. Daugherty wrote:

>> Ok, so the scenario would be:
>> 1) Thread calls monitorenter bytecode
>> 2) Thread calls jni_enter() * n and jni_exit() * (n+1) (extra unlock). Also with JNI we always inflate the lock.
>> 3a) If no deflation happens and the thread calls monitorexit bytecode then we hit the assert in
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/objectMonitor.cpp#L1047
>> 3b) If the monitor was
>> deflated then on monitorexit is_unlocked() returns true and we throw java_lang_IllegalMonitorStateException. (We would
>> change this case then). Is that correct?
>
> No jni_enter(). Just jni_exit(). We are trying to provoke an unbalanced
> monitorexit() here.

Yes, it looks like you can break this with JNI code with an extra jni_exit() for the object, and then call the monitorexit from the interpreter. This interpreter change is to be consistent with the compiled code that calls via JRT_LEAF, so compiled code would have the same problem. I can add a fatal to this like:

fatal(obj->is_locked(), "Native code may have unlocked this object");

fatal() would assert in product mode, and errant native code can expect crashes even in product mode. elem would never be null since it's allocated in the caller.
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From coleenp at openjdk.java.net Wed Sep 23 21:33:00 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 23 Sep 2020 21:33:00 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com> Message-ID: <8_8hkSO3rgGn5Mx3cxVrN_D90UsFH1tokl5g4Dn-H_E=.5df6f931-d73d-468c-82aa-9f3662e78e53@github.com> On Wed, 23 Sep 2020 21:28:04 GMT, Daniel D. Daugherty wrote: >> Yes, it looks like you can break this with JNI code with and extra jni_exit() for the object, and then call the >> monitorexit from the interpreter. This interpreter change is to be consistent with the compiled code that calls via >> JRT_LEAF, so compiled code would have the same problem. I can add a fatal to this like: fatal(obj->is_locked(), "Native >> code may have unlocked this object"); fatal() would assert in product mode, and errant native code can expect crashes >> even in product mode. elem would never be null since it's allocated in the caller. > > I think you mean assert(obj->is_locked(), "this object must be locked") since fatal() doesn't take a bool... > Maybe we still need to cover the case of the classfile having wrong bytecodes and there is an extra monitorexit? The checks before the calls to unlock_object() handle this case. 
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 21:33:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 21:33:00 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com> Message-ID: On Wed, 23 Sep 2020 21:22:23 GMT, Coleen Phillimore wrote: >> No jni_enter(). Just jni_exit(). We are trying to provoke an unbalanced >> monitorexit() here. > > Yes, it looks like you can break this with JNI code with and extra jni_exit() for the object, and then call the > monitorexit from the interpreter. This interpreter change is to be consistent with the compiled code that calls via > JRT_LEAF, so compiled code would have the same problem. I can add a fatal to this like: fatal(obj->is_locked(), "Native > code may have unlocked this object"); fatal() would assert in product mode, and errant native code can expect crashes > even in product mode. elem would never be null since it's allocated in the caller. I think you mean assert(obj->is_locked(), "this object must be locked") since fatal() doesn't take a bool... 
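
To make the distinction concrete, here is a self-contained sketch of a debug-only check next to an always-on one. The TOY_* macros, toy_report() and count_fired_checks() are hypothetical stand-ins, not the HotSpot macros. In HotSpot the corresponding macros are assert() (debug builds only), guarantee() (checked in product builds too) and fatal() (a message only, no condition). The sketch counts failures instead of aborting so that it can actually be run.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins for illustration; these are not the HotSpot macros.
static int toy_failures = 0;

// Records a failed check. A real VM would abort here instead of counting.
void toy_report(const char* msg) {
  ++toy_failures;
  std::fprintf(stderr, "check failed: %s\n", msg);
}

// Debug-only check: compiles to nothing unless TOY_DEBUG is defined.
#ifdef TOY_DEBUG
#define TOY_ASSERT(cond, msg) do { if (!(cond)) toy_report(msg); } while (0)
#else
#define TOY_ASSERT(cond, msg) do { } while (0)
#endif

// Always-on check: the condition is evaluated in every build flavour.
#define TOY_GUARANTEE(cond, msg) do { if (!(cond)) toy_report(msg); } while (0)

// Unconditional failure: takes only a message, so the caller tests first.
#define TOY_FATAL(msg) toy_report(msg)

// Runs each kind of check for an object in the given lock state and returns
// how many of them fired.
int count_fired_checks(bool is_locked) {
  toy_failures = 0;
  TOY_ASSERT(is_locked, "this object must be locked");
  TOY_GUARANTEE(is_locked, "this object must be locked");
  if (!is_locked) TOY_FATAL("Native code may have unlocked this object");
  return toy_failures;
}
```

Built without TOY_DEBUG, the debug-only check vanishes, so only the always-on check and the explicit failure path can catch an object that errant native code has already unlocked.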
------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 21:33:00 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 21:33:00 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: <8_8hkSO3rgGn5Mx3cxVrN_D90UsFH1tokl5g4Dn-H_E=.5df6f931-d73d-468c-82aa-9f3662e78e53@github.com> References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com> <8_8hkSO3rgGn5Mx3cxVrN_D90UsFH1tokl5g4Dn-H_E=.5df6f931-d73d-468c-82aa-9f3662e78e53@github.com> Message-ID: On Wed, 23 Sep 2020 21:29:07 GMT, Coleen Phillimore wrote: >> I think you mean assert(obj->is_locked(), "this object must be locked") since fatal() doesn't take a bool... > >> Maybe we still need to cover the case of the classfile having wrong bytecodes and there is an extra monitorexit? > The checks before the calls to unlock_object() handle this case. Actually, I think it has to be: assert(!obj->is_unlocked(), "this object must be locked") ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From dcubed at openjdk.java.net Wed Sep 23 21:41:23 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 23 Sep 2020 21:41:23 GMT Subject: RFR: 8253540: InterpreterRuntime::monitorexit should be a JRT_LEAF function In-Reply-To: References: <5nU1CYJdKOywpJVw3s_c5vTIOAm0ojFcUc_WxRbvFfM=.ddc234fd-01be-439b-a2dd-2cf33c4ec7e8@github.com> <3aWZDJX4OXhpPvOXaWP3wA0SPxJGSslQZXl6T_N2riY=.3d6025a6-cf37-4b88-abd8-2b22426bd210@github.com> <8_8hkSO3rgGn5Mx3cxVrN_D90UsFH1tokl5g4Dn-H_E=.5df6f931-d73d-468c-82aa-9f3662e78e53@github.com> Message-ID: <83qkysm3yGgirk6gOgaO-XbWqW_DFddXtpggcUVZTJE=.9c1a6a13-d224-4448-b213-484cc0a1cdf7@github.com> On Wed, 23 Sep 2020 21:30:07 GMT, Daniel D. 
Daugherty wrote: >>> Maybe we still need to cover the case of the classfile having wrong bytecodes and there is an extra monitorexit? >> The checks before the calls to unlock_object() handle this case. > > Actually, I think it has to be: assert(!obj->is_unlocked(), "this object must be locked") Update: looks like oop has both is_locked() and is_unlocked()... ------------- PR: https://git.openjdk.java.net/jdk/pull/320 From ccheung at openjdk.java.net Wed Sep 23 21:58:52 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 23 Sep 2020 21:58:52 GMT Subject: RFR: 8251261: CDS dumping should not clear states in live classes [v3] In-Reply-To: <9eJUbWO2MciKJ0ew3DM8dU6Jgzgr0BBBoRM_vIuwrzk=.5159bda7-776f-49d5-b779-f20b2cbf4058@github.com> References: <9eJUbWO2MciKJ0ew3DM8dU6Jgzgr0BBBoRM_vIuwrzk=.5159bda7-776f-49d5-b779-f20b2cbf4058@github.com> Message-ID: <-P0gxnwVlE0o4WVDbYvCx14ifzRkE3RzTiW84197F70=.51d31d53-010c-4b38-a211-b9550407d6f6@github.com> On Wed, 23 Sep 2020 21:03:51 GMT, Ioi Lam wrote: >> We had an issue when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` >> too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror has been >> removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call >> `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to >> do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other >> subsystems. It turns out that we were already doing this for the dynamic archive. So I just generalized the code in >> dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS >> static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). 
I also fixed another case
>> where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`.
>> ----
>>
>> We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now,
>> but we might have to do more fixes in the future.
>
> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes
> the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last
> revision:
> - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes
> - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes
> - restore assert for shared class
> - Merge branch 'master' into 8251261-cds-shouldnt-clear-states-of-live-classes
> - 8251261: CDS dumping should not clear states in live classes

Marked as reviewed by ccheung (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/227

From minqi at openjdk.java.net Wed Sep 23 23:20:11 2020
From: minqi at openjdk.java.net (Yumin Qi)
Date: Wed, 23 Sep 2020 23:20:11 GMT
Subject: RFR: 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class
Message-ID:

This patch is a REDO of JDK-8253208, which was backed out because it caused runtime/cds/DeterministicDump.java to fail (see JDK-8253495). Since the failure is a separate issue that is only triggered by this patch, the test case is now on ProblemList.txt. The real root cause of the failure is detailed in JDK-8253495. When JDK-8253208 was backed out, CDS.java was left in place rather than removed (see the JDK-8253495 subtask), so this patch does not include CDS.java (this is why you will see CDS.c without CDS.java in this patch).

Tests tier1-4 passed.
------------- Commit messages: - 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class Changes: https://git.openjdk.java.net/jdk/pull/327/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=327&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253500 Stats: 156 lines in 22 files changed: 57 ins; 53 del; 46 mod Patch: https://git.openjdk.java.net/jdk/pull/327.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/327/head:pull/327 PR: https://git.openjdk.java.net/jdk/pull/327 From mchung at openjdk.java.net Thu Sep 24 00:20:24 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Thu, 24 Sep 2020 00:20:24 GMT Subject: RFR: 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 23:05:59 GMT, Yumin Qi wrote: > This patch is a REDO for JDK-8253208 which was backed out since it caused runtime/cds/DeterministicDump.java failed, > see JDK-8253495. Since the failure is another issue and only triggered by this patch, the test case now is put on > ProblemList.txt. The real root cause for the failure is detailed in JDK-8253495. When JDK-8253208 was backed out, > CDS.java remained without removed from that patch (see JDK-8253495 subtask), so this patch does not include CDS.java > (this is why you will see CDS.c without CDS.java in this patch). Tests tier1-4 passed. Marked as reviewed by mchung (Reviewer). 
-------------

PR: https://git.openjdk.java.net/jdk/pull/327

From mchung at openjdk.java.net Thu Sep 24 00:26:11 2020
From: mchung at openjdk.java.net (Mandy Chung)
Date: Thu, 24 Sep 2020 00:26:11 GMT
Subject: RFR: 8246774: Record Classes (final) implementation
In-Reply-To:
References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com>
Message-ID:

On Tue, 22 Sep 2020 09:49:12 GMT, Chris Hegarty wrote:

>> note: I have removed from the original patch the code related to javax.lang.model, I will publish them in a separate PR
>
> @vicente-romero-oracle I noticed that we can also remove the preview args from the record serialization tests and
> ObjectMethodsTest. I opened a PR against the branch in your fork. You should be able to just merge in the changes. See
> https://github.com/vicente-romero-oracle/jdk/pull/1

What is the policy for the `@since` release value when a preview API becomes final? I would expect `@since` to be updated from 14 to 16, because 16 is the Java SE release in which these APIs are added.

-------------

PR: https://git.openjdk.java.net/jdk/pull/290

From alex.buckley at oracle.com Thu Sep 24 00:40:11 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Wed, 23 Sep 2020 17:40:11 -0700
Subject: RFR: 8246774: Record Classes (final) implementation
In-Reply-To:
References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com>
Message-ID: <495cc1e9-41ad-3f9a-c91d-93b0d0d6c3f9@oracle.com>

On 9/23/2020 5:26 PM, Mandy Chung wrote:
> What is the policy for the `@since` release value when a preview API
> becomes final? I would expect `@since` to be updated from 14
> to 16, because 16 is the Java SE release in which these APIs are added.

Yes. Per http://openjdk.java.net/jeps/12#Specifications-of-Preview-Features :

In particular, all elements of a preview API must have the following tags: ... An @since tag that indicates the release when [the API element] was first added.
*If the API element is eventually made final and permanent in Java SE $N, then the @since tag must be changed to indicate the $N release (the element's history prior to $N is not of long-term interest).* Alex From mchung at openjdk.java.net Thu Sep 24 00:55:22 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Thu, 24 Sep 2020 00:55:22 GMT Subject: RFR: 8246774: Record Classes (final) implementation In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Thu, 24 Sep 2020 00:23:13 GMT, Mandy Chung wrote: >> @vicente-romero-oracle I noticed that we can also remove the preview args from the record serialization tests and >> ObjectMethodsTest. I opened a PR against the branch in your fork. You should be able to just merge in the changes. See >> https://github.com/vicente-romero-oracle/jdk/pull/1 > > What is the policy of `@since` release value when a preview API becomes final. I would expect `@since` should be > updated from 14 to 16 because 16 is the Java SE release these APIs are added?? Thanks Alex. @vicente-romero-oracle `@since` needs to be changed. ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From iklam at openjdk.java.net Thu Sep 24 02:03:57 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 24 Sep 2020 02:03:57 GMT Subject: RFR: 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 23:05:59 GMT, Yumin Qi wrote: > This patch is a REDO for JDK-8253208 which was backed out since it caused runtime/cds/DeterministicDump.java failed, > see JDK-8253495. Since the failure is another issue and only triggered by this patch, the test case now is put on > ProblemList.txt. The real root cause for the failure is detailed in JDK-8253495. 
When JDK-8253208 was backed out, > CDS.java remained without removed from that patch (see JDK-8253495 subtask), so this patch does not include CDS.java > (this is why you will see CDS.c without CDS.java in this patch). Tests tier1-4 passed. Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/327 From dholmes at openjdk.java.net Thu Sep 24 02:33:00 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 24 Sep 2020 02:33:00 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted [v2] In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 11:18:21 GMT, Thomas Schatzl wrote: >> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev >> excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since >> the last revision: >> - Make LogTag::_name const >> - Ensure LogTag types are sorted > > Marked as reviewed by tschatzl (Reviewer). @cl4es Please do not force push updates to a PR. There is no way for reviewers to see what, if anything, differs between the final commit and the commit that they reviewed. If you want to merge with mainline/master then just merge and push the new commit to your personal fork. The commits will be flattened as part of the actual integration. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/274 From david.holmes at oracle.com Thu Sep 24 04:24:57 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Sep 2020 14:24:57 +1000 Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: <9c56a19a-0b3f-929b-d652-e9b77f03ed33@oracle.com> On 23/09/2020 7:37 pm, Robbin Ehn wrote: > On Wed, 23 Sep 2020 02:56:00 GMT, David Holmes wrote: > >>> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Update after Coleen >> >> src/hotspot/share/runtime/handshake.hpp line 97: >> >>> 95: } >>> 96: bool block_for_operation() { >>> 97: return !_queue.is_empty() || _lock.is_locked(); >> >> I really don't understand the is_locked() check in this condition. ?? >> And the check for !empty is racy, so how do we avoid missing an in-progress addition? > > A JavaThread is blocked. > A second thread have just executed a handshake operation for this JavaThread and are on line: > https://github.com/openjdk/jdk/blob/cd784a751a3153939b9284898f370160124ca610/src/hotspot/share/runtime/handshake.cpp#L510 > And the queue is empty. > > The JavaThread wakes up and changes state from blocked to blocked_trans. > It now checks if it's allowed to complete the transition to e.g. vm. > > If a third thread adds to queue before the second thread leaves the loop it's operation can be executed. > But the JavaThread could see the queue as empty. (racey as you say) > > The executor takes lock and then checks if the JavaThread is safe for processing. > The JavaThread becomes unsafe and then check if lock is locked. > If the lock is locked we must take slow path to avoid this. > > We should also take slow path if there is something on queue to processes. > We are unsafe when we check queue and lock is not held, if we 'miss' that anything is on queue, it's fine. > Since any other thread cannot have seen us as safe and seen the item on queue. 
(since lock is not held) > Thus not allowed to process the operation. Wow! That all definitely needs some detailed commentary. Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/151 > From david.holmes at oracle.com Thu Sep 24 04:28:20 2020 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Sep 2020 14:28:20 +1000 Subject: RFR: 8238761: Asynchronous handshakes [v5] In-Reply-To: References: Message-ID: On 23/09/2020 8:11 pm, Robbin Ehn wrote: > On Wed, 23 Sep 2020 03:04:39 GMT, David Holmes wrote: > >>> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Update after Coleen >> >> src/hotspot/share/runtime/handshake.cpp line 63: >> >>> 61: }; >>> 62: >>> 63: class AsyncHandshakeOperation : public HandshakeOperation { >> >> This doesn't quite make sense. If you have an AsyncHandshakeOperation as a distinct subclass then it should not be >> possible for is_async() on a HandshakeOperation to return true - but it can because it can be passed an >> AsyncHandshakeClosure when constructed. If you want async and non-async operations to be distinct types then you will >> need to restrict how the base class is constructed, and provide a protected constructor that just takes an >> AsyncHandshakeClosure. > > This implementation code is not part of the interface. I find it hard to tell which classes form which. > By casting the AsyncHandshakeClosure to a HandshakeClosure before instantiating the HandshakeOperation you can still > get is_async() to return true. And there are loads of other user errors which can be made, like stack allocating > AsyncHandshakeOperation. Protecting against all those kinds of errors requires a lot more code. Can we at least declare a protected constructor for HandshakeOperation that takes the AsyncHandshakeClosure, so that an accidental creation of "new HandshakeOperation(anAsyncClosure)" will be prevented?
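To make the request concrete, here is a minimal sketch of the protected-constructor idea. The classes below are simplified stand-ins that reuse the names from the thread; their members and the exact shape of the constructors are assumptions for illustration, not the code in handshake.cpp.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the HotSpot types; only what the sketch needs.
class HandshakeClosure {
 public:
  virtual ~HandshakeClosure() = default;
  virtual bool is_async() const { return false; }
};

class AsyncHandshakeClosure : public HandshakeClosure {
 public:
  bool is_async() const override { return true; }
};

class HandshakeOperation {
 public:
  explicit HandshakeOperation(HandshakeClosure* cl) : _handshake_cl(cl) {
    assert(cl != nullptr);
  }
  virtual ~HandshakeOperation() = default;
  bool is_async() const { return _handshake_cl->is_async(); }

 protected:
  // Exact match for AsyncHandshakeClosure* arguments, so overload resolution
  // picks this constructor, and the access check then rejects outside callers.
  explicit HandshakeOperation(AsyncHandshakeClosure* cl) : _handshake_cl(cl) {
    assert(cl != nullptr);
  }

 private:
  HandshakeClosure* _handshake_cl;
};

class AsyncHandshakeOperation : public HandshakeOperation {
 public:
  explicit AsyncHandshakeOperation(AsyncHandshakeClosure* cl)
      : HandshakeOperation(cl) {}  // allowed: subclass may use the protected constructor
};

// "new HandshakeOperation(anAsyncClosure)" no longer compiles from outside the hierarchy.
static_assert(std::is_constructible<HandshakeOperation, HandshakeClosure*>::value,
              "plain closures still construct a HandshakeOperation");
static_assert(!std::is_constructible<HandshakeOperation, AsyncHandshakeClosure*>::value,
              "async closures cannot construct the base class directly");
static_assert(std::is_constructible<AsyncHandshakeOperation, AsyncHandshakeClosure*>::value,
              "async closures go through the subclass");
```

As noted in the quoted reply, this only guards against the accidental case: an explicit cast of the closure to `HandshakeClosure*` still selects the public constructor, and `is_async()` then returns true on a plain `HandshakeOperation`.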
Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/151 > From tschatzl at openjdk.java.net Thu Sep 24 08:14:28 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 24 Sep 2020 08:14:28 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted [v2] In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 02:30:09 GMT, David Holmes wrote: >> Marked as reviewed by tschatzl (Reviewer). > > @cl4es Please do not force push updates to a PR. There is no way for reviewers to see what, if anything, differs > between the final commit and the commit that they reviewed. If you want to merge with mainline/master then just merge > and push the new commit to your personal fork. The commits will be flattened as part of the actual integration. Thanks. @dholmes-ora: fwiw, there is: click on the actual commit link (e.g. https://github.com/openjdk/jdk/pull/274/commits/19d37e729150dc330efbd494ef4e3a08b3017886 in this case where it says " cl4es added 2 commits 4 days ago ") and you'll get a diff in github. There is an enhancement/bug open on skara for better support of diffs for force pushes. ------------- PR: https://git.openjdk.java.net/jdk/pull/274 From shade at openjdk.java.net Thu Sep 24 08:36:44 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 24 Sep 2020 08:36:44 GMT Subject: RFR: 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms Message-ID: Test fails because it requests 3g heap. It seems to accept 1g as well. Test is the new addition in 16 since JDK-8248391. 
Running affected test: - [x] Linux x86_32 -XX:+UseSerialGC - [x] Linux x86_32 -XX:+UseParallelGC - [x] Linux x86_32 -XX:+UseG1GC - [x] Linux x86_32 -XX:+UseShenandoahGC - [x] Linux x86_64 -XX:+UseSerialGC - [x] Linux x86_64 -XX:+UseParallelGC - [x] Linux x86_64 -XX:+UseG1GC - [x] Linux x86_64 -XX:+UseShenandoahGC - [x] Linux x86_64 -XX:+UseZGC ------------- Commit messages: - 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms Changes: https://git.openjdk.java.net/jdk/pull/330/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=330&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253581 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/330.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/330/head:pull/330 PR: https://git.openjdk.java.net/jdk/pull/330 From simonis at openjdk.java.net Thu Sep 24 09:02:03 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Thu, 24 Sep 2020 09:02:03 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: Message-ID: > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be > interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ... 
> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup. > Thanks, > Volker Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/295/files - new: https://git.openjdk.java.net/jdk/pull/295/files/568c4896..2949e5fd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=00-01 Stats: 53 lines in 2 files changed: 47 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/295.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295 PR: https://git.openjdk.java.net/jdk/pull/295 From redestad at openjdk.java.net Thu Sep 24 09:03:31 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 24 Sep 2020 09:03:31 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted [v2] In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 08:11:26 GMT, Thomas Schatzl wrote: >> @cl4es Please do not force push updates to a PR. There is no way for reviewers to see what, if anything, differs >> between the final commit and the commit that they reviewed. If you want to merge with mainline/master then just merge >> and push the new commit to your personal fork. The commits will be flattened as part of the actual integration. Thanks.
> > @dholmes-ora: fwiw, there is: click on the actual commit link (e.g. > https://github.com/openjdk/jdk/pull/274/commits/19d37e729150dc330efbd494ef4e3a08b3017886 in this case where it says " > cl4es added 2 commits 4 days ago ") and you'll get a diff in github. There is an enhancement/bug open on skara for > better support of diffs for force pushes. @dholmes-ora once I ended up in a state where force-push seemed like the only option it was too late. I assure you there were no changes from the reviewed changeset, and since there were no inline comments I figured little was lost. I believe a `git rebase` was the cause of this, and will use `git merge` in my workflow instead. I got into the habit of continually rebasing rather than merging in mercurial in order to avoid merge commits, but here those will be squashed anyhow. I will refrain from rebasing after opening a PR from now on. ------------- PR: https://git.openjdk.java.net/jdk/pull/274 From simonis at openjdk.java.net Thu Sep 24 09:05:44 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Thu, 24 Sep 2020 09:05:44 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 09:49:00 GMT, Severin Gehwolf wrote: >>> > Did you run container tests with this? >>> >>> Yes. It took me some time to find out that I have to set `-Djdk.test.docker.image.name=ubuntu` and >>> `-Djdk.test.docker.image.version=18.04` on Ubuntu in order to run them :) >> >> Yes, this is a bit painful. I should have said that. >> >>> For release builds they all pass with and without the change (except `TestJFRWithJMX.java` which always fails, but I >>> don't think that's related to this issue). Fast- and slowdebug builds don't even start without the fix and pass all >>> the tests with it (again except `TestJFRWithJMX.java`). >> >> Hmm, which ones did you run? It seems odd that they fail to run in fastdebug config.
>> >> FWIW, I've crafted a regression test for this issue. Please include something like that if you can: >> https://github.com/simonis/jdk/pull/1 >> >> A fix like this should make it pass (uses the `/sys/fs/cgroup` convention) - on top of your change: >> >> diff --git a/src/hotspot/os/linux/cgroupSubsystem_linux.cpp b/src/hotspot/os/linux/cgroupSubsystem_linux.cpp >> index 21be7a8260c..4c6ae541929 100644 >> --- a/src/hotspot/os/linux/cgroupSubsystem_linux.cpp >> +++ b/src/hotspot/os/linux/cgroupSubsystem_linux.cpp >> @@ -295,10 +295,10 @@ bool CgroupSubsystemFactory::determine_type(CgroupInfo* cg_infos, >> // Skip cgroup2 fs lines on hybrid or unified hierarchy. >> continue; >> } >> - if (strcmp("none", tmpsource) == 0) { >> - // Skip cpusets created manually or by cset/cpuset (https://github.com/lpechacek/cpuset) >> - // The "mount source" for these mounts is usually "none" while the source of "true" Cgroup >> - // controllers is usually "cgroup". But this is just another heuristic... >> + if (strcmp(tmpmount, "/sys/fs/cgroup") < 0) { >> + // Skip potentially duplicate, manually mounted cgroup controllers >> + // not on /sys/fs/cgroup >> + log_info(os, container)("%s not mounted at /sys/fs/cgroup, skipping!", tmpmount); > >> > > Did you run container tests with this? > > For the record, the important one to run is this one: > test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java > > It's independent of your host's cgroup files. I believe that test broke with your proposed v1 fix because after your > patch any `none` entries would be skipped. All of them are `none` for this test after JDK-8252359. Hi Severin, Bob, so here comes the new version as discussed. @jerboaa thanks for the additional test. I've merged it and extended it such that it checks both variants, when the manual csets controller comes before and when it comes after all the other controllers, to exercise both code paths in the detection.
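For readers following the `/sys/fs/cgroup` convention discussed in the quoted diff, one way to express "this mount point lives under the conventional cgroup root" is a prefix test with a path-component boundary check. The helper below is a hypothetical sketch: its name and the strict-prefix logic are assumptions for illustration, and it is neither the comparison used in the quoted diff nor necessarily what the updated PR does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of HotSpot: returns true if mount_point is
// "/sys/fs/cgroup" itself or a path below it.
static bool is_under_cgroup_root(const char* mount_point) {
  assert(mount_point != nullptr);
  static const char root[] = "/sys/fs/cgroup";
  const std::size_t root_len = sizeof(root) - 1;
  if (std::strncmp(mount_point, root, root_len) != 0) {
    return false;  // e.g. a manually mounted "/cpusets"
  }
  // Require a component boundary so that "/sys/fs/cgroup2" does not match.
  return mount_point[root_len] == '\0' || mount_point[root_len] == '/';
}
```

With the mountinfo lines quoted earlier in the thread, `/sys/fs/cgroup/cpuset` would be kept and the manually mounted `/cpusets` would be skipped, independent of the order in which the two entries appear.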
All container tests pass now (except `TestJFRWithJMX.java` which doesn't seem to be related to this issue and change). Are you OK with the change? Thank you and best regards, Volker ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From kbarrett at openjdk.java.net Thu Sep 24 09:11:05 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 24 Sep 2020 09:11:05 GMT Subject: RFR: 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 08:28:12 GMT, Aleksey Shipilev wrote: > Test fails because it requests 3g heap. It seems to accept 1g as well. Test is the new addition in 16 since JDK-8248391. > > Running affected test: > - [x] Linux x86_32 -XX:+UseSerialGC > - [x] Linux x86_32 -XX:+UseParallelGC > - [x] Linux x86_32 -XX:+UseG1GC > - [x] Linux x86_32 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseSerialGC > - [x] Linux x86_64 -XX:+UseParallelGC > - [x] Linux x86_64 -XX:+UseG1GC > - [x] Linux x86_64 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseZGC Looks good, and trivial. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/330 From tschatzl at openjdk.java.net Thu Sep 24 09:18:51 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 24 Sep 2020 09:18:51 GMT Subject: RFR: 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 08:28:12 GMT, Aleksey Shipilev wrote: > Test fails because it requests 3g heap. It seems to accept 1g as well. Test is the new addition in 16 since JDK-8248391. 
> > Running affected test: > - [x] Linux x86_32 -XX:+UseSerialGC > - [x] Linux x86_32 -XX:+UseParallelGC > - [x] Linux x86_32 -XX:+UseG1GC > - [x] Linux x86_32 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseSerialGC > - [x] Linux x86_64 -XX:+UseParallelGC > - [x] Linux x86_64 -XX:+UseG1GC > - [x] Linux x86_64 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseZGC +1 to Kim's comment. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/330 From shade at openjdk.java.net Thu Sep 24 09:22:20 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 24 Sep 2020 09:22:20 GMT Subject: Integrated: 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 08:28:12 GMT, Aleksey Shipilev wrote: > Test fails because it requests 3g heap. It seems to accept 1g as well. Test is the new addition in 16 since JDK-8248391. > > Running affected test: > - [x] Linux x86_32 -XX:+UseSerialGC > - [x] Linux x86_32 -XX:+UseParallelGC > - [x] Linux x86_32 -XX:+UseG1GC > - [x] Linux x86_32 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseSerialGC > - [x] Linux x86_64 -XX:+UseParallelGC > - [x] Linux x86_64 -XX:+UseG1GC > - [x] Linux x86_64 -XX:+UseShenandoahGC > - [x] Linux x86_64 -XX:+UseZGC This pull request has now been integrated. 
Changeset: c303fd5d Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/c303fd5d Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8253581: runtime/stringtable/StringTableCleaningTest.java fails on 32-bit platforms Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/330 From shade at openjdk.java.net Thu Sep 24 10:28:38 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 24 Sep 2020 10:28:38 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence [v3] In-Reply-To: References: Message-ID: > In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). However, > `orderAccess_*_zero.hpp` calls the kernel helper, because: > /* > * ARM Kernel helper for memory barrier. > * Using __asm __volatile ("":::"memory") does not work reliable on ARM > * and gcc __sync_synchronize(); implementation does not use the kernel > * helper for all gcc versions so it is unreliable to use as well. > */ > > We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. > > This depends on JDK-8253464 being fixed first. > > Attention @bulasevich. > > Testing: > - [x] ARM32 Zero jcstress > - [ ] Mac OS x86_64 Zero jcstress Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Add comments - 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence ------------- Changes: https://git.openjdk.java.net/jdk/pull/298/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=298&range=02 Stats: 10 lines in 2 files changed: 4 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/298.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/298/head:pull/298 PR: https://git.openjdk.java.net/jdk/pull/298 From stuefe at openjdk.java.net Thu Sep 24 11:44:51 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 24 Sep 2020 11:44:51 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names Message-ID: Hi all, there is a long standing bug in the windows version of os::pd_map_memory() which may cause it to fail if the path to the underlying file is longer than the OS limit. This is mainly of interest for CDS, which uses this functionality to map in sections of the archive into the memory. This bug may cause the CDS mapping to fail. See also https://bugs.openjdk.java.net/browse/JDK-8249943. As with similar cases, the fix is to translate the input file name to a wide character UNC path name and use the Unicode variant of CreateFile which accepts long path names. 
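As a rough illustration of the path translation described above, the sketch below builds the extended-length form of an absolute Windows path, which is the form the wide-character file APIs accept beyond the classic path limit. The function name is hypothetical and the code does only string manipulation; it assumes the input is already absolute and backslash-separated, and it does not reproduce HotSpot's own conversion helper or call any Win32 API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: prefix an absolute, backslash-separated path so that it
// uses the extended-length syntax ("\\?\" for drive paths, "\\?\UNC\" for shares).
static std::wstring to_extended_length_path(const std::wstring& abs_path) {
  assert(!abs_path.empty());
  const std::wstring ext_prefix = L"\\\\?\\";      // the 4 characters \\?\ as a literal
  const std::wstring unc_prefix = L"\\\\?\\UNC\\";  // \\?\UNC\ as a literal
  if (abs_path.compare(0, ext_prefix.size(), ext_prefix) == 0) {
    return abs_path;  // already in extended-length form
  }
  if (abs_path.compare(0, 2, L"\\\\") == 0) {
    // \\server\share\file  ->  \\?\UNC\server\share\file
    return unc_prefix + abs_path.substr(2);
  }
  // C:\dir\file  ->  \\?\C:\dir\file
  return ext_prefix + abs_path;
}
```

On Windows the result would then be handed to the Unicode variant of the file-open call instead of the narrow one; relative paths and forward slashes would need to be normalized first, which this sketch leaves out.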
------------- Commit messages: - JDK-8253572 Changes: https://git.openjdk.java.net/jdk/pull/332/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=332&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253572 Stats: 12 lines in 1 file changed: 10 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/332.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/332/head:pull/332 PR: https://git.openjdk.java.net/jdk/pull/332 From coleenp at openjdk.java.net Thu Sep 24 12:27:14 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 24 Sep 2020 12:27:14 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Wed, 23 Sep 2020 03:34:29 GMT, Vicente Romero wrote: >> Co-authored-by: Vicente Romero >> Co-authored-by: Harold Seigel >> Co-authored-by: Jonathan Gibbons >> Co-authored-by: Brian Goetz >> Co-authored-by: Maurizio Cimadamore >> Co-authored-by: Joe Darcy >> Co-authored-by: Chris Hegarty >> Co-authored-by: Jan Lahoda > > Vicente Romero has updated the pull request incrementally with three additional commits since the last revision: > > - Merge pull request #1 from ChrisHegarty/record-serial-tests > > Remove preview args from JDK tests > - Remove preview args from ObjectMethodsTest > - Remove preview args from record serialization tests The classfile parser changes look good to me. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/290 From stuefe at openjdk.java.net Thu Sep 24 13:09:36 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 24 Sep 2020 13:09:36 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace Message-ID: Hi all, this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). 
Sorry for the delay, I was on vacation and then the arrival of Skara delayed things a bit.

For the delta diff please see [3].

This is the first time I do a large PR after Skara, so if something is wrong please bear with me.

I cannot answer all feedback individually in this PR body, but I incorporated almost all into the new revision.

What changed since the last version:

- I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the group consensus.
- I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth class loader which made it rather pointless. Now I run it always.
- I also improved the printout on block corruption, and log block corruption unconditionally before asserting.
- I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp).
  Side note, I find the corruption check very useful but if you guys think it is too much I can still remove the feature.
- In ChunkManager::purge() I improved the comments after discussions with Leo.
- I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due to an off-by-one error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run every time.
- Fixed indentation issues as Leo requested.
- Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested.
- I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more flexible since it allows us to have a list with purge-able and non-purge-able nodes.
- and various smaller fixes, mainly on request of Leo.
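To illustrate the kind of off-by-one mentioned for VerifyMetaspaceInterval, here is a hypothetical every-Nth counter in two variants. This is not the actual SOMETIMES macro; the function names and the counter logic are assumptions chosen only to reproduce the described symptom (an interval of 1 firing every second call).

```cpp
#include <cassert>

// Hypothetical: post-increment compares the old counter value, so with
// interval == 1 the first call sees 0 and does not fire.
static bool should_run_buggy(int& counter, int interval) {
  if (counter++ >= interval) {
    counter = 0;
    return true;
  }
  return false;
}

// Hypothetical fix: pre-increment, so the Nth call is the one that fires.
static bool should_run_fixed(int& counter, int interval) {
  if (++counter >= interval) {
    counter = 0;
    return true;
  }
  return false;
}

// Counts how often a check would run over a number of calls.
static int count_runs(bool (*should_run)(int&, int), int interval, int calls) {
  assert(interval > 0);
  int counter = 0;
  int runs = 0;
  for (int i = 0; i < calls; i++) {
    if (should_run(counter, interval)) {
      runs++;
    }
  }
  return runs;
}
```

Over four calls with an interval of 1, the first variant runs the check twice and the second runs it four times, which matches the before and after behavior described in the list above.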
@lkorinth: > VirtualSpaceNode.hpp > >102 // Start pointer of the area. >103 MetaWord* const _base; > >How does this differ from _rs._base? Really needed? > >105 // Size, in words, of the whole node >106 const size_t _word_size; > >Can we not calculate this from _rs.size()? You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at some point and make those members const - as they should be - then I would rewrite this as you suggest. Thanks, again, for all your review work! ------ [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d ------------- Commit messages: - Remove trailing whitespaces - JEP387 review feedback round 2 - Review version 2 (2020-09-04) Changes: https://git.openjdk.java.net/jdk/pull/336/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8251158 Stats: 23038 lines in 170 files changed: 14916 ins; 6912 del; 1210 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From thomas.stuefe at gmail.com Thu Sep 24 13:25:08 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 24 Sep 2020 15:25:08 +0200 Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: Hi Gerard, first off, have said it already, but great work! This is a really good cleanup. 
About your last commit ( https://github.com/openjdk/jdk/pull/157/commits/cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1 Add AIX specific SA code): I think you can drop that again. The version beforehand looks good to me. I did some archeology and all this coding was forked from Linux and introduced by Sun for JEP167: https://bugs.openjdk.java.net/browse/JDK-8005849 It introduces also those "RANDOMLY_.." variables and uses them in Linux and BSD coding. We just forked that for AIX. Later that coding was rewritten in Linux, but the changes were not ported to AIX, hence the diff. -- Weirdly enough, that exact version for JDK-8005849 I cannot even find in the current jdk git or mercurial depots. I see jep167 introduced in mercurial with: changeset: 18025:b7bcf7497f93 user: sla date: Mon Jun 10 11:30:51 2013 +0200 summary: 8005849: JEP 167: Event-Based JVM Tracing But that one seems to be already the later version, not the initial one I see at https://bugs.openjdk.java.net/browse/JDK-8005849. Who knows, the history may be lost in the course of the hg forest consolidation. Maybe Steffan Larson may know. -- Anyway, I think all that coding is not needed for AIX. The fact that it exists is only an effect of AIX missing code after the initial fork. --https://bugs.openjdk.java.net/browse/JDK-8005849 Also, if you feel like it, you also can remove the remaining unused definitions of "RANDOMLY_..." in os_linux.cpp and os_bsd.cpp. -- I wish you had done the move to a new file (os_posix -> signals_posix) in a separate change, that would have made the review easier. But alas, it is what it is. I eyed the changes and they seem to be okay. A couple of places have #ifdef AIX still but we can go through them later. Also, you still have JDK-8252533 outstanding which will do away with some of the AIX specific sections. For all I see this seems fine. I think roll back the last commit and let us test the final version for a night or two in our systems, and then this is good to go. 
Thanks, Thomas On Wed, Sep 23, 2020 at 9:17 PM Gerard Ziemski wrote: > On Wed, 23 Sep 2020 13:11:42 GMT, Gerard Ziemski > wrote: > > > > _Mailing list message from [Doerr, Martin](mailto:martin.doerr at sap.com) > on > > > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > Hi Gerard, > > > sorry for the long delay. It took time to get our nightly tests > working again on AIX. > > > I have seen an issue, but it may be unrelated to your change. We'll > retest it. > > > > I will take another look at AIX changes to see if I missed something. > > > > > Note that there's still unused code left in os_aix.cpp (see below). > > > Thanks again for taking care of AIX. I appreciate having more shared > POSIX code. > > > > Thank you for the diff, I'll use it and update the webrev. > > I took another look at the changes and I have concerns over the following > methods: > > PosixSignals::SR_handler > PosixSignals::do_suspend > PosixSignals::do_resume > > They seem to differ substantially in parts from the other POSIX platforms. > In my early webrevs I have accounted for > those changes using "#if defined(AIX)", but during pre-reviews you asked > me to revert to the common POSIX code. > > Can you please take a look again and tell me if you are OK with the > following SA related changes: > > - using POSIX semaphores > - using the common POSIX code > > I will upload, what I think (based only on the code diffs between the > platforms as I lack AIX platform understanding) > the AIX platform needs here. 
> > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/157 > From gziemski at openjdk.java.net Thu Sep 24 15:04:01 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 24 Sep 2020 15:04:01 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: Message-ID: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> > hi all, > > Please review this change that refactors common POSIX code into a separate > file. > > Currently there appears to be quite a bit of duplicated code among POSIX > platforms, which makes it difficult to apply single fix to the signal code. > With this fix, we will only need to touch single file for common POSIX > code fixes from now on. > > ---------------------------------------------------------------------------- > The APIs which moved from os/bsd/os_bsd.cpp to to os/posix/PosixSignals.cpp: > > //////////////////////////////////////////////////////////////////////////////// > // signal support > void os::Bsd::signal_sets_init() > sigset_t* os::Bsd::unblocked_signals() > sigset_t* os::Bsd::vm_signals() > void os::Bsd::hotspot_sigmask(Thread* thread) > //////////////////////////////////////////////////////////////////////////////// > // sun.misc.Signal support > static void UserHandler(int sig, void *siginfo, void *context) > void* os::user_handler() > void* os::signal(int signal_number, void* handler) > void os::signal_raise(int signal_number) > int os::sigexitnum_pd() > static void jdk_misc_signal_init() > void os::signal_notify(int sig) > static int check_pending_signals() > int os::signal_wait() > //////////////////////////////////////////////////////////////////////////////// > // suspend/resume support > static void resume_clear_context(OSThread *osthread) > static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) > static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* 
context) > static int SR_initialize() > static int sr_notify(OSThread* osthread) > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > /////////////////////////////////////////////////////////////////////////////////// > // signal handling (except suspend/resume) > static void signalHandler(int sig, siginfo_t* info, void* uc) > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > static bool call_chained_handler(struct sigaction *actp, int sig, > siginfo_t *siginfo, void *context) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::install_signal_handlers() > static const char* get_signal_handler_name(address handler, > char* buf, int buflen) > static void print_signal_handler(outputStream* st, int sig, > char* buf, size_t buflen) > void os::run_periodic_checks() > void os::Bsd::check_signal_handler(int sig) > > ----------------------------------------------------------------------------- > The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: > > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > int os::Posix::get_signal_number(const char* signal_name) > int os::get_signal_number(const char* signal_name) > bool os::Posix::is_valid_signal(int sig) > bool os::Posix::is_sig_ignored(int sig) > const char* os::exception_name(int sig, char* buf, size_t size) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void
os::print_siginfo(outputStream* os, const void* si0) > bool os::signal_thread(Thread* thread, int sig, const char* reason) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > struct sigaction* os::Posix::get_preinstalled_handler(int sig) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > > -------------------------------------------------------- > -------------------------------------------------------- > > DETAILS: > > -------------------------------------------------------- > Public APIs which are now internal static PosixSignals:: > > sigset_t* os::Bsd::vm_signals() > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::check_signal_handler(int sig) > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > bool os::Posix::is_valid_signal(int sig) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > ------------------------------------------------ > Public APIs which moved to public PosixSignals:: > > void os::Bsd::signal_sets_init() > void os::Bsd::hotspot_sigmask(Thread* thread) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > void os::Bsd::install_signal_handlers() > bool os::Posix::is_sig_ignored(int sig) > int
os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > > ---------------------------------------------------- > Internal APIs which are now public in PosixSignals:: > > static void jdk_misc_signal_init() > static int SR_initialize() > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) > > -------------------------- > New APIs in PosixSignals:: > > static bool are_signal_handlers_installed(); Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: Revert "Add AIX specific SA code" This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/157/files - new: https://git.openjdk.java.net/jdk/pull/157/files/cc13700d..1ce83535 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=02-03 Stats: 70 lines in 1 file changed: 1 ins; 69 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157 PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Thu Sep 24 15:10:57 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 24 Sep 2020 15:10:57 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 21:00:47 GMT, Martin Doerr wrote: >>> > _Mailing list message from [Doerr, Martin](mailto:martin.doerr at sap.com) on >>> > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ Hi Gerard, >>> > sorry for the long delay. It took time to get our nightly tests working again on AIX. 
>>> > I have seen an issue, but it may be unrelated to your change. We'll retest it. >>> >>> I will take another look at AIX changes to see if I missed something. >> >> I took another look at the changes and I have concerns over the following methods: >> >> PosixSignals::SR_handler >> PosixSignals::do_suspend >> PosixSignals::do_resume >> >> They seem to differ substantially in parts from the other POSIX platforms. In my early webrevs I have accounted for >> those changes using "#if defined(AIX)", but during pre-reviews you asked me to revert to the common POSIX code. >> Can you please take a look again and tell me if you are OK with the following SA related changes: >> >> - using POSIX semaphores >> - using the common POSIX code >> >> I will upload, what I think (based only on the code diffs between the platforms as I lack AIX platform understanding) >> the AIX platform needs here. > > I believe the reason for not using POSIX semaphores was that it is not supported on as400 PASE which we don't support > in OpenJDK. I'm not aware of any problems when using this common POSIX code on AIX 7.2. Is SA supported on AIX? That > would be new to me. But I'm not an expert for these topics. I hope that Thomas can find some time to take a look. Reverted the last commit. Thank you for taking a look and testing - please let me know how it goes. I have already tested this change here at Oracle a while ago, but I will do so once again... ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From erikj at openjdk.java.net Thu Sep 24 15:25:17 2020 From: erikj at openjdk.java.net (Erik Joelsson) Date: Thu, 24 Sep 2020 15:25:17 GMT Subject: RFR: 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 23:05:59 GMT, Yumin Qi wrote: > This patch is a REDO for JDK-8253208 which was backed out since it caused runtime/cds/DeterministicDump.java failed, > see JDK-8253495. 
Since the failure is a separate issue that this patch merely triggers, the test case has now been put on > ProblemList.txt. The real root cause for the failure is detailed in JDK-8253495. When JDK-8253208 was backed out, > CDS.java was left in place rather than removed with that patch (see the JDK-8253495 subtask), so this patch does not include CDS.java > (this is why you will see CDS.c without CDS.java in this patch). Tests tier1-4 passed. Build changes ok. ------------- PR: https://git.openjdk.java.net/jdk/pull/327 From minqi at openjdk.java.net Thu Sep 24 15:30:52 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 24 Sep 2020 15:30:52 GMT Subject: Integrated: 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class In-Reply-To: References: Message-ID: On Wed, 23 Sep 2020 23:05:59 GMT, Yumin Qi wrote: > This patch is a REDO for JDK-8253208, which was backed out since it caused runtime/cds/DeterministicDump.java to fail, > see JDK-8253495. Since the failure is a separate issue that this patch merely triggers, the test case has now been put on > ProblemList.txt. The real root cause for the failure is detailed in JDK-8253495. When JDK-8253208 was backed out, > CDS.java was left in place rather than removed with that patch (see the JDK-8253495 subtask), so this patch does not include CDS.java > (this is why you will see CDS.c without CDS.java in this patch). Tests tier1-4 passed. This pull request has now been integrated.
Changeset: 89c5e49b Author: Yumin Qi URL: https://git.openjdk.java.net/jdk/commit/89c5e49b Stats: 156 lines in 22 files changed: 57 ins; 53 del; 46 mod 8253500: [REDO] JDK-8253208 Move CDS related code to a separate class Reviewed-by: mchung, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/327 From vromero at openjdk.java.net Thu Sep 24 15:49:52 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Thu, 24 Sep 2020 15:49:52 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v4] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: modifiying @since from 14 to 16 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/290/files - new: https://git.openjdk.java.net/jdk/pull/290/files/26b80775..514b0a80 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=02-03 Stats: 8 lines in 7 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Thu Sep 24 15:49:52 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Thu, 24 Sep 2020 15:49:52 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: References: 
<48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: <0H-sMIm0mwGW2f2OxwAwMGBtxDf2BUf7ds3tGEgYXrc=.849aab2b-4cf3-49f7-8e49-ad1df75cb0bb@github.com> On Thu, 24 Sep 2020 12:23:13 GMT, Coleen Phillimore wrote: >> Vicente Romero has updated the pull request incrementally with three additional commits since the last revision: >> >> - Merge pull request #1 from ChrisHegarty/record-serial-tests >> >> Remove preview args from JDK tests >> - Remove preview args from ObjectMethodsTest >> - Remove preview args from record serialization tests > > The classfile parser changes look good to me. I have modified the `@since`: 14 -> 16 ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Thu Sep 24 15:54:57 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Thu, 24 Sep 2020 15:54:57 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v5] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: <1mJrr-6f-CRtIrhQqH7VlpzuNZdR0dzdDiLnacBx32I=.a0823203-fc37-4fc7-a104-6031c580fd21@github.com> > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge branch 'master' into JDK-8246774 - modifiying @since from 14 to 16 - Merge pull request #1 from ChrisHegarty/record-serial-tests Remove preview args from JDK tests - Remove preview args from ObjectMethodsTest - Remove preview args from record serialization tests - removing the javax.lang.model related code to be moved to a separate bug - 8246774: Record Classes (final) implementation Co-authored-by: Vicente Romero Co-authored-by: Harold Seigel Co-authored-by: Jonathan Gibbons Co-authored-by: Brian Goetz Co-authored-by: Maurizio Cimadamore Co-authored-by: Joe Darcy Co-authored-by: Chris Hegarty Co-authored-by: Jan Lahoda ------------- Changes: https://git.openjdk.java.net/jdk/pull/290/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=04 Stats: 464 lines in 104 files changed: 23 ins; 267 del; 174 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From sgehwolf at openjdk.java.net Thu Sep 24 17:23:04 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Thu, 24 Sep 2020 17:23:04 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: Message-ID: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> On Thu, 24 Sep 2020 09:02:03 GMT, Volker Simonis wrote: >> Hi, >> >> can I please have a review (or an idea for a better fix) for this PR? >> >> If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage >> [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a >> debug build or behave unexpectedly in a product build. 
The problem is that the additionally mounted cpuset will be >> interpreted as if it belonged to a Cgroup controller: $ grep cgroup /proc/self/mountinfo >> 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 >> 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory >> 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma >> ... >> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset >> 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset >> The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet >> another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup. >> Thanks, >> Volker > > Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental > views will show differences compared to the previous content of the PR. The pull request contains one new commit since > the last revision: > 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist I'm not sure these extra gymnastics will be worth the added complexity. This makes the code somewhat harder to follow and, essentially, we end up checking whether or not the cgroup controller is being mounted under `/sys/fs/cgroup`. The code tracks any interesting controller, like `memory`, `cpu`, `cpuset` and `cpuacct` and records the mount point. It's going to be `/sys/fs/cgroup` for almost all cases, would it not? The skip for non-/sys/fs/cgroup controllers ends up being either `/fo/bar/baz` or whatever the lead-up path to the first seen "interesting" controller is or `/sys/fs/cgroup`. I'm not convinced it's really anything other than `/sys/fs/cgroup`. How about a simpler solution like this?
https://github.com/jerboaa/jdk/commit/1323315036c0b7625cd1a05690b5944df8457bad This seems to work for me. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 122: > 120: if (*mount_path == NULL) { > 121: *mount_path = os::strdup(tmpmount); > 122: if (strrchr(*mount_path, '/')) { I guess we could avoid calling `strrchr` twice here by using a local variable. How about this? char* last_slash = strrchr(*mount_path, '/'); if (last_slash != NULL) { *last_slash = '\0'; // truncate controller } src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 318: > 316: // Skip controllers created manually or by cset/cpuset (https://github.com/lpechacek/cpuset). E.g.: > 317: // 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > 318: // Controllers beloning to a Cgroup are usually mounted under "/sys/fs/cgroup" while s/beloning/belonging/ test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java line 271: > 269: test.testCgroupv2NoCgroup2Fs(wb); > 270: test.testCgroupv1MultipleCpusetMounts(wb, test.cgroupv1MntInfoDoubleCpuset); > 271: test.testCgroupv1MultipleCpusetMounts(wb, test.cgroupv1MntInfoDoubleCpuset2); For this test to actually work properly we need an additional new line in the hybrid snippet. Sorry about this, was probably my fault at the time. 
Like this: +++ b/test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java @@ -104,7 +104,7 @@ public class CgroupSubsystemFactory { "41 30 0:37 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:13 - cgroup none rw,seclabel,devices\n" + "42 30 0:38 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:14 - cgroup none rw,seclabel,cpuset\n" + "43 30 0:39 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup none rw,seclabel,blkio\n" + - "44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none rw,seclabel,freezer"; + "44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none rw,seclabel,freezer\n"; private String mntInfoHybridRest = cgroupv1MountInfoLineMemory + mntInfoHybridStub; private String mntInfoHybridMissingMemory = mntInfoHybridStub; private String mntInfoHybrid = cgroupV2LineHybrid + mntInfoHybridRest; Without it the generated `cgroupv1MntInfoDoubleCpuset2` file ends up containing: 31 30 0:27 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:5 - cgroup2 none rw,seclabel,nsdelegate 35 30 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:7 - cgroup none rw,seclabel,memory 30 23 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:4 - tmpfs tmpfs ro,seclabel,mode=755 32 30 0:28 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 - cgroup none rw,seclabel,xattr,name=systemd 36 30 0:32 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:8 - cgroup none rw,seclabel,pids 37 30 0:33 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:9 - cgroup none rw,seclabel,perf_event 38 30 0:34 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:10 - cgroup none rw,seclabel,net_cls,net_prio 39 30 0:35 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:11 - cgroup none rw,seclabel,hugetlb 40 30 0:36 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime 
shared:12 - cgroup none rw,seclabel,cpu,cpuacct 41 30 0:37 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:13 - cgroup none rw,seclabel,devices 42 30 0:38 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:14 - cgroup none rw,seclabel,cpuset 43 30 0:39 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup none rw,seclabel,blkio 44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none rw,seclabel,freezer121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset which doesn't trigger the bug. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 335: > 333: cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount); > 334: cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot); > 335: cg_infos[CPUSET_IDX]._data_complete = true; It's not clear to me why `check_mount_path(&mount_path, tmpmount);` isn't called on this code path. Could you explain? ------------- Changes requested by sgehwolf (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/295 From zgu at openjdk.java.net Thu Sep 24 18:20:13 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 24 Sep 2020 18:20:13 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads Message-ID: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report thread's state after thread terminated. But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid thread stack, etc. This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated thread from live thread. Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). 
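The idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: `WorkerThread`, `ThreadState`, `mark_terminated` and `describe` are hypothetical stand-ins and do not correspond to HotSpot's real classes. The point is that the thread object records a terminal state just before the native thread exits, so a later query on the surviving object can tell "terminated" and "never started" apart from "alive".

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; not HotSpot's real types.
enum class ThreadState { ALLOCATED, RUNNABLE, ZOMBIE };

struct WorkerThread {
  ThreadState state = ThreadState::ALLOCATED;
  bool has_osthread = false;  // models "thread->osthread() != NULL"

  void start() {
    has_osthread = true;
    state = ThreadState::RUNNABLE;
  }

  // Called as the very last step before the native thread exits.
  void mark_terminated() {
    assert(state == ThreadState::RUNNABLE && "only a running thread can terminate");
    state = ThreadState::ZOMBIE;
  }

  bool is_alive() const {
    return has_osthread && state != ThreadState::ZOMBIE;
  }
};

// What an error reporter could print for a thread object that outlived its thread.
std::string describe(const WorkerThread& t) {
  if (!t.has_osthread) return "never started";
  if (t.state == ThreadState::ZOMBIE) return "terminated";
  return "alive";
}
```

A reporter built this way would skip stack and SMR details whenever `is_alive()` is false, which is the behaviour the description asks for.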
Note: Java thread does not have such issue, its thread object is deleted before thread terminates. ------------- Commit messages: - Don't call ThreadsSMRSupport::print_info_on() if the thread is not alive - init Changes: https://git.openjdk.java.net/jdk/pull/341/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=341&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253429 Stats: 20 lines in 1 file changed: 12 ins; 3 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/341.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/341/head:pull/341 PR: https://git.openjdk.java.net/jdk/pull/341 From bobv at openjdk.java.net Thu Sep 24 19:20:04 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Thu, 24 Sep 2020 19:20:04 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Thu, 24 Sep 2020 17:20:28 GMT, Severin Gehwolf wrote: >> Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. The pull request contains one new commit since >> the last revision: >> 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist > > I'm not sure this extra gymnastics will be worth the added complexity. This makes the code somewhat harder to follow > and, essentially, we end up checking whether or not the cgroup controller is being mounted under `/sys/fs/cgroup`. The > code tracks any interesting controller, like `memory`, `cpu`, `cpuset` and `cpuacct` and records the mount point. It's > going to be `/sys/fs/cgroup` for almost all cases, would it not? 
The skip for non-/sys/fs/cgroup controllers ends up > being either `/fo/bar/baz` or whatever the lead-up path to the first seen "interesting" controller is or > `/sys/fs/cgroup`. I'm not convinced it's really anything other than `/sys/fs/cgroup`. How about a simpler solution like > this? https://github.com/jerboaa/jdk/commit/1323315036c0b7625cd1a05690b5944df8457bad This seems to work for me. I was expecting to see some logic in this "else if" section that recorded the first occurrence but did the validation on the second pass (cg_infos[CPUSET_IDX]._mount_path != NULL). When this situation is detected, we accept the mount with the /sys/fs/cgroup path. } else if (strcmp(token, "cpuset") == 0) { assert(cg_infos[CPUSET_IDX]._mount_path == NULL, "stomping of _mount_path"); cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount); cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot); cg_infos[CPUSET_IDX]._data_complete = true; ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From iklam at openjdk.java.net Thu Sep 24 19:23:16 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 24 Sep 2020 19:23:16 GMT Subject: Integrated: 8251261: CDS dumping should not clear states in live classes In-Reply-To: References: Message-ID: <5X0tzRguk2QR_S9jaCGj_6mSkhpjHqFHAqIpDpfKW-w=.d2f1a588-d637-44a5-8380-a510afd0aeac@github.com> On Thu, 17 Sep 2020 18:53:55 GMT, Ioi Lam wrote: > We had an issue where, when CDS dumped a static archive (java -Xshare:dump), it would call `Klass::remove_unshareable_info()` > too early. In one of the test failures, ZGC was still scanning the heap and stepped on a class whose mirror had been > removed. The fix is to avoid modifying the states of the Java classes during -Xshare:dump. Instead, we call > `Klass::remove_unshareable_info()` only on the **copy** of the classes which are written into the archive. It's safe to > do so because these copies are visible only to the CDS dumping code. They aren't accessible by the GC or any other > subsystems.
It turns out that we were already doing this for the dynamic archive. So I just generalized the code in > dynamicArchive.cpp and moved it to archiveBuilder.cpp. So this PR is one step forward for [JDK-8234693 Consolidate CDS > static and dynamic archive dumping code](https://bugs.openjdk.java.net/browse/JDK-8234693). I also fixed another case > where we modify the global VM state -- I removed `Universe::clear_basic_type_mirrors()`. > ---- > > We are still modifying some global VM states (such as SystemDictionary::_well_known_klasses). They seem harmless now, > but we might have to do more fixes in the future. This pull request has now been integrated. Changeset: 8b85c3a6 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/8b85c3a6 Stats: 214 lines in 9 files changed: 76 ins; 126 del; 12 mod 8251261: CDS dumping should not clear states in live classes Reviewed-by: minqi, ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/227 From hseigel at openjdk.java.net Thu Sep 24 19:23:38 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 24 Sep 2020 19:23:38 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … Message-ID: Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, and Mac OS, and running tier3 - tier5 tests on Linux x64.
------------- Commit messages: - 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities - 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities Changes: https://git.openjdk.java.net/jdk/pull/342/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=342&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8250984 Stats: 115 lines in 8 files changed: 56 ins; 11 del; 48 mod Patch: https://git.openjdk.java.net/jdk/pull/342.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/342/head:pull/342 PR: https://git.openjdk.java.net/jdk/pull/342 From hseigel at openjdk.java.net Thu Sep 24 19:23:38 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 24 Sep 2020 19:23:38 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: <7FkvSuEdi9cp5Gm9TGQekREHvI0E3xot5fTuJNFfeXU=.c46a4030-c2a2-4a1d-9f90-b036ca0272ff@github.com> On Thu, 24 Sep 2020 18:32:38 GMT, Harold Seigel wrote: > Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit > capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related > information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then > only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit > capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, > and Mac OS, and running tier3 - tier5 tests on Linux x64. This fix was proposed by Bob V.
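The fallback described in the quoted summary could look roughly like the sketch below. It is a minimal illustration under assumptions, not the code from the pull request: `MemoryCounters` and `memory_and_swap_usage` are hypothetical names, and how the real change detects missing swap limit capabilities is not shown here. The sketch only demonstrates the rule stated in the summary: when the combined memory+swap counter is not available, report the memory-only value instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of what a cgroup v1 memory controller exposes.
struct MemoryCounters {
  int64_t memory_usage;   // bytes, from the memory-only counter
  int64_t memsw_usage;    // bytes, memory+swap; meaningful only with swap accounting
  bool swap_accounting;   // false when the kernel lacks swap limit capabilities
};

// Returns memory+swap usage when available, otherwise memory usage alone.
int64_t memory_and_swap_usage(const MemoryCounters& c) {
  assert(c.memory_usage >= 0);
  if (!c.swap_accounting) {
    return c.memory_usage;  // non-swap information is all we can report
  }
  return c.memsw_usage;
}
```

Callers that compare the result against a limit would need the same fallback on the limit side, so that usage and limit are always drawn from matching counters.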
------------- PR: https://git.openjdk.java.net/jdk/pull/342 From bob.vandette at oracle.com Thu Sep 24 19:54:24 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 24 Sep 2020 15:54:24 -0400 Subject: Re: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: <14E7AF77-672B-4338-AD9E-68698F6F96C2@oracle.com> Looks good to me. Bob. > On Sep 24, 2020, at 3:23 PM, Harold Seigel wrote: > > Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit > capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related > information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then > only memory usage is returned. > > The fix was tested by running container tests on systems with and without swap limit capabilities. Additionally, the > changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, and Mac OS, and running tier3 - > tier5 tests on Linux x64.
> > ------------- > > Commit messages: > - 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities > - 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities > > Changes: https://git.openjdk.java.net/jdk/pull/342/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=342&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8250984 > Stats: 115 lines in 8 files changed: 56 ins; 11 del; 48 mod > Patch: https://git.openjdk.java.net/jdk/pull/342.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/342/head:pull/342 > > PR: https://git.openjdk.java.net/jdk/pull/342 From ccheung at openjdk.java.net Thu Sep 24 21:02:13 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Thu, 24 Sep 2020 21:02:13 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: References: Message-ID: <3zPvyYsEOg4LMcQ2kKf95hOgPH8liURcvf8Yk-W3dys=.b50fb1d8-aa64-4d2f-bf67-413eaad5cc63@github.com> On Thu, 24 Sep 2020 10:04:00 GMT, Thomas Stuefe wrote: > Hi all, > > there is a long standing bug in the windows version of os::pd_map_memory() which may cause it to fail if the path to > the underlying file is longer than the OS limit. This is mainly of interest for CDS, which uses this functionality to > map in sections of the archive into the memory. This bug may cause the CDS mapping to fail. See also > https://bugs.openjdk.java.net/browse/JDK-8249943. As with similar cases, the fix is to translate the input file name > to a wide character UNC path name and use the Unicode variant of CreateFile which accepts long path names. Thanks for identifying the bug and fixing it. The patch looks good. ------------- Marked as reviewed by ccheung (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/332 From vromero at openjdk.java.net Fri Sep 25 00:41:47 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Fri, 25 Sep 2020 00:41:47 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v6] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: adding missing changes to some tests ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/290/files - new: https://git.openjdk.java.net/jdk/pull/290/files/89f7cc54..915b67e0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=04-05 Stats: 71 lines in 5 files changed: 9 ins; 11 del; 51 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From dholmes at openjdk.java.net Fri Sep 25 02:33:58 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Sep 2020 02:33:58 GMT Subject: RFR: 8253397: Ensure LogTag types are sorted [v2] In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 09:00:55 GMT, Claes Redestad wrote: >> @dholmes-ora: fwiw, there is: click on the actual commit link (e.g. >> https://github.com/openjdk/jdk/pull/274/commits/19d37e729150dc330efbd494ef4e3a08b3017886 in this case where it says " >> cl4es added 2 commits 4 days ago ") and you'll get a diff in github. 
There is an enhancement/bug open on skara for >> better support of diffs for force pushes. > > @dholmes-ora once I ended up in a state where force-push seemed like the only option it was too late. I assure you > there were no changes from the reviewed changeset, and since there were no inline comments I figured little was > lost. I believe a `git rebase` was the cause of this, and will use `git merge` in my workflow instead. I got into the > habit of continually rebasing rather than merging in mercurial in order to avoid merge commits, but here those will be > squashed anyhow. I will refrain from rebasing after opening a PR from now on. > @dholmes-ora: fwiw, there is: click on the actual commit link (e.g. > [19d37e7](https://github.com/openjdk/jdk/commit/19d37e729150dc330efbd494ef4e3a08b3017886) in this case where it says " > cl4es added 2 commits 4 days ago ") and you'll get a diff in github. There is an enhancement/bug open on skara for > better support of diffs for force pushes. That is a diff for one of the commits that was force-pushed over the original. I can see diffs for both of those commits, but what I cannot see is how those commits differ from the original one that was overwritten. Further, if I click on "Show changes since your last review" I get an error page: "We went looking everywhere, but couldn't find those commits. Sometimes commits can disappear after a force-push. Head back to the latest changes here." We don't need better support for forced pushes IMO; we just need to not do them. Cheers.
------------- PR: https://git.openjdk.java.net/jdk/pull/274 From vromero at openjdk.java.net Fri Sep 25 02:40:47 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Fri, 25 Sep 2020 02:40:47 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: <0H-sMIm0mwGW2f2OxwAwMGBtxDf2BUf7ds3tGEgYXrc=.849aab2b-4cf3-49f7-8e49-ad1df75cb0bb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> <0H-sMIm0mwGW2f2OxwAwMGBtxDf2BUf7ds3tGEgYXrc=.849aab2b-4cf3-49f7-8e49-ad1df75cb0bb@github.com> Message-ID: On Thu, 24 Sep 2020 15:45:22 GMT, Vicente Romero wrote: >> The classfile parser changes look good to me. > > I have modified the `@since`: 14 -> 16 [CSR: Record Classes](https://bugs.openjdk.java.net/browse/JDK-8253605) ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From stuefe at openjdk.java.net Fri Sep 25 03:51:12 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 03:51:12 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: <3zPvyYsEOg4LMcQ2kKf95hOgPH8liURcvf8Yk-W3dys=.b50fb1d8-aa64-4d2f-bf67-413eaad5cc63@github.com> References: <3zPvyYsEOg4LMcQ2kKf95hOgPH8liURcvf8Yk-W3dys=.b50fb1d8-aa64-4d2f-bf67-413eaad5cc63@github.com> Message-ID: On Thu, 24 Sep 2020 20:59:32 GMT, Calvin Cheung wrote: > Thanks for identifying the bug and fixing it. The patch looks good. Thank you Calvin! 
-------------

PR: https://git.openjdk.java.net/jdk/pull/332

From stuefe at openjdk.java.net Fri Sep 25 04:31:22 2020
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Fri, 25 Sep 2020 04:31:22 GMT
Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads
In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com>
References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com>
Message-ID: <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com>

On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote:

> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report
> thread's state after thread terminated.
> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid
> thread stack, etc.
> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated
> thread from live thread.
> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL).
>
> Note: Java thread does not have such issue, its thread object is deleted before thread terminates.

Hi Zhengyu,

Could this expose us to a race?

What I mean is: on Linux and BSD I see this code in os::create_thread() after thread creation and after the initial handshake with the parent thread:

> A pthread_create ...
> B initial handshake...
>
> C // Aborted due to thread limit being reached
>   if (state == ZOMBIE) {
>     thread->set_osthread(NULL);
>     delete osthread;
>     return false;
>   }

(Weirdly enough we don't do (C) for Windows and AIX).

I may be wrong, but I do not find anywhere in the code where we set the ZOMBIE state today. So (C) may be just dead code now.

With your patch, we now set the state to ZOMBIE in post_run.
If the child finishes and sets the ZOMBIE state before the parent reaches (C), os::create_thread would return false even though the child thread started correctly and ran through?

-------------

PR: https://git.openjdk.java.net/jdk/pull/341

From david.holmes at oracle.com Fri Sep 25 05:06:39 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 25 Sep 2020 15:06:39 +1000
Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads
In-Reply-To: <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com>
References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com>
Message-ID:

Hi Thomas,

On 25/09/2020 2:31 pm, Thomas Stuefe wrote:
> On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote:
>
>> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report
>> thread's state after thread terminated.
>> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid
>> thread stack, etc.
>> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated
>> thread from live thread.
>> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL).
>>
>> Note: Java thread does not have such issue, its thread object is deleted before thread terminates.
>
> Hi Zhengyu,
>
> Could this expose us to a race?
>
> What I mean is: on Linux and BSD I see this code in os::create_thread() after thread creation and after the initial
> handshake with the parent thread:
>
>> A pthread_create ...
>> B initial handshake...
>> >> C // Aborted due to thread limit being reached >> if (state == ZOMBIE) { >> thread->set_osthread(NULL); >> delete osthread; >> return false; >> } > > (Weirdly enough we don't do (C) for Windows and AIX). > > I may be wrong do not find anywhere in the code where we set the ZOMBIE state today. So (C) may be just dead code now. Funnily enough it seems to be dead code that you created: https://bugs.openjdk.java.net/browse/JDK-8078513 but mea culpa too as I missed it in the review. :) Otherwise, yes this would be a theoretical race. So lets just delete the above. Cheers, David > With your patch, we now set the state to ZOMBIE in post_run. If child finishes and sets ZOMBIE state before parent > reaches (C) os::create_thread would return not false even though the child thread started correctly and ran through? > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/341 > From dholmes at openjdk.java.net Fri Sep 25 05:22:24 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Sep 2020 05:22:24 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence [v3] In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 10:28:38 GMT, Aleksey Shipilev wrote: >> In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). However, >> `orderAccess_*_zero.hpp` calls the kernel helper, because: >> /* >> * ARM Kernel helper for memory barrier. >> * Using __asm __volatile ("":::"memory") does not work reliable on ARM >> * and gcc __sync_synchronize(); implementation does not use the kernel >> * helper for all gcc versions so it is unreliable to use as well. >> */ >> >> We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. >> >> This depends on JDK-8253464 being fixed first. >> >> Attention @bulasevich. 
>> >> Testing: >> - [x] ARM32 Zero jcstress >> - [x] Mac OS x86_64 Zero jcstress > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now > contains two commits: > - Add comments > - 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence Changes seem fine to me. Good to have barriers handled only in OrderAccess. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/298 From thomas.stuefe at gmail.com Fri Sep 25 05:29:01 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 25 Sep 2020 07:29:01 +0200 Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com> Message-ID: Hi David, On Fri, Sep 25, 2020 at 7:06 AM David Holmes wrote: > Hi Thomas, > > On 25/09/2020 2:31 pm, Thomas Stuefe wrote: > > On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote: > > > >> For some non-JavaThread, their object instances can outlast threads' > lifespan. For example, we still can query/report > >> thread's state after thread terminated. > >> But the query/report currently returns wrong state. E.g. a terminated > thread appears to be alive and seemly has valid > >> thread stack, etc. > >> This patch sets non-JavaThread's state to ZOMBIE just before it > terminates, so that we can distinguish terminated > >> thread from live thread. > >> Also, thread should not report its SMR info, if it has terminated or it > never started (thread->osthread() == NULL). > >> > >> Note: Java thread does not have such issue, its thread object is > deleted before thread terminates. > > > > Hi Zhengyu, > > > > Could this expose us to a race? 
> > > > What I mean is: on Linux and BSD I see this code in os::create_thread() > after thread creation and after the initial > > handshake with the parent thread: > > > >> A pthread_create ... > >> B initial handshake... > >> > >> C // Aborted due to thread limit being reached > >> if (state == ZOMBIE) { > >> thread->set_osthread(NULL); > >> delete osthread; > >> return false; > >> } > > > > (Weirdly enough we don't do (C) for Windows and AIX). > > > > I may be wrong do not find anywhere in the code where we set the ZOMBIE > state today. So (C) may be just dead code now. > > Funnily enough it seems to be dead code that you created: > > https://bugs.openjdk.java.net/browse/JDK-8078513 > > but mea culpa too as I missed it in the review. :) > > Hah! Thanks for reminding me :) Okay, that's still left over from pre-NPTL. @Zhengyu: Can you remove these two sections too in os_bsd.cpp and os_linux.cpp or should I do this in a separate patch? Cheers Thomas > Otherwise, yes this would be a theoretical race. So lets just delete the > above. > > Cheers, > David > > > With your patch, we now set the state to ZOMBIE in post_run. If child > finishes and sets ZOMBIE state before parent > > reaches (C) os::create_thread would return not false even though the > child thread started correctly and ran through? > > > > ------------- > > > > PR: https://git.openjdk.java.net/jdk/pull/341 > > > From dholmes at openjdk.java.net Fri Sep 25 05:22:24 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Sep 2020 05:22:24 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:41:02 GMT, Aleksey Shipilev wrote: > In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). However, > `orderAccess_*_zero.hpp` calls the kernel helper, because: > /* > * ARM Kernel helper for memory barrier. 
> * Using __asm __volatile ("":::"memory") does not work reliable on ARM > * and gcc __sync_synchronize(); implementation does not use the kernel > * helper for all gcc versions so it is unreliable to use as well. > */ > > We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. > > This depends on JDK-8253464 being fixed first. > > Attention @bulasevich. > > Testing: > - [x] ARM32 Zero jcstress > - [x] Mac OS x86_64 Zero jcstress @shipilev please don't force push commits as it breaks the commit history. There is no way to see what the difference is between the forced commit and the one it overwrote. That makes it hard for reviewers to follow and could render existing review comments nonsensical. If you need to merge with latest changes just do a merge in your local branch and push that commit to your Personal fork. The skara tooling will flatten all commits in the PR into a single simple commit with just the actual changes you made. Most of the time you don't even need to re-merge with latest changes from master. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/298 From simonis at openjdk.java.net Fri Sep 25 07:28:36 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 25 Sep 2020 07:28:36 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Thu, 24 Sep 2020 15:43:47 GMT, Severin Gehwolf wrote: >> Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since >> the last revision: >> 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist > > test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java line 271: > >> 269: test.testCgroupv2NoCgroup2Fs(wb); >> 270: test.testCgroupv1MultipleCpusetMounts(wb, test.cgroupv1MntInfoDoubleCpuset); >> 271: test.testCgroupv1MultipleCpusetMounts(wb, test.cgroupv1MntInfoDoubleCpuset2); > > For this test to actually work properly we need an additional new line in the hybrid snippet. Sorry about this, was > probably my fault at the time. Like this: > +++ b/test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java > @@ -104,7 +104,7 @@ public class CgroupSubsystemFactory { > "41 30 0:37 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:13 - cgroup none rw,seclabel,devices\n" + > "42 30 0:38 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:14 - cgroup none rw,seclabel,cpuset\n" + > "43 30 0:39 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup none rw,seclabel,blkio\n" + > - "44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none rw,seclabel,freezer"; > + "44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none > rw,seclabel,freezer\n"; > private String mntInfoHybridRest = cgroupv1MountInfoLineMemory + mntInfoHybridStub; > private String mntInfoHybridMissingMemory = mntInfoHybridStub; > private String mntInfoHybrid = cgroupV2LineHybrid + mntInfoHybridRest; > > Without it the generated `cgroupv1MntInfoDoubleCpuset2` file ends up containing: > > 31 30 0:27 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:5 - cgroup2 none rw,seclabel,nsdelegate > 35 30 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:7 - cgroup none rw,seclabel,memory > 30 23 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:4 - tmpfs tmpfs ro,seclabel,mode=755 > 32 30 0:28 / 
/sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 - cgroup none > rw,seclabel,xattr,name=systemd 36 30 0:32 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:8 - cgroup none > rw,seclabel,pids 37 30 0:33 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:9 - cgroup none > rw,seclabel,perf_event 38 30 0:34 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:10 - cgroup > none rw,seclabel,net_cls,net_prio 39 30 0:35 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:11 - > cgroup none rw,seclabel,hugetlb 40 30 0:36 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:12 - > cgroup none rw,seclabel,cpu,cpuacct 41 30 0:37 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:13 - > cgroup none rw,seclabel,devices 42 30 0:38 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:14 - cgroup > none rw,seclabel,cpuset 43 30 0:39 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup none > rw,seclabel,blkio 44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none > rw,seclabel,freezer121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset which doesn't trigger the bug. Good catch! Fixed > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 318: > >> 316: // Skip controllers created manually or by cset/cpuset (https://github.com/lpechacek/cpuset). 
E.g.: >> 317: // 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset >> 318: // Controllers beloning to a Cgroup are usually mounted under "/sys/fs/cgroup" while > > s/beloning/belonging/ Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From simonis at openjdk.java.net Fri Sep 25 07:31:26 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 25 Sep 2020 07:31:26 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Thu, 24 Sep 2020 12:26:20 GMT, Severin Gehwolf wrote: >> Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 122: > >> 120: if (*mount_path == NULL) { >> 121: *mount_path = os::strdup(tmpmount); >> 122: if (strrchr(*mount_path, '/')) { > > I guess we could avoid calling `strrchr` twice here by using a local variable. How about this? > > char* last_slash = strrchr(*mount_path, '/'); > if (last_slash != NULL) { > *last_slash = '\0'; // truncate controller > } Changed as suggested. 
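The `strrchr` suggestion quoted above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `dup_without_controller` is a hypothetical helper name, not a function in `cgroupSubsystem_linux.cpp`, and plain `malloc`/`memcpy` stand in for `os::strdup` so the snippet compiles outside HotSpot.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not HotSpot code): copies `mount` and cuts off the
// last path component, e.g. "/sys/fs/cgroup/memory" -> "/sys/fs/cgroup".
// malloc()/memcpy() stand in for os::strdup(); the caller frees the result.
char* dup_without_controller(const char* mount) {
  size_t len = strlen(mount) + 1;  // include the terminating NUL
  char* copy = static_cast<char*>(malloc(len));
  if (copy == nullptr) {
    return nullptr;  // allocation failed
  }
  memcpy(copy, mount, len);
  // Look up the last slash once and reuse the pointer, rather than
  // calling strrchr() a second time to find the truncation point.
  char* last_slash = strrchr(copy, '/');
  if (last_slash != nullptr) {
    *last_slash = '\0';  // truncate controller
  }
  return copy;
}
```

Note that for a mount point directly under the root, such as `/cpusets`, the only slash is the leading one, so this sketch yields an empty string; a path without any slash is returned unchanged.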
------------- PR: https://git.openjdk.java.net/jdk/pull/295 From simonis at openjdk.java.net Fri Sep 25 07:49:24 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 25 Sep 2020 07:49:24 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Thu, 24 Sep 2020 15:56:17 GMT, Severin Gehwolf wrote: >> Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental >> views will show differences compared to the previous content of the PR. The pull request contains one new commit since >> the last revision: >> 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 335: > >> 333: cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount); >> 334: cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot); >> 335: cg_infos[CPUSET_IDX]._data_complete = true; > > It's not clear to me why `check_mount_path(&mount_path, tmpmount);` isn't called on this code path. Could you explain? Because `cpusets` might be mounted to `/cpusets` or `/dev/cpusets` and we don't know if this manual mount point comes before or after any of the others which belong to a Cgroup. If the manual mount point comes first in `mountinfo`, `check_mount_path()` would record its mount path as the *correct* one and would bail out later, when checking the other controllers. The solution you propose in [jerboaa at 1323315](https://github.com/jerboaa/jdk/commit/1323315036c0b7625cd1a05690b5944df8457bad) is of course simpler, but it assumes that the Cgroup controllers are always mounted under `/sys/fs/cgroup`. 
I'm not sure about that, and I just wanted to avoid the next issue popping up when the controllers are mounted to a different location. But if you say they are always mounted under `/sys/fs/cgroup`, I'm happy to use the simpler solution.

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From stefank at openjdk.java.net Fri Sep 25 07:54:54 2020
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Fri, 25 Sep 2020 07:54:54 GMT
Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace
In-Reply-To:
References:
Message-ID:

On Thu, 24 Sep 2020 12:16:55 GMT, Thomas Stuefe wrote:

> Hi all,
>
> this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the
> delay, had vacation then the entrance of Skara delayed things a bit.
> For the delta diff please see [3].
>
> This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all
> feedback individually in this PR body, but I incorporated almost all into the new revision.
> What changed since the last version:
>
> - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the
> group consent.
>
> - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for
> overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth
> class loader which made it rather pointless. Now I run it always.
>
> - I also improved the printout on block corruption, and log block corruption unconditionally before asserting.
>
> - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp)
>
> Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature.
>
> - In ChunkManager::purge() I improved the comments after discussions with Leo.
> > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! 
> > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 47: > 45: #include "runtime/mutexLocker.hpp" > 46: #include "runtime/os.hpp" > 47: Just a drive-by comment. Please don't introduce this new style of adding newlines between our internal include lines. The same comment about the newlines after precompiled.hpp. I know that the Shenandoah code already does that, and I've seen some of the metaspace code do that as well, but please stick to the set standard way of listing our includes. This applies to all files in this patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From simonis at openjdk.java.net Fri Sep 25 08:47:05 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 25 Sep 2020 08:47:05 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v3] In-Reply-To: References: Message-ID: > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be > interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ... 
> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. I'm open to better solutions for detecting cpusets which don't don't belong to a Cgroup. > Thanks, > Volker Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/295/files - new: https://git.openjdk.java.net/jdk/pull/295/files/2949e5fd..7753bc5d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=01-02 Stats: 20 lines in 2 files changed: 9 ins; 8 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/295.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295 PR: https://git.openjdk.java.net/jdk/pull/295 From simonis at openjdk.java.net Fri Sep 25 08:47:06 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 25 Sep 2020 08:47:06 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Thu, 24 Sep 2020 19:16:51 GMT, Bob Vandette wrote: > I was expecting to see some logic in this "else if" section that recorded the first occurance but did the validation on > the second pass (cg_infos[CPUSET_IDX]._mount_path != NULL). When this situation is detected, we accept the mount with > the /sys/fs/cgroup. 
> ```
> } else if (strcmp(token, "cpuset") == 0) {
>   assert(cg_infos[CPUSET_IDX]._mount_path == NULL, "stomping of _mount_path");
>   cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount);
>   cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot);
>   cg_infos[CPUSET_IDX]._data_complete = true;
> ```

You're right, the logic to ignore a `cpusets` entry should be moved into the `else if (strcmp(token, "cpuset") == 0)` branch, and I've done that in the new version of my patch.

However, the problem is that we don't know whether the first or the second occurrence of `cpusets` is *the right one*. I'm also not sure if the Cgroup controllers are ALWAYS mounted under `/sys/fs/cgroup` (see my answer to Severin's comments). If you can confirm that that's really the case, we could go with the simple solution proposed by Severin.

Otherwise, my current solution tries to be conservative and does not assume a predefined mount point for Cgroup controllers. Instead, it records the mount point of the first controller out of `cpu`, `cpuacct` and `memory` and checks that all the other controllers from this set are mounted under the same location. In addition, it only accepts `cpusets` if it is under the same mount point as the other controllers, or under `/sys/fs/cgroup` if no other controller has been seen before.

-------------

PR: https://git.openjdk.java.net/jdk/pull/295

From shade at openjdk.java.net Fri Sep 25 10:12:15 2020
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 25 Sep 2020 10:12:15 GMT
Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 25 Sep 2020 05:20:02 GMT, David Holmes wrote:

> Changes seem fine to me. Good to have barriers handled only in OrderAccess.

Thanks for the review. @bulasevich, do you agree with this change?
------------- PR: https://git.openjdk.java.net/jdk/pull/298 From bulasevich at openjdk.java.net Fri Sep 25 10:12:15 2020 From: bulasevich at openjdk.java.net (Boris Ulasevich) Date: Fri, 25 Sep 2020 10:12:15 GMT Subject: RFR: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence [v3] In-Reply-To: References: Message-ID: <3yjs36Ev_O0EWkKJhJM-JpvUcNx96NHduziUhdWFYVs=.cb618967-b887-446b-be07-26d25e373312@github.com> On Fri, 25 Sep 2020 10:07:22 GMT, Aleksey Shipilev wrote: >> Changes seem fine to me. Good to have barriers handled only in OrderAccess. > >> Changes seem fine to me. Good to have barriers handled only in OrderAccess. > > Thanks for review. @bulasevich, do you agree with this change? Absolutely. The change is good. ------------- PR: https://git.openjdk.java.net/jdk/pull/298 From shade at openjdk.java.net Fri Sep 25 10:16:32 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 25 Sep 2020 10:16:32 GMT Subject: Integrated: 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:41:02 GMT, Aleksey Shipilev wrote: > In `atomic_bsd_zero.hpp` and `atomic_linux_zero.hpp` there are uses of __sync_synchronize(). However, > `orderAccess_*_zero.hpp` calls the kernel helper, because: > /* > * ARM Kernel helper for memory barrier. > * Using __asm __volatile ("":::"memory") does not work reliable on ARM > * and gcc __sync_synchronize(); implementation does not use the kernel > * helper for all gcc versions so it is unreliable to use as well. > */ > > We need to clean this up to use `OrderAccess::fence()` to gain access to the kernel helper. > > This depends on JDK-8253464 being fixed first. > > Attention @bulasevich. > > Testing: > - [x] ARM32 Zero jcstress > - [x] Mac OS x86_64 Zero jcstress This pull request has now been integrated. 
Changeset: f62eefc0 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/f62eefc0 Stats: 10 lines in 2 files changed: 4 ins; 0 del; 6 mod 8253469: ARM32 Zero: replace usages of __sync_synchronize() with OrderAccess::fence Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/298 From stuefe at openjdk.java.net Fri Sep 25 11:00:20 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 11:00:20 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. > What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. 
> > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! 
> > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Remove empty lines from include sections ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/b5eaf32c..9f68bab7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=00-01 Stats: 66 lines in 39 files changed: 0 ins; 66 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Fri Sep 25 11:00:21 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 11:00:21 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 07:51:51 GMT, Stefan Karlsson wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove empty lines from include sections > > src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 47: > >> 45: #include "runtime/mutexLocker.hpp" >> 46: #include "runtime/os.hpp" >> 47: > > Just a drive-by comment. Please don't introduce this new style of adding newlines between our internal include lines. > The same comment about the newlines after precompiled.hpp. I know that the Shenandoah code already does that, and I've > seen some of the metaspace code do that as well, but please stick to the set standard way of listing our includes. This > applies to all files in this patch. 
Sure, no problem. I changed it. ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From zgu at openjdk.java.net Fri Sep 25 12:10:16 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 12:10:16 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com> Message-ID: On Fri, 25 Sep 2020 04:29:05 GMT, Thomas Stuefe wrote: >> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report >> thread's state after thread terminated. >> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid >> thread stack, etc. >> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated >> thread from live thread. >> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). >> >> Note: Java thread does not have such issue, its thread object is deleted before thread terminates. > > Hi Zhengyu, > > Could this expose us to a race? > > What I mean is: on Linux and BSD I see this code in os::create_thread() after thread creation and after the initial > handshake with the parent thread: >> A pthread_create ... >> B initial handshake... >> >> C // Aborted due to thread limit being reached >> if (state == ZOMBIE) { >> thread->set_osthread(NULL); >> delete osthread; >> return false; >> } > > (Weirdly enough we don't do (C) for Windows and AIX). > > I may be wrong do not find anywhere in the code where we set the ZOMBIE state today. So (C) may be just dead code now. 
> > With your patch, we now set the state to ZOMBIE in post_run. If child finishes and sets ZOMBIE state before parent > reaches (C) os::create_thread would return not false even though the child thread started correctly and ran through? > _Mailing list message from [Thomas Stüfe](mailto:thomas.stuefe at gmail.com) on > [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > Hi David, > > On Fri, Sep 25, 2020 at 7:06 AM David Holmes > wrote: > > > Hi Thomas, > > On 25/09/2020 2:31 pm, Thomas Stuefe wrote: > > > On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote: > > > > For some non-JavaThread, their object instances can outlast threads' > > > > lifespan. For example, we still can query/report > > > > thread's state after thread terminated. > > > > But the query/report currently returns wrong state. E.g. a terminated > > > > thread appears to be alive and seemly has valid > > > > thread stack, etc. > > > > This patch sets non-JavaThread's state to ZOMBIE just before it > > > > terminates, so that we can distinguish terminated > > > > thread from live thread. > > > > Also, thread should not report its SMR info, if it has terminated or it > > > > never started (thread->osthread() == NULL). > > > > Note: Java thread does not have such issue, its thread object is > > > > deleted before thread terminates. > > > > > > > > > Hi Zhengyu, > > > Could this expose us to a race? > > > What I mean is: on Linux and BSD I see this code in os::create_thread() > > > after thread creation and after the initial > > > handshake with the parent thread: > > > > A pthread_create ... > > > > B initial handshake... > > > > C // Aborted due to thread limit being reached > > > > if (state == ZOMBIE) { > > > > thread->set_osthread(NULL); > > > > delete osthread; > > > > return false; > > > > } > > > > > > > > > (Weirdly enough we don't do (C) for Windows and AIX). > > > I may be wrong do not find anywhere in the code where we set the ZOMBIE > > > state today.
So (C) may be just dead code now. > > > > > > Funnily enough it seems to be dead code that you created: > > https://bugs.openjdk.java.net/browse/JDK-8078513 > > but mea culpa too as I missed it in the review. :) > > Hah! Thanks for reminding me :) > > Okay, that's still left over from pre-NPTL. > > @zhengyu: Can you remove these two sections too in os_bsd.cpp and > os_linux.cpp or should I do this in a separate patch? > > Cheers Thomas > > > Otherwise, yes this would be a theoretical race. So lets just delete the > > above. > > Cheers, > > David > > > With your patch, we now set the state to ZOMBIE in post_run. If child > > > finishes and sets ZOMBIE state before parent > > > reaches (C) os::create_thread would return not false even though the > > > child thread started correctly and ran through? Hi Thomas and David, I did notice the dead code (no one sets osthread's state to ZOMBIE before this patch) and plan to file a separate CR. I don't quite understand the race condition you described above. My understanding is that, os::create_thread() starts native thread, but it blocks until os::start_thread() to unblock it. Therefore, there is no chance for os::create_thread() to see ZOMBIE state. What do I miss? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From bobv at openjdk.java.net Fri Sep 25 12:22:41 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Fri, 25 Sep 2020 12:22:41 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: <7pvy7kdTmrJTHgM_DJxzdcypl9Fj4QN2Bp-Y9e3sPKI=.915974b5-b810-4379-b78f-308ad32d4b4f@github.com> On Thu, 24 Sep 2020 18:32:38 GMT, Harold Seigel wrote: > Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit > capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related > information.
For example, if memory and swap usage is requested, and swap limit capabilities are not available, then > only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit > capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, > and Mac OS, and running tier3 - tier5 tests on Linux x64. Marked as reviewed by bobv (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From stefank at openjdk.java.net Fri Sep 25 12:49:04 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 25 Sep 2020 12:49:04 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 10:35:20 GMT, Thomas Stuefe wrote: >> src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 47: >> >>> 45: #include "runtime/mutexLocker.hpp" >>> 46: #include "runtime/os.hpp" >>> 47: >> >> Just a drive-by comment. Please don't introduce this new style of adding newlines between our internal include lines. >> The same comment about the newlines after precompiled.hpp. I know that the Shenandoah code already does that, and I've >> seen some of the metaspace code do that as well, but please stick to the set standard way of listing our includes. This >> applies to all files in this patch. > > Sure, no problem. I changed it. Much obliged ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From sgehwolf at openjdk.java.net Fri Sep 25 13:12:44 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 13:12:44 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 18:32:38 GMT, Harold Seigel wrote: > Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit > capabilities.
The fix works by detecting that swap limit capabilities are not available and returning non-swap related > information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then > only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit > capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, > and Mac OS, and running tier3 - tier5 tests on Linux x64. test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 163: > 161: // or the cgroup is not mounted. Memory limited without swap." > 162: // the getTotalSwapSpaceSize does not return the expected result and > 163: // getFreeSwapSpaceSize returns 0 https://bugs.openjdk.java.net/browse/JDK-8244500 indicates a system where we have the kernel warning, but getFreeSwapSpaceSize() returned the system values. *Not* 0 as indicated in this comment. This yields me to believe there are inconsistently behaving systems out there. Could we rephrase this comment? test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167: > 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap); > 166: } catch(RuntimeException ex) { > 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: -?([0-9]+)"); Is the optional `-` intentional? My understanding is that it should never be negative, should it not? 
------------- PR: https://git.openjdk.java.net/jdk/pull/342 From stuefe at openjdk.java.net Fri Sep 25 13:23:29 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 13:23:29 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com> Message-ID: On Fri, 25 Sep 2020 12:07:34 GMT, Zhengyu Gu wrote: > Hi Thomas and David, > > I did notice the dead code (no one sets osthread's state to ZOMBIE before this patch) and plan to file a separate CR. > > I don't quite understand the race condition you described above. > > My understanding is that, os::create_thread() starts native thread, but it blocks until os::start_thread() to unblock > it. Therefore, there is no chance for os::create_thread() to see ZOMBIE state. > What do I miss? > Nothing, I just missed the double handshake. Sorry for the noise. > Thanks. Okay, the fix looks fine to me then. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From zgu at openjdk.java.net Fri Sep 25 13:44:30 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 13:44:30 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <5Tu_a_8mMPbGL5KyoZ4FzDhTswNVQFcg4aKuhMYd01k=.ec65dd18-e3fc-4c14-b2f3-40fa38df6af6@github.com> Message-ID: On Fri, 25 Sep 2020 13:21:07 GMT, Thomas Stuefe wrote: > > Hi Thomas and David, > > I did notice the dead code (no one sets osthread's state to ZOMBIE before this patch) and plan to file a separate CR. > > I don't quite understand the race condition you described above. 
> > My understanding is that, os::create_thread() starts native thread, but it blocks until os::start_thread() to unblock > > it. Therefore, there is no chance for os::create_thread() to see ZOMBIE state. What do I miss? > > Nothing, I just missed the double handshake. Sorry for the noise. > > > Thanks. > > Okay, the fix looks fine to me then. Thanks, Thomas. I filed JDK-8253647 to remove dead code. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From hseigel at openjdk.java.net Fri Sep 25 14:17:26 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 25 Sep 2020 14:17:26 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 13:06:15 GMT, Severin Gehwolf wrote: >> Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit >> capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related >> information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then >> only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit >> capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, >> and Mac OS, and running tier3 - tier5 tests on Linux x64. > > test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167: > >> 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap); >> 166: } catch(RuntimeException ex) { >> 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: -?([0-9]+)"); > > Is the optional `-` intentional? My understanding is that it should never be negative, should it not? The optional `-` is intentional, in case UNLIMITED (-1) is returned.
> test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 163: > >> 161: // or the cgroup is not mounted. Memory limited without swap." >> 162: // the getTotalSwapSpaceSize does not return the expected result and >> 163: // getFreeSwapSpaceSize returns 0 > > https://bugs.openjdk.java.net/browse/JDK-8244500 indicates a system where we have the kernel warning, but > getFreeSwapSpaceSize() returned the system values. *Not* 0 as indicated in this comment. This yields me to believe > there are inconsistently behaving systems out there. Could we rephrase this comment? Thanks for reviewing this! Does this change to the comment look better? ` // in case of warnings like : "Your kernel does not support swap limit capabilities // or the cgroup is not mounted. Memory limited without swap." // the getTotalSwapSpaceSize and getFreeSwapSpaceSize return the system // values as the container setup isn't supported in that case. ` ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From dholmes at openjdk.java.net Fri Sep 25 14:26:51 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Sep 2020 14:26:51 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote: > For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report > thread's state after thread terminated. > But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid > thread stack, etc. 
> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated > thread from live thread. > Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). > > Note: Java thread does not have such issue, its thread object is deleted before thread terminates. src/hotspot/share/runtime/thread.cpp line 919: > 917: osthread()->print_on(st); > 918: > 919: if (osthread()->get_state() != ZOMBIE) { I'm not sure print_on(), as opposed to print_on_error() can ever be called with a ZOMBIE thread. I didn't expect any change in this method. src/hotspot/share/runtime/thread.cpp line 955: > 953: } > 954: } else { > 955: st->print(" Aborted"); Not sure this is reachable and if it is then I'm not sure what state the thread is actually in. If a Thread never gets an osThread() it isn't started so shouldn't be locatable by any means. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From bobv at openjdk.java.net Fri Sep 25 14:28:14 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Fri, 25 Sep 2020 14:28:14 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Fri, 25 Sep 2020 08:42:04 GMT, Volker Simonis wrote: >> I was expecting to see some logic in this "else if" section that recorded the first occurance but did the validation on >> the second pass (cg_infos[CPUSET_IDX]._mount_path != NULL). When this situation is detected, we accept the mount with >> the /sys/fs/cgroup. 
>> } else if (strcmp(token, "cpuset") == 0) { >> assert(cg_infos[CPUSET_IDX]._mount_path == NULL, "stomping of _mount_path"); >> cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount); >> cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot); >> cg_infos[CPUSET_IDX]._data_complete = true; > >> I was expecting to see some logic in this "else if" section that recorded the first occurance but did the validation on >> the second pass (cg_infos[CPUSET_IDX]._mount_path != NULL). When this situation is detected, we accept the mount with >> the /sys/fs/cgroup. ``` >> } else if (strcmp(token, "cpuset") == 0) { >> assert(cg_infos[CPUSET_IDX]._mount_path == NULL, "stomping of _mount_path"); >> cg_infos[CPUSET_IDX]._mount_path = os::strdup(tmpmount); >> cg_infos[CPUSET_IDX]._root_mount_path = os::strdup(tmproot); >> cg_infos[CPUSET_IDX]._data_complete = true; >> ``` > > You're right, the logic to ignore a `cpusets` entry should be moved into the `else if (strcmp(token, "cpuset") == 0)` > and I've done that in the new version of my patch. > However, the problem is that we don't know if the first or the second occurrence of `cpusets` is *the right one*. I'm > also not sure if the Cgroup controllers are ALWAYS mounted under `/sys/fs/cgroup` (see my answer to Severin's > comments). If you can confirm that that's really the case, we could go with the simple solution proposed by Severin. > Otherwise, my current solution tries to be conservative and does not assume a predefined mount point for Cgroup > controllers. Instead it records the mount point of the first controller out of `cpu`, `cpuacct` and `memory` and checks > that all the other controllers from this set are mounted under the same location. In addition, it only accepts > `cpusets` if it is under the same mount point like the other controllers or under `/sys/fs/cgroup` if no other > controller has been seen before. In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it.
When the second one comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second non-manual one, I think we still want to record the manual mount point since there could be a cpuset limit setup which we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be subjected to a failure to mount if there are two cpuset mount entries AND neither is mounted on /sys/fs/cgroup/cpuset. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From sgehwolf at openjdk.java.net Fri Sep 25 14:34:59 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 14:34:59 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 14:12:02 GMT, Harold Seigel wrote: >> test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 163: >> >>> 161: // or the cgroup is not mounted. Memory limited without swap." >>> 162: // the getTotalSwapSpaceSize does not return the expected result and >>> 163: // getFreeSwapSpaceSize returns 0 >> >> https://bugs.openjdk.java.net/browse/JDK-8244500 indicates a system where we have the kernel warning, but >> getFreeSwapSpaceSize() returned the system values. *Not* 0 as indicated in this comment. This yields me to believe >> there are inconsistently behaving systems out there. Could we rephrase this comment? > > Thanks for reviewing this! > Does this change to the comment look better? > > ` // in case of warnings like : "Your kernel does not support swap limit capabilities > // or the cgroup is not mounted. Memory limited without swap." > // the getTotalSwapSpaceSize and getFreeSwapSpaceSize return the system > // values as the container setup isn't supported in that case. > ` This looks fine. Thanks!
>> test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167: >> >>> 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap); >>> 166: } catch(RuntimeException ex) { >>> 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: -?([0-9]+)"); >> >> Is the optional `-` intentional? My understanding is that it should never be negative, should it not? > > The optional `-` is intentional, in case UNLIMITED (-1) is returned. Hmm, but the OperatingSystemMXBean impl falls back to returning the system (host) values if the container limits are unlimited. So the internal Metrics value of -1 for UNLIMITED should never be seen. Has this been actually seen in some tests somewhere? ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From sgehwolf at openjdk.java.net Fri Sep 25 14:38:47 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 14:38:47 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 14:31:56 GMT, Severin Gehwolf wrote: >> The optional `-` is intentional, in case UNLIMITED (-1) is returned. > > Hmm, but the OperatingSystemMXBean impl falls back to returning the system (host) values if the container limits are > unlimited. So the internal Metrics value of -1 for UNLIMITED should never be seen. Has this been actually seen in some > tests somewhere?
https://github.com/openjdk/jdk/blob/a75edc29c6ce41116cc99530aa1710efb62c6d5a/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java#L63 ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From sgehwolf at openjdk.java.net Fri Sep 25 14:40:27 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 14:40:27 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Fri, 25 Sep 2020 07:46:05 GMT, Volker Simonis wrote: > If the manual mount point comes first in mountinfo, check_mount_path() would record its mount path as the correct one > and would bail out later, when checking the other controllers. Would it? In that case we'd have `mount_path == NULL` and then the path gets checked against `/sys/fs/cgroup`, which it isn't and gets skipped. If what you said was true, wouldn't [this patch](https://github.com/jerboaa/jdk/commit/97bb674c6fa9031d5ca34d3de74bba16c5afdb46) trigger the assertion in the else clause in `check_mount_path()` when running the new regression test, which has data that you describe? ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From sgehwolf at openjdk.java.net Fri Sep 25 14:45:09 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 14:45:09 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: On Fri, 25 Sep 2020 14:24:22 GMT, Bob Vandette wrote: > In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second one > comes around, replace the first one if it's /sys/fs/cgroup. 
In the very unlikely event that there isn't a second > non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which > we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not > better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be > subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset. Like this perhaps? https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066 I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be worth considering. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From bobv at openjdk.java.net Fri Sep 25 14:48:01 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Fri, 25 Sep 2020 14:48:01 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> Message-ID: <18mmirKkzSCZx-CPAZFAMDDkIIh0oVPyqwVklq2QuMc=.64e46097-ced2-4207-b164-f006ac5c0d42@github.com> On Fri, 25 Sep 2020 14:42:06 GMT, Severin Gehwolf wrote: >> In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second >> one comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second >> non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which >> we should respect. 
As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not >> better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only >> be subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on >> /sys/fs/cgroup/cpuset. > >> In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second one >> comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second >> non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which >> we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not >> better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be >> subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset. > > Like this perhaps? > https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066 > > I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a > sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be > worth considering. > > In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second one > > comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second > > non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which > > we should respect. 
As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not > > better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be > > subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset. > > Like this perhaps? > [jerboaa at 45608cf](https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066) > > I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a > sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be > worth considering. Yes, that's exactly what I was thinking. In the manual case from your example, we'd record "/" as the mount point if it showed up first and then overwrite it if /sys/fs/cgroup came along. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From zgu at openjdk.java.net Fri Sep 25 14:58:30 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 14:58:30 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> Message-ID: On Fri, 25 Sep 2020 14:23:44 GMT, David Holmes wrote: >> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report >> thread's state after thread terminated. >> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid >> thread stack, etc. 
>> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated >> thread from live thread. >> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). >> >> Note: Java thread does not have such issue, its thread object is deleted before thread terminates. > > src/hotspot/share/runtime/thread.cpp line 955: > >> 953: } >> 954: } else { >> 955: st->print(" Aborted"); > > Not sure this is reachable and if it is then I'm not sure what state the thread is actually in. If a Thread never gets > an osThread() it isn't started so shouldn't be locatable by any means. so, you prefer "ShouldNotReachHere()" ? ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From zgu at openjdk.java.net Fri Sep 25 15:06:03 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 15:06:03 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> Message-ID: On Fri, 25 Sep 2020 14:20:30 GMT, David Holmes wrote: >> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report >> thread's state after thread terminated. >> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid >> thread stack, etc. >> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated >> thread from live thread. >> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). 
>>
>> Note: Java thread does not have such issue, its thread object is deleted before thread terminates.
>
> src/hotspot/share/runtime/thread.cpp line 919:
>
>> 917: osthread()->print_on(st);
>> 918:
>> 919: if (osthread()->get_state() != ZOMBIE) {
>
> I'm not sure print_on(), as opposed to print_on_error() can ever be called with a ZOMBIE thread. I didn't expect any
> change in this method.

For a thread such as G1ConcurrentMarkThread, nothing prevents calling _cm_thread->print_on(tty) after it has terminated, although I cannot find such a case right now. Would you prefer an assertion instead?

-------------

PR: https://git.openjdk.java.net/jdk/pull/341

From hseigel at openjdk.java.net Fri Sep 25 15:13:29 2020
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 25 Sep 2020 15:13:29 GMT
Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … [v2]
In-Reply-To: References: Message-ID:

> Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit
> capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related
> information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then
> only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit
> capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64,
> and Mac OS, and running tier3 - tier5 tests on Linux x64.
Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/342/files - new: https://git.openjdk.java.net/jdk/pull/342/files/27462257..bcddedd0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=342&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=342&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/342.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/342/head:pull/342 PR: https://git.openjdk.java.net/jdk/pull/342 From hseigel at openjdk.java.net Fri Sep 25 15:13:36 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 25 Sep 2020 15:13:36 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 14:35:46 GMT, Severin Gehwolf wrote: >> Hmm, but the OperatingSystemMXBean impl falls back to returning the system (host) values if the container limits are >> unlimited. So the internal Metrics value of -1 for UNLIMITED should never be seen. Has this been actually seen in some >> tests somewhere? > > https://github.com/openjdk/jdk/blob/a75edc29c6ce41116cc99530aa1710efb62c6d5a/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java#L63 The -1 has not been seen. I removed the option '-' from the pattern. Please see the changes in the latest commit. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/342 From bobv at openjdk.java.net Fri Sep 25 15:24:55 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Fri, 25 Sep 2020 15:24:55 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 15:10:08 GMT, Harold Seigel wrote: >> https://github.com/openjdk/jdk/blob/a75edc29c6ce41116cc99530aa1710efb62c6d5a/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java#L63 > > The -1 has not been seen. I removed the option '-' from the pattern. Please see the changes in the latest commit. > Thanks! I thought we were seeing -1 but perhaps this was a previous result before we decided to change Swap+Memory to report Memory instead of failure. I'm going to rerun the tests with and without swap enabled to verify. ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From plevart at openjdk.java.net Fri Sep 25 15:50:48 2020 From: plevart at openjdk.java.net (Peter Levart) Date: Fri, 25 Sep 2020 15:50:48 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> <0H-sMIm0mwGW2f2OxwAwMGBtxDf2BUf7ds3tGEgYXrc=.849aab2b-4cf3-49f7-8e49-ad1df75cb0bb@github.com> Message-ID: On Fri, 25 Sep 2020 02:38:01 GMT, Vicente Romero wrote: >> I have modified the `@since`: 14 -> 16 > > [CSR: Record Classes](https://bugs.openjdk.java.net/browse/JDK-8253605) Hi @vicente-romero-oracle , note that besides tests, there is also a JMH benchmark that measures the performance of records deserialization which forced us to modify the build procedure for all benchmarks to include --enable-preview option in JDK 15 and backports (see https://bugs.openjdk.java.net/browse/JDK-8248135). 
If you undo this change in JDK 16, the problems described in https://bugs.openjdk.java.net/browse/JDK-8250669 and https://bugs.openjdk.java.net/browse/JDK-8248429 will also disappear. After that, it may also be possible to undo the same change in JDK 15 and the backports, together with removing the benchmark, to resolve the issues in older releases, since most development will probably happen in JDK 16 by then and performance testing will mostly be needed there. We still have to figure out how to enable benchmarks for preview features, as there will surely be a need for that in the future.

-------------

PR: https://git.openjdk.java.net/jdk/pull/290

From sgehwolf at openjdk.java.net Fri Sep 25 15:53:03 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Fri, 25 Sep 2020 15:53:03 GMT
Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … [v2]
In-Reply-To: References: Message-ID:

On Fri, 25 Sep 2020 15:13:29 GMT, Harold Seigel wrote:

>> Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit
>> capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related
>> information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then
>> only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit
>> capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64,
>> and Mac OS, and running tier3 - tier5 tests on Linux x64.
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>
> 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities

I'll also run this through testing and will report back.
test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167: > 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap); > 166: } catch(RuntimeException ex) { > 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: [0-9]+"); This should probably be: out.shouldMatch("OperatingSystemMXBean\.getTotalSwapSpaceSize: [0-9]+"); ------------- Changes requested by sgehwolf (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/342 From sgehwolf at openjdk.java.net Fri Sep 25 15:53:04 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 15:53:04 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 15:47:46 GMT, Severin Gehwolf wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities > > test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167: > >> 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap); >> 166: } catch(RuntimeException ex) { >> 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: [0-9]+"); > > This should probably be: > out.shouldMatch("OperatingSystemMXBean\.getTotalSwapSpaceSize: [0-9]+"); Heh, it was pre-existing :) ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From bobv at openjdk.java.net Fri Sep 25 15:59:55 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Fri, 25 Sep 2020 15:59:55 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 15:22:05 GMT, Bob Vandette wrote: >> The -1 has not been seen. I removed the option '-' from the pattern. Please see the changes in the latest commit. >> Thanks! 
> > I thought we were seeing -1 but perhaps this was a previous result before we decided to change Swap+Memory to report > Memory instead of failure. I'm going to rerun the tests with and without swap enabled to verify. All container tests continue to pass with and without swap enabled after removing the "-". ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From zgu at openjdk.java.net Fri Sep 25 17:13:34 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 17:13:34 GMT Subject: RFR: 8253647: Remove dead code in os::create_thread() on Linux/BSD Message-ID: Please review this small patch to remove dead code that is left behind by JDK-8078513. Test: - [x] tier1 on Linux 86_64 ------------- Commit messages: - 8253647: Remove dead code in os::create_thread() on Linux/BSD Changes: https://git.openjdk.java.net/jdk/pull/361/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=361&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253647 Stats: 14 lines in 2 files changed: 0 ins; 14 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/361.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/361/head:pull/361 PR: https://git.openjdk.java.net/jdk/pull/361 From sgehwolf at openjdk.java.net Fri Sep 25 17:15:06 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Fri, 25 Sep 2020 17:15:06 GMT Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 15:13:29 GMT, Harold Seigel wrote: >> Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit >> capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related >> information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then >> only memory usage is returned. 
The fix was tested by running container tests on systems with and without swap limit
>> capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64,
>> and Mac OS, and running tier3 - tier5 tests on Linux x64.
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>
> 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 swap limit capabilities

Good to go with or without the tiny fixup.

-------------

Marked as reviewed by sgehwolf (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/342

From sgehwolf at openjdk.java.net Fri Sep 25 17:15:06 2020
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Fri, 25 Sep 2020 17:15:06 GMT
Subject: RFR: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 … [v2]
In-Reply-To: References: Message-ID:

On Fri, 25 Sep 2020 15:49:45 GMT, Severin Gehwolf wrote:

>> test/hotspot/jtreg/containers/docker/TestMemoryAwareness.java line 167:
>>
>>> 165: out.shouldContain("OperatingSystemMXBean.getTotalSwapSpaceSize: " + expectedSwap);
>>> 166: } catch(RuntimeException ex) {
>>> 167: out.shouldMatch("OperatingSystemMXBean.getTotalSwapSpaceSize: [0-9]+");
>>
>> This should probably be:
>> out.shouldMatch("OperatingSystemMXBean\.getTotalSwapSpaceSize: [0-9]+");
>
> Heh, it was pre-existing :)

This isn't critical for this patch. As a pre-existing issue, I won't insist on it. Either way is fine for me.
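
For readers following the regex remark above, here is a minimal sketch of the difference between the unescaped and the escaped dot. The class and method names are made up for illustration, and it assumes the test helper applies the pattern as an unanchored search. Note that inside a Java string literal the backslash itself has to be doubled, so the escaped form would be written `\\.` in source.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the JDK test suite.
public class DotEscapeDemo {
    // Unescaped '.' is a regex wildcard and matches any single character.
    static final Pattern LOOSE =
        Pattern.compile("OperatingSystemMXBean.getTotalSwapSpaceSize: [0-9]+");

    // Escaped dot matches only a literal '.'; the backslash is doubled
    // because it sits inside a Java string literal.
    static final Pattern STRICT =
        Pattern.compile("OperatingSystemMXBean\\.getTotalSwapSpaceSize: [0-9]+");

    // Unanchored search (assumption: this mirrors how the test helper
    // applies the pattern to the captured output).
    static boolean found(Pattern p, String output) {
        return p.matcher(output).find();
    }

    public static void main(String[] args) {
        String real  = "OperatingSystemMXBean.getTotalSwapSpaceSize: 1073741824";
        String bogus = "OperatingSystemMXBean_getTotalSwapSpaceSize: 1073741824";
        System.out.println(found(LOOSE, real));   // true
        System.out.println(found(STRICT, real));  // true
        System.out.println(found(LOOSE, bogus));  // true: wildcard accepts '_'
        System.out.println(found(STRICT, bogus)); // false
    }
}
```

In practice the loose pattern is unlikely to produce a false positive in this test, which is consistent with treating the fixup as optional.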
------------- PR: https://git.openjdk.java.net/jdk/pull/342 From hseigel at openjdk.java.net Fri Sep 25 17:19:17 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 25 Sep 2020 17:19:17 GMT Subject: Integrated: 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 =?UTF-8?B?4oCm?= In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 18:32:38 GMT, Harold Seigel wrote: > Please review this change to fix memory docker tests failures on some Linux kernels w/o cgroupv1 swap limit > capabilities. The fix works by detecting that swap limit capabilities are not available and returning non-swap related > information. For example, if memory and swap usage is requested, and swap limit capabilities are not available, then > only memory usage is returned. The fix was tested by running container tests on systems with and without swap limit > capabilities. Additionally, the changes were regression tested by running tier1 and tier2 tests on Windows, Linux x64, > and Mac OS, and running tier3 - tier5 tests on Linux x64. This pull request has now been integrated. Changeset: 01875677 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/01875677 Stats: 115 lines in 8 files changed: 56 ins; 11 del; 48 mod 8250984: Memory Docker tests fail on some Linux kernels w/o cgroupv1 ? Reviewed-by: bobv, sgehwolf ------------- PR: https://git.openjdk.java.net/jdk/pull/342 From stuefe at openjdk.java.net Fri Sep 25 17:47:07 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 17:47:07 GMT Subject: RFR: 8253647: Remove dead code in os::create_thread() on Linux/BSD In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 17:06:58 GMT, Zhengyu Gu wrote: > Please review this small patch to remove dead code that is left behind by JDK-8078513. > > Test: > - [x] tier1 on Linux 86_64 All good. Thanks for cleaning this. ------------- Marked as reviewed by stuefe (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/361 From stuefe at openjdk.java.net Fri Sep 25 18:13:57 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 18:13:57 GMT Subject: RFR: 8253647: Remove dead code in os::create_thread() on Linux/BSD In-Reply-To: References: Message-ID: <9cUv7x7Kimt7O7j_uuYhKZhN3Naje8kb4k4BlEPydrc=.8c0263f3-92a4-48c9-8aa0-c791670c992b@github.com> On Fri, 25 Sep 2020 17:43:53 GMT, Thomas Stuefe wrote: >> Please review this small patch to remove dead code that is left behind by JDK-8078513. >> >> Test: >> - [x] tier1 on Linux 86_64 > > All good. Thanks for cleaning this. (I also think this is trivial) ------------- PR: https://git.openjdk.java.net/jdk/pull/361 From stuefe at openjdk.java.net Fri Sep 25 18:14:45 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 25 Sep 2020 18:14:45 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: References: <3zPvyYsEOg4LMcQ2kKf95hOgPH8liURcvf8Yk-W3dys=.b50fb1d8-aa64-4d2f-bf67-413eaad5cc63@github.com> Message-ID: On Fri, 25 Sep 2020 03:48:08 GMT, Thomas Stuefe wrote: >> Thanks for identifying the bug and fixing it. The patch looks good. > >> Thanks for identifying the bug and fixing it. The patch looks good. > > Thank you Calvin! Ping... may I have a second review please? ------------- PR: https://git.openjdk.java.net/jdk/pull/332 From ccheung at openjdk.java.net Fri Sep 25 19:06:12 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Fri, 25 Sep 2020 19:06:12 GMT Subject: RFR: 8247666: Support Lambda proxy classes in static CDS archive Message-ID: Following up on archiving lambda proxy classes in dynamic CDS archive ([JDK-8198698](https://bugs.openjdk.java.net/browse/JDK-8198698)), this RFE adds the functionality of archiving of lambda proxy classes in static CDS archive. 
When the -XX:DumpLoadedClassList is enabled, the constant pool index related to LambdaMetafactory that are resolved during application execution will be included in the classlist. The entry for a lambda proxy class in a class list will be of the following format: `@lambda-proxy: ` e.g. `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 233` `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 355` When dumping a CDS archive using the -Xshare:dump and -XX:ExtraSharedClassListFile options, when the above `@lambda-proxy` entry is encountered while parsing the classlist, we will resolve the corresponding constant pool indices (233 and 355 in the above example). As a result, lambda proxy classes will be generated for the constant pool entries, and will be cached using a similar mechanism to JDK-8198698. During dumping, there is check on the cp index and on the created BootstrapInfo using the cp index. VM will exit with an error message if the check has failed. During runtime when looking up a lambda proxy class, the lookup will be perform on the static CDS archive and if not found, then lookup from the dynamic archive if one is specified. (Only name change (IsDynamicDumpingEnabled -> IsCDSDumpingEnabled) is involved in the core-libs code.) Testing: tiers 1,2,3,4. 
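
To make the entry shape concrete, the two `@lambda-proxy` example lines above can be read as a class name followed by a constant pool index. The following is only an illustrative sketch in Java: the real classlist parser lives in the VM, and the class, field, and method names here are hypothetical.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not the HotSpot classlist parser.
public class LambdaProxyEntryDemo {
    static final String TAG = "@lambda-proxy:";

    // One parsed entry: class name plus constant pool index.
    static final class Entry {
        final String className;
        final int cpIndex;
        Entry(String className, int cpIndex) {
            this.className = className;
            this.cpIndex = cpIndex;
        }
    }

    // Returns an Entry for a well-formed line, or empty otherwise.
    static Optional<Entry> parse(String line) {
        String t = line.trim();
        if (!t.startsWith(TAG)) {
            return Optional.empty();
        }
        String[] parts = t.substring(TAG.length()).trim().split("\\s+");
        if (parts.length != 2) {
            return Optional.empty();
        }
        try {
            int idx = Integer.parseInt(parts[1]);
            // Assumption: a usable constant pool index is positive.
            return idx > 0 ? Optional.of(new Entry(parts[0], idx)) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<Entry> e =
            parse("@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 233");
        System.out.println(e.get().className + " -> " + e.get().cpIndex);
    }
}
```

A dump-time check along these lines would reject malformed entries early; the actual checks on the cp index and BootstrapInfo are the ones described in the text above.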
Performance results (javac on HelloWorld on linux-x64):

Results of "perf stat -r 40 bin/javac -J-Xshare:on -J-XX:SharedArchiveFile=javac2.jsa Bench_HelloWorld.java"

 1: 2228016795 2067752708 (-160264087) -----  377.760 349.110 (-28.650) -----
 2: 2223051476 2063016483 (-160034993) -----  374.580 350.620 (-23.960) ----
 3: 2225908334 2067673847 (-158234487) -----  375.220 350.990 (-24.230) ----
 4: 2225835999 2064596883 (-161239116) -----  374.670 349.840 (-24.830) ----
 5: 2226005510 2061694332 (-164311178) -----  373.512 351.120 (-22.392) ----
 6: 2225574949 2062657482 (-162917467) -----  374.710 348.380 (-26.330) -----
 7: 2224702424 2064634122 (-160068302) -----  373.670 349.510 (-24.160) ----
 8: 2226662277 2066301134 (-160361143) -----  375.350 349.790 (-25.560) ----
 9: 2226761470 2063162795 (-163598675) -----  374.260 351.290 (-22.970) ----
10: 2230149089 2066203307 (-163945782) -----  374.760 350.620 (-24.140) ----
============================================================
    2226266109 2064768307 (-161497801) -----  374.848 350.126 (-24.722) ----

instr delta = -161497801 -7.2542%
time delta = -24.722 ms -6.5951%

-------------

Commit messages:
 - fix extraneous whitespace
 - 8247666 (initial commit)

Changes: https://git.openjdk.java.net/jdk/pull/364/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=364&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8247666
Stats: 1682 lines in 30 files changed: 1613 ins; 12 del; 57 mod
Patch: https://git.openjdk.java.net/jdk/pull/364.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/364/head:pull/364

PR: https://git.openjdk.java.net/jdk/pull/364

From dcubed at openjdk.java.net Fri Sep 25 19:30:03 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 25 Sep 2020 19:30:03 GMT
Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64
In-Reply-To: References: Message-ID:

On Fri, 25 Sep 2020 17:15:01 GMT, Daniel D.
Daugherty wrote: > Reduce noise in CI Tier2 by ProblemListing this test. My Tier2 test job ran sun/security/ec/TestEC.java on all regular Oracle platforms except for linux-aarch64. I also updated the generic-ARCH header comment to include 'aarch64' in the list so folks don't have to wonder whether to use 'aarch64' or 'arm' or 'arm64'. ------------- PR: https://git.openjdk.java.net/jdk/pull/362 From dcubed at openjdk.java.net Fri Sep 25 19:30:03 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 25 Sep 2020 19:30:03 GMT Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 Message-ID: Reduce noise in CI Tier2 by ProblemListing this test. ------------- Commit messages: - Add 'aarch64' to the generic-ARCH comment so folks know what to use. - Use the correct bug ID: 8253637. - 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 Changes: https://git.openjdk.java.net/jdk/pull/362/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=362&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253659 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/362.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/362/head:pull/362 PR: https://git.openjdk.java.net/jdk/pull/362 From iklam at openjdk.java.net Fri Sep 25 19:35:56 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 25 Sep 2020 19:35:56 GMT Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 In-Reply-To: References: Message-ID: <6NJpEia398_KuMI-K8RAhX0fbhw7jvPtqne2VVerEvY=.5d728aae-b091-4272-89b1-d46c75fb97e6@github.com> On Fri, 25 Sep 2020 17:15:01 GMT, Daniel D. Daugherty wrote: > Reduce noise in CI Tier2 by ProblemListing this test. Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/362 From iklam at openjdk.java.net Fri Sep 25 19:35:56 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 25 Sep 2020 19:35:56 GMT Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 In-Reply-To: <6NJpEia398_KuMI-K8RAhX0fbhw7jvPtqne2VVerEvY=.5d728aae-b091-4272-89b1-d46c75fb97e6@github.com> References: <6NJpEia398_KuMI-K8RAhX0fbhw7jvPtqne2VVerEvY=.5d728aae-b091-4272-89b1-d46c75fb97e6@github.com> Message-ID: On Fri, 25 Sep 2020 19:31:36 GMT, Ioi Lam wrote: >> Reduce noise in CI Tier2 by ProblemListing this test. > > Marked as reviewed by iklam (Reviewer). Looks like a trivial change to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/362 From dcubed at openjdk.java.net Fri Sep 25 19:42:34 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 25 Sep 2020 19:42:34 GMT Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 In-Reply-To: References: <6NJpEia398_KuMI-K8RAhX0fbhw7jvPtqne2VVerEvY=.5d728aae-b091-4272-89b1-d46c75fb97e6@github.com> Message-ID: On Fri, 25 Sep 2020 19:32:57 GMT, Ioi Lam wrote: >> Marked as reviewed by iklam (Reviewer). > > Looks like a trivial change to me. @iklam - thanks for the fast review. ------------- PR: https://git.openjdk.java.net/jdk/pull/362 From dcubed at openjdk.java.net Fri Sep 25 19:42:34 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 25 Sep 2020 19:42:34 GMT Subject: Integrated: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 17:15:01 GMT, Daniel D. Daugherty wrote: > Reduce noise in CI Tier2 by ProblemListing this test. This pull request has now been integrated. Changeset: 9150b902 Author: Daniel D. 
Daugherty URL: https://git.openjdk.java.net/jdk/commit/9150b902 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 Reviewed-by: iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/362 From gziemski at openjdk.java.net Fri Sep 25 19:58:22 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Fri, 25 Sep 2020 19:58:22 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 15:07:43 GMT, Gerard Ziemski wrote: >> I believe the reason for not using POSIX semaphores was that it is not supported on as400 PASE which we don't support >> in OpenJDK. I'm not aware of any problems when using this common POSIX code on AIX 7.2. Is SA supported on AIX? That >> would be new to me. But I'm not an expert for these topics. I hope that Thomas can find some time to take a look. > > Reverted the last commit. Thank you for taking a look and testing - please let me know how it goes. > > I have already tested this change here at Oracle a while ago, but I will do so once again... Mach5 hs-tier1,2,3,4,5,6,7 testing looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From dcubed at openjdk.java.net Fri Sep 25 20:11:48 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 25 Sep 2020 20:11:48 GMT Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64 In-Reply-To: References: <6NJpEia398_KuMI-K8RAhX0fbhw7jvPtqne2VVerEvY=.5d728aae-b091-4272-89b1-d46c75fb97e6@github.com> Message-ID: On Fri, 25 Sep 2020 19:37:25 GMT, Daniel D. Daugherty wrote: >> Looks like a trivial change to me. > > @iklam - thanks for the fast review. Tony - sorry about that. The last time I checked your bug (about 2 hours ago) it wasn't "in progress". Please update your fix to remove the test from the ProblemList. 
------------- PR: https://git.openjdk.java.net/jdk/pull/362 From zhuoren.wz at alibaba-inc.com Thu Sep 3 02:08:04 2020 From: zhuoren.wz at alibaba-inc.com (=?UTF-8?B?V2FuZyBaaHVvKFpodW9yZW4p?=) Date: Thu, 03 Sep 2020 02:08:04 -0000 Subject: =?UTF-8?B?UmU6IFsxNl1SRlIoUyk6ODI0OTA5MjpJbnRlcm5hbEVycm9yOiBhIGZhdWx0IG9jY3VycmVk?= =?UTF-8?B?IGluIGEgcmVjZW50IHVuc2FmZSBtZW1vcnkgYWNjZXNzIG9wZXJhdGlvbiBpbiBjb21waWxl?= =?UTF-8?B?ZCBKYXZhIGNvZGU=?= In-Reply-To: <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com>, <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> Message-ID: <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> Hi Patric, The original problem(https://bugs.openjdk.java.net/browse/JDK-8246051) is architecture specific. When running TestUnsafeUnalignedSwap.java on aarch64 platforms, JVM crashes without the fix because aarch64 does not support unaligned compare_and_swap. On X86 platforms the crash cannot be reproduced because X86 support unaligned compare_and_swap. Regards, Zhuoren ------------------------------------------------------------------ From:Patric Hedlin Sent At:2020 Sep. 2 (Wed.) 20:22 To:Sandler ; aarch64-port-dev ; hotspot-runtime-dev Cc:david.holmes ; rahul.v.raghavan Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code Hi Zhuoren, I don't actually know what behaviour to expect from the Unsafe atomics in this test-case but perhaps you could re-cap the original problem (addressed in JDK-8246051) since it seems to raise some questions. Did you have an example (real code) where this behaviour is essential? Best regards, Patric Hedlin (Including hotspot-runtime-dev at openjdk.java.net) On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: Hi, this is a fix for a test case. In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will fail because the catch misses the error due to async exception. 
This patch uses a loop to make sure the error can be caught. Also -Xcomp is added in test. BUG: https://bugs.openjdk.java.net/browse/JDK-8249092 Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/ Regards, Zhuoren From zhuoren.wz at alibaba-inc.com Thu Sep 3 03:49:22 2020 From: zhuoren.wz at alibaba-inc.com (=?UTF-8?B?V2FuZyBaaHVvKFpodW9yZW4p?=) Date: Thu, 03 Sep 2020 03:49:22 -0000 Subject: =?UTF-8?B?UmU6IFsxNl1SRlIoUyk6ODI0OTA5MjpJbnRlcm5hbEVycm9yOiBhIGZhdWx0IG9jY3VycmVk?= =?UTF-8?B?IGluIGEgcmVjZW50IHVuc2FmZSBtZW1vcnkgYWNjZXNzIG9wZXJhdGlvbiBpbiBjb21waWxl?= =?UTF-8?B?ZCBKYXZhIGNvZGU=?= In-Reply-To: References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com>, Message-ID: <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> David, Thank you very much for the explanation. The original crash indeed happened in a third party code using Unsafe to handle unaligned data. So you mean that it is the third party code's bug, we should not fix it in JVM, right? Regards, Zhuoren ------------------------------------------------------------------ From:David Holmes Sent At:2020 Sep. 3 (Thu.) 10:58 To:Sandler ; Patric Hedlin ; aarch64-port-dev ; hotspot-runtime-dev Cc:rahul.v.raghavan Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code Hi, On 3/09/2020 12:07 pm, Wang Zhuo(Zhuoren) wrote: > Hi Patric, > The original problem(https://bugs.openjdk.java.net/browse/JDK-8246051) > is architecture specific. When running TestUnsafeUnalignedSwap.java on > aarch64 platforms, JVM crashes without the fix because aarch64 does not > support unaligned compare_and_swap. On X86 platforms the crash cannot be > reproduced because X86 support unaligned compare_and_swap. 
Patric is asking about the original situation where an unaligned CAS was performed, which motivated the change made by JDK-8246051. The GuardUnsafeAccess mechanism (or more specifically the signal handling tricks underneath it) was not intended for general use, but was specifically created to deal with the case of page faults related to mapped ByteBuffers where we wanted application code using JDK exported APIs to get an InternalError when they did the wrong thing with their mapped files, instead of crashing. No part of the JDK should be calling Unsafe.compareAndSwap* with unaligned data - if it does that is a bug. If third-party code is using Unsafe directly then that is their problem and we do not try to make things easier for them. The use of GuardUnsafeAccess with the CAS primitives on some platforms can result in an infinite loop as the mechanism cannot be applied to arbitrary code sequences. Somewhat ironically Aarch64 is one platform that can suffer from this. Cheers, David ----- > > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:Patric Hedlin > Sent At:2020 Sep. 2 (Wed.) 20:22 > To:Sandler ; aarch64-port-dev > ; hotspot-runtime-dev > > Cc:david.holmes ; rahul.v.raghavan > > Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > recent unsafe memory access operation in compiled Java code > > Hi Zhuoren, > > I don't actually know what behaviour to expect from the Unsafe > atomics in this test-case but perhaps you could re-cap the original > problem (addressed in JDK-8246051) since it seems to raise some > questions. Did you have an example (real code) where this behaviour > is essential? > > Best regards, > Patric Hedlin > > (Including hotspot-runtime-dev at openjdk.java.net) > > > On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: > Hi, this is a fix for a test case. > In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will > fail because the catch misses the error due to async exception. 
> This patch uses a loop to make sure the error can be caught. Also
> -Xcomp is added in test.
> BUG: https://bugs.openjdk.java.net/browse/JDK-8249092
> Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/
>
>
> Regards,
> Zhuoren
>
>

From zhuoren.wz at alibaba-inc.com Thu Sep 3 05:52:49 2020
From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren))
Date: Thu, 03 Sep 2020 05:52:49 -0000
Subject: Re: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
In-Reply-To: References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com>, Message-ID:

OK, thanks David.

> The use of GuardUnsafeAccess with the CAS primitives on some platforms
> can result in an infinite loop as the mechanism cannot be applied to
> arbitrary code sequences.

Do you have a test for this? If the patch for JDK-8246051 does cause such a problem, I am considering reverting it or making some other modification.

Regards,
Zhuoren

------------------------------------------------------------------
From:David Holmes
Sent At:2020 Sep. 3 (Thu.) 12:12
To:Sandler ; Patric Hedlin ; aarch64-port-dev ; hotspot-runtime-dev
Cc:rahul.v.raghavan
Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code

On 3/09/2020 1:49 pm, Wang Zhuo(Zhuoren) wrote:
> David, Thank you very much for the explanation.
> The original crash indeed happened in a third party code using Unsafe to
> handle unaligned data.
> So you mean that it is the third party code's bug, we should not fix it
> in JVM, right?

Right. Unsafe is not a supported API and is by definition unsafe.
If you use it and it crashes then you need to change your code. Cheers, David ----- > Regards, > Zhuoren > > ------------------------------------------------------------------ > From:David Holmes > Sent At:2020 Sep. 3 (Thu.) 10:58 > To:Sandler ; Patric Hedlin > ; aarch64-port-dev > ; hotspot-runtime-dev > > Cc:rahul.v.raghavan > Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > recent unsafe memory access operation in compiled Java code > > Hi, > > On 3/09/2020 12:07 pm, Wang Zhuo(Zhuoren) wrote: > > Hi Patric, > > The original problem(https://bugs.openjdk.java.net/browse/JDK-8246051) > > is architecture specific. When running TestUnsafeUnalignedSwap.java on > > aarch64 platforms, JVM crashes without the fix because aarch64 does not > > support unaligned compare_and_swap. On X86 platforms the crash cannot be > > reproduced because X86 support unaligned compare_and_swap. > > Patric is asking about the original situation where an unaligned CAS was > > performed, which motivated the change made by JDK-8246051. > > The GuardUnsafeAccess mechanism (or more specifically the signal > handling tricks underneath it) was not intended for general use, but was > > specifically created to deal with the case of page faults related to > mapped ByteBuffers where we wanted application code using JDK exported > APIs to get an InternalError when they did the wrong thing with their > mapped files, instead of crashing. No part of the JDK should be calling > Unsafe.compareAndSwap* with unaligned data - if it does that is a bug. > If third-party code is using Unsafe directly then that is their problem > and we do not try to make things easier for them. > > The use of GuardUnsafeAccess with the CAS primitives on some platforms > can result in an infinite loop as the mechanism cannot be applied to > arbitrary code sequences. Somewhat ironically Aarch64 is one platform > that can suffer from this. 
> > Cheers, > David > ----- > > > > > Regards, > > Zhuoren > > > > ------------------------------------------------------------------ > > From:Patric Hedlin > > Sent At:2020 Sep. 2 (Wed.) 20:22 > > To:Sandler ; aarch64-port-dev > > ; hotspot-runtime-dev > > > > Cc:david.holmes ; rahul.v.raghavan > > > > Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a > > recent unsafe memory access operation in compiled Java code > > > > Hi Zhuoren, > > > > I don't actually know what behaviour to expect from the Unsafe > > atomics in this test-case but perhaps you could re-cap the original > > problem (addressed in JDK-8246051) since it seems to raise some > > questions. Did you have an example (real code) where this behaviour > > is essential? > > > > Best regards, > > Patric Hedlin > > > > (Including hotspot-runtime-dev at openjdk.java.net) > > > > > > On 2020-09-01 07:35, Wang Zhuo(Zhuoren) wrote: > > Hi, this is a fix for a test case. > > In -Xcomp mode, compiler/unsafe/TestUnsafeUnalignedSwap.java will > > fail because the catch misses the error due to async exception. > > This patch uses a loop to make sure the error can be caught. Also > > -Xcomp is added in test. 
> > BUG: https://bugs.openjdk.java.net/browse/JDK-8249092 > > Patch: http://cr.openjdk.java.net/~wzhuo/8249092/webrev.00/ > > > > > > Regards, > > Zhuoren > > > > > From zhuoren.wz at alibaba-inc.com Fri Sep 4 02:00:18 2020 From: zhuoren.wz at alibaba-inc.com (=?UTF-8?B?V2FuZyBaaHVvKFpodW9yZW4p?=) Date: Fri, 04 Sep 2020 02:00:18 -0000 Subject: =?UTF-8?B?UmU6IFsxNl1SRlIoUyk6ODI0OTA5MjpJbnRlcm5hbEVycm9yOiBhIGZhdWx0IG9jY3VycmVk?= =?UTF-8?B?IGluIGEgcmVjZW50IHVuc2FmZSBtZW1vcnkgYWNjZXNzIG9wZXJhdGlvbiBpbiBjb21waWxl?= =?UTF-8?B?ZCBKYXZhIGNvZGU=?= In-Reply-To: <5F0E7A5C-6FC8-4DE2-8D2B-D7D4AEBB382A@oracle.com> References: <90fda75f-62f9-4d96-b434-6dc15a5537af.zhuoren.wz@alibaba-inc.com> <82b0f0e3-b782-00d5-a778-264fa1e64eda@oracle.com> <05967ec2-ec8e-48f3-bd03-09d29dbcfbba.zhuoren.wz@alibaba-inc.com> <2b2f0d07-c0b2-4e8b-ad2c-fd0046273851.zhuoren.wz@alibaba-inc.com> <5845a59f-3790-906a-0ea4-fc0e862eb103@redhat.com>, <5F0E7A5C-6FC8-4DE2-8D2B-D7D4AEBB382A@oracle.com> Message-ID: <1367ca1b-262e-4092-9222-0483efd3606d.zhuoren.wz@alibaba-inc.com> OK, thanks all. I am creating new bug and making patch to revert 8246051. Regards, Zhuoren ------------------------------------------------------------------ From:Paul Sandoz Sent At:2020 Sep. 4 (Fri.) 03:22 To:Andrew Haley Cc:David Holmes ; Sandler ; Patric Hedlin ; aarch64-port-dev ; hotspot-runtime-dev ; rahul.v.raghavan Subject:Re: [16]RFR(S):8249092:InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code I also agree, the fix should be backed out. VarHandles can be used as a replacement for Unsafe. If a VarHandle is used to access the contents of a byte[] or ByteBuffer [1] then only the set/get methods are supported for misaligned access, use of all other methods will throw an IllegalStateException. Paul. 
[1] https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/lang/invoke/MethodHandles.html#byteBufferViewVarHandle(java.lang.Class,java.nio.ByteOrder)

On Sep 3, 2020, at 1:41 AM, Andrew Haley wrote:

On 03/09/2020 05:11, David Holmes wrote:

On 3/09/2020 1:49 pm, Wang Zhuo(Zhuoren) wrote:

David, Thank you very much for the explanation.
The original crash indeed happened in a third party code using Unsafe to handle unaligned data.
So you mean that it is the third party code's bug, we should not fix it in JVM, right?

Right. Unsafe is not a supported API and is by definition unsafe. If you use it and it crashes then you need to change your code.

I agree. In hindsight, I should probably not have approved 8246051. I'm happy that it should be backed out.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From zhuoren.wz at alibaba-inc.com Mon Sep 7 11:49:48 2020
From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren))
Date: Mon, 07 Sep 2020 11:49:48 -0000
Subject: [16][RFR][s]:8252835:Revert fix for JDK-8246051
Message-ID: <743d73d5-36cc-4a1e-bc13-96bb27c8ab8e.zhuoren.wz@alibaba-inc.com>

As discussed before, this patch is to revert JDK-8246051 (SIGBUS by unaligned Unsafe compare_and_swap). Please review.
JDK bug: https://bugs.openjdk.java.net/browse/JDK-8252835
Patch: http://cr.openjdk.java.net/~wzhuo/8252835/webrev.00/

Regards,
Zhuoren

From zhuoren.wz at alibaba-inc.com Tue Sep 22 12:07:16 2020
From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren))
Date: Tue, 22 Sep 2020 20:07:16 +0800
Subject: Re: [16][RFR][s]:8252835:Revert fix for JDK-8246051
In-Reply-To: <989bb8a4-f33a-5a10-5b65-d9684e79a2e2@oracle.com>
References: <743d73d5-36cc-4a1e-bc13-96bb27c8ab8e.zhuoren.wz@alibaba-inc.com>, <989bb8a4-f33a-5a10-5b65-d9684e79a2e2@oracle.com>
Message-ID:

PR: https://github.com/openjdk/jdk/pull/297
Please have a look.

Regards,
Zhuoren

------------------------------------------------------------------
From:David Holmes
Sent At:2020 Sep. 7 (Mon.) 21:33
To:Sandler ; Andrew Haley ; Paul Sandoz ; Patric Hedlin ; aarch64-port-dev ; hotspot-runtime-dev at openjdk.java.net
Subject:Re: [16][RFR][s]:8252835:Revert fix for JDK-8246051

Hi Zhuoren,

This needs to be done as a Pull Request (PR) now that we have transitioned to git/GitHub.

Thanks,
David

On 7/09/2020 9:49 pm, Wang Zhuo(Zhuoren) wrote:
> As discussed before, this patch is to revert
> JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap). Please review.
> JDK bug: https://bugs.openjdk.java.net/browse/JDK-8252835
> Patch: http://cr.openjdk.java.net/~wzhuo/8252835/webrev.00/
>
>
> Regards,
> Zhuoren
>

From anthony.scarpino at oracle.com Fri Sep 25 19:44:49 2020
From: anthony.scarpino at oracle.com (Anthony Scarpino)
Date: Fri, 25 Sep 2020 12:44:49 -0700
Subject: RFR: 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64
In-Reply-To:
References:
Message-ID: <8be53405-6b1d-a7d9-f012-7a7f5e6cd97f@oracle.com>

I had the fix in review if you could have waited.

Tony

On 9/25/20 12:30 PM, Daniel D.Daugherty wrote:
> Reduce noise in CI Tier2 by ProblemListing this test.
>
> -------------
>
> Commit messages:
> - Add 'aarch64' to the generic-ARCH comment so folks know what to use.
> - Use the correct bug ID: 8253637.
> - 8253659: ProblemList sun/security/ec/TestEC.java on linux-aarch64
>
> Changes: https://git.openjdk.java.net/jdk/pull/362/files
> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=362&range=00
> Issue: https://bugs.openjdk.java.net/browse/JDK-8253659
> Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
> Patch: https://git.openjdk.java.net/jdk/pull/362.diff
> Fetch: git fetch https://git.openjdk.java.net/jdk pull/362/head:pull/362
>
> PR: https://git.openjdk.java.net/jdk/pull/362
>

From iklam at openjdk.java.net Fri Sep 25 22:22:07 2020
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 25 Sep 2020 22:22:07 GMT
Subject: RFR: 8253548: jvmFlagAccess.cpp: clang 9.0.0 format specifier error
Message-ID:

Please review this trivial fix for a printf format warning from older clang compilers.

As mentioned on [JDK-8253548](https://bugs.openjdk.java.net/browse/JDK-8253548), there's an existing typecast in [logFileOutput.cpp](https://github.com/openjdk/jdk/blame/efd10546865998028aa6d34cf939ca0de67a90fc/hotspot/src/share/vm/logging/logFileOutput.cpp#L194) for the same reason: SIZE_MAX is not actually of the size_t type for older clang compilers.

Testing with mach5 tier1.
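A minimal sketch of the kind of cast described above, assuming plain C++ and the standard "%zu" conversion rather than HotSpot's own format macros. format_size_limit is a hypothetical helper made up for illustration; it is not code from jvmFlagAccess.cpp or logFileOutput.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption for illustration, not HotSpot code).
std::string format_size_limit() {
  char buf[32];
  // The explicit cast pins the argument type to size_t, so it matches "%zu"
  // even if a toolchain defines SIZE_MAX as a constant of another unsigned type.
  std::snprintf(buf, sizeof(buf), "%zu", static_cast<std::size_t>(SIZE_MAX));
  return std::string(buf);
}
```

Without the cast, a compiler that checks printf-style formats may complain whenever the macro's type and the conversion specifier disagree, even though the value itself is the same.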
------------- Commit messages: - 8253548: jvmFlagAccess.cpp: clang 9.0.0 format specifier error Changes: https://git.openjdk.java.net/jdk/pull/365/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=365&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253548 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/365.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/365/head:pull/365 PR: https://git.openjdk.java.net/jdk/pull/365 From dholmes at openjdk.java.net Fri Sep 25 22:57:02 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 25 Sep 2020 22:57:02 GMT Subject: RFR: 8253647: Remove dead code in os::create_thread() on Linux/BSD In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 17:06:58 GMT, Zhengyu Gu wrote: > Please review this small patch to remove dead code that is left behind by JDK-8078513. > > Test: > - [x] tier1 on Linux 86_64 Looks good and trivial. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/361 From zgu at openjdk.java.net Fri Sep 25 23:36:56 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 25 Sep 2020 23:36:56 GMT Subject: Integrated: 8253647: Remove dead code in os::create_thread() on Linux/BSD In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 17:06:58 GMT, Zhengyu Gu wrote: > Please review this small patch to remove dead code that is left behind by JDK-8078513. > > Test: > - [x] tier1 on Linux 86_64 This pull request has now been integrated. 
Changeset: 41675400 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/41675400 Stats: 14 lines in 2 files changed: 0 ins; 14 del; 0 mod 8253647: Remove dead code in os::create_thread() on Linux/BSD Reviewed-by: stuefe, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/361 From iklam at openjdk.java.net Sat Sep 26 03:53:45 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 26 Sep 2020 03:53:45 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: References: Message-ID: On Thu, 24 Sep 2020 10:04:00 GMT, Thomas Stuefe wrote: > Hi all, > > there is a long standing bug in the windows version of os::pd_map_memory() which may cause it to fail if the path to > the underlying file is longer than the OS limit. This is mainly of interest for CDS, which uses this functionality to > map in sections of the archive into the memory. This bug may cause the CDS mapping to fail. See also > https://bugs.openjdk.java.net/browse/JDK-8249943. As with similar cases, the fix is to translate the input file name > to a wide character UNC path name and use the Unicode variant of CreateFile which accepts long path names. LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/332 From stuefe at openjdk.java.net Sat Sep 26 04:15:31 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 26 Sep 2020 04:15:31 GMT Subject: RFR: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: References: Message-ID: <2VvOp2Kz3gn_Rcf30eiNCw7wjg9ICHjDn6s5W6g75OU=.6d261dcc-47c1-4ff3-887f-84f51a362172@github.com> On Sat, 26 Sep 2020 03:51:08 GMT, Ioi Lam wrote: > LGTM Thanks Ioi! 
------------- PR: https://git.openjdk.java.net/jdk/pull/332 From stuefe at openjdk.java.net Sat Sep 26 04:15:32 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 26 Sep 2020 04:15:32 GMT Subject: Integrated: 8253572: [windows] CDS archive may fail to open with long file names In-Reply-To: References: Message-ID: <4JW4I3YhTc-MsxiA5_2l7sNzjwDBEaDC6UZUzN8HWls=.28eaa842-7257-46e9-832e-35c19a7c2565@github.com> On Thu, 24 Sep 2020 10:04:00 GMT, Thomas Stuefe wrote: > Hi all, > > there is a long standing bug in the windows version of os::pd_map_memory() which may cause it to fail if the path to > the underlying file is longer than the OS limit. This is mainly of interest for CDS, which uses this functionality to > map in sections of the archive into the memory. This bug may cause the CDS mapping to fail. See also > https://bugs.openjdk.java.net/browse/JDK-8249943. As with similar cases, the fix is to translate the input file name > to a wide character UNC path name and use the Unicode variant of CreateFile which accepts long path names. This pull request has now been integrated. 
Changeset: b66fa8f4 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/b66fa8f4 Stats: 12 lines in 1 file changed: 10 ins; 0 del; 2 mod 8253572: [windows] CDS archive may fail to open with long file names 8249943: [TESTBUG] runtime/cds/serviceability/transformRelatedClasses/TransformInterfaceAndImplementor.java Reviewed-by: ccheung, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/332 From dholmes at openjdk.java.net Mon Sep 28 04:42:29 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Sep 2020 04:42:29 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Thu, 24 Sep 2020 15:04:01 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this change that refactors common POSIX code into a separate >> file. >> >> Currently there appears to be quite a bit of duplicated code among POSIX >> platforms, which makes it difficult to apply single fix to the signal code. >> With this fix, we will only need to touch single file for common POSIX >> code fixes from now on. 
>> >> ---------------------------------------------------------------------------- >> The APIs which moved from os/bsd/os_bsd.cpp to to os/posix/PosixSignals.cpp: >> >> //////////////////////////////////////////////////////////////////////////////// >> // signal support >> void os::Bsd::signal_sets_init() >> sigset_t* os::Bsd::unblocked_signals() >> sigset_t* os::Bsd::vm_signals() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> //////////////////////////////////////////////////////////////////////////////// >> // sun.misc.Signal support >> static void UserHandler(int sig, void *siginfo, void *context) >> void* os::user_handler() >> void* os::signal(int signal_number, void* handler) >> void os::signal_raise(int signal_number) >> int os::sigexitnum_pd() >> static void jdk_misc_signal_init() >> void os::signal_notify(int sig) >> static int check_pending_signals() >> int os::signal_wait() >> //////////////////////////////////////////////////////////////////////////////// >> // suspend/resume support >> static void resume_clear_context(OSThread *osthread) >> static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) >> static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) >> static int SR_initialize() >> static int sr_notify(OSThread* osthread) >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> /////////////////////////////////////////////////////////////////////////////////// >> // signal handling (except suspend/resume) >> static void signalHandler(int sig, siginfo_t* info, void* uc) >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> static bool call_chained_handler(struct sigaction *actp, int sig, >> siginfo_t *siginfo, void *context) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, 
bool set_installed) >> void os::Bsd::install_signal_handlers() >> static const char* get_signal_handler_name(address handler, >> char* buf, int buflen) >> static void print_signal_handler(outputStream* st, int sig, >> char* buf, size_t buflen) >> void os::run_periodic_checks() >> void os::Bsd::check_signal_handler(int sig) >> >> ----------------------------------------------------------------------------- >> The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: >> >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> int os::Posix::get_signal_number(const char* signal_name) >> int os::get_signal_number(const char* signal_name) >> bool os::Posix::is_valid_signal(int sig) >> bool os::Posix::is_sig_ignored(int sig) >> const char* os::exception_name(int sig, char* buf, size_t size) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> oid os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::print_siginfo(outputStream* os, const void* si0) >> bool os::signal_thread(Thread* thread, int sig, const char* reason) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> struct sigaction* os::Posix::get_preinstalled_handler(int sig) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> >> -------------------------------------------------------- >> -------------------------------------------------------- >> >> DETAILS: >> >> -------------------------------------------------------- >> Public APIs which are now internal static PosixSignals:: >> >> 
sigset_t* os::Bsd::vm_signals() >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, bool set_installed) >> void os::Bsd::check_signal_handler(int sig) >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> bool os::Posix::is_valid_signal(int sig) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> oid os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> ------------------------------------------------ >> Public APIs which moved to public PosixSignals:: >> >> void os::Bsd::signal_sets_init() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> void os::Bsd::install_signal_handlers() >> bool os::Posix::is_sig_ignored(int sig) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> >> ---------------------------------------------------- >> Internal APIs which are now public in PosixSignals:: >> >> static void jdk_misc_signal_init() >> static int SR_initialize() >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) >> >> -------------------------- >> New APIs in PosixSignals:: >> >> static bool are_signal_handlers_installed(); > > Gerard Ziemski has updated the pull 
request incrementally with one additional commit since the last revision:
>
> Revert "Add AIX specific SA code"
>
> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1.

Hi Gerard,

Thank you for tackling this long overdue cleanup and conciliation of the signal management code! Great work on a tedious task.

Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared version. I'm glad the AIX folk have taken an in-depth look there.

I have a few minor comments in specific files below.

I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern.

Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few ifdefs or newly introduced os::pd_* functions.

Thanks,
David

src/hotspot/os/linux/os_linux.cpp line 944:

> 942: // is no gap between the last two virtual memory regions.
> 943:
> 944: JavaThread *jt = (JavaThread *)thread;

thread is already a JavaThread* - see line 903

src/hotspot/os/linux/os_linux.cpp line 1685:

> 1683: filename);
> 1684:
> 1685: assert(Thread::current()->is_Java_thread(), "must be Java thread");

This assertion is already inside JavaThread::current().

src/hotspot/os/posix/os_posix.cpp line 1456:

> 1454: Thread* thread = Thread::current();
> 1455: assert(thread->is_Java_thread(), "Must be JavaThread");
> 1456: JavaThread *jt = (JavaThread *)thread;

This change is unnecessary. Please restore the original JavaThread::current() call.

src/hotspot/os/posix/os_posix.cpp line 1501:

> 1499: }
> 1500:
> 1501: OSThreadWaitState osts(thread->osthread(), false /* not Object.wait() */);

Unnecessary change - just use jt

src/hotspot/os/posix/signals_posix.cpp line 46:

> 44: // suspend/resume
> 45:
> 46: // glibc on Bsd platform uses non-documented flag

This was added for Linux and then copied to the BSD port, so the comment is inaccurate.
This seems to be referring to the SA_RESTORER flag which seems to be a Linux kernel flag. But I'm unsure why we would need to even be aware of this. I think future cleanup could be done here.

src/hotspot/os/posix/signals_posix.cpp line 442:

> 440: //
> 441:
> 442: #if defined(__APPLE__)

This should be checking for BSD, which would include macOS.

src/hotspot/os/posix/signals_posix.cpp line 456:

> 454: #endif
> 455:
> 456: // Set thread signal mask (for some reason on AIX sigthreadmask() seems

This comment block seems inappropriate for shared code. Are you suggesting this code might be wrong for AIX? If so it shouldn't be pushed in this form.

src/hotspot/os/posix/signals_posix.cpp line 462:

> 460: const int rc = ::pthread_sigmask(how, set, oset);
> 461: // return value semantics differ slightly for error case:
> 462: // pthread_sigmask returns error number, sigthreadmask -1 and sets global errno

There is no sigthreadmask in use.

src/hotspot/os/posix/signals_posix.cpp line 471:

> 469: // to POSIX, typical program error signals. If they happen while being blocked,
> 470: // they typically will bring down the process immediately.
> 471: bool unblock_program_error_signals() {

This is AIX specific and should just be a file local function for AIX only.

src/hotspot/os/posix/signals_posix.cpp line 494:

> 492:
> 493: int orig_errno = errno; // Preserve errno value over signal handler.
> 494: #if defined(__APPLE__)

Again this should be checking for BSD, which will include macOS. There should be a better way to dispatch here. If the handler has a platform-independent name, and is declared in the platform specific files, then it will link to that one definition.

src/hotspot/os/posix/signals_posix.cpp line 571:

> 569: { SA_NODEFER, "SA_NODEFER" },
> 570: #if defined(AIX)
> 571: { SA_ONSTACK, "SA_ONSTACK" },

Existing issue but we already have an entry for SA_ONSTACK.
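The last remark can be illustrated with a small table-driven sketch. This is not the code from signals_posix.cpp: FlagName, describe_flags and both tables are hypothetical names made up for illustration, and only standard POSIX SA_* constants are used. Under those assumptions, it shows why a second SA_ONSTACK entry in such a table would make the flag name appear twice in the output.

```cpp
#include <signal.h>
#include <cstddef>
#include <string>

// Hypothetical table entry. The field is unsigned because some SA_* values
// may be defined with the top bit set on some systems.
struct FlagName {
  unsigned int flag;
  const char* name;
};

// Hypothetical helper: joins the names of all table entries whose bit is set.
std::string describe_flags(unsigned int flags, const FlagName* table, std::size_t count) {
  std::string out;
  for (std::size_t i = 0; i < count; i++) {
    if ((flags & table[i].flag) != 0) {
      if (!out.empty()) {
        out += "|";
      }
      out += table[i].name;
    }
  }
  return out.empty() ? std::string("none") : out;
}

// One entry per flag.
static const FlagName kCleanTable[] = {
  { SA_NOCLDSTOP, "SA_NOCLDSTOP" },
  { SA_ONSTACK,   "SA_ONSTACK"   },
  { SA_SIGINFO,   "SA_SIGINFO"   },
  { SA_NODEFER,   "SA_NODEFER"   },
};

// Same idea, but with SA_ONSTACK listed twice, as an extra
// platform-conditional entry would do.
static const FlagName kTableWithDuplicate[] = {
  { SA_NODEFER, "SA_NODEFER" },
  { SA_ONSTACK, "SA_ONSTACK" },
  { SA_ONSTACK, "SA_ONSTACK" },
};
```

With the clean table, SA_ONSTACK | SA_SIGINFO is described once per flag; with the duplicate table, a lone SA_ONSTACK is reported twice, which is harmless but noisy in error reports.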
src/hotspot/os/posix/signals_posix.cpp line 1310:

> 1308:
> 1309: address PosixSignals::ucontext_get_pc(const ucontext_t* ctx) {
> 1310: #if defined(AIX)

Again there must be a better way to do this dispatch. If the target were os::ucontext_get_pc, defined in os.cpp, then we would link to the current platform's version.

src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 232:

> 230: if (t != NULL) {
> 231: if(t->is_Java_thread()) {
> 232: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp line 461:

> 459: if (t != NULL ){
> 460: if(t->is_Java_thread()) {
> 461: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/bsd_zero/os_bsd_zero.cpp line 160:

> 158: if (t != NULL ){
> 159: if(t->is_Java_thread()) {
> 160: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp line 241:

> 239: if (t != NULL ){
> 240: if(t->is_Java_thread()) {
> 241: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/linux_arm/os_linux_arm.cpp line 300:

> 298: if (t != NULL ){
> 299: if(t->is_Java_thread()) {
> 300: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/linux_ppc/os_linux_ppc.cpp line 284:

> 282: if (t != NULL) {
> 283: if(t->is_Java_thread()) {
> 284: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/linux_s390/os_linux_s390.cpp line 284:

> 282: if (t != NULL) {
> 283: if(t->is_Java_thread()) {
> 284: thread = (JavaThread*)t;

Mis-merge? Please put this back to t->as_Java_thread()

src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp line 280:

> 278: if (t != NULL ){
> 279: if(t->is_Java_thread()) {
> 280: thread = (JavaThread*)t;

Mis-merge?
Please put this back to t->as_Java_thread() src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp line 156: > 154: if (t != NULL ){ > 155: if(t->is_Java_thread()) { > 156: thread = (JavaThread*)t; Mis-merge? Please put this back to t->as_Java_thread() ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/157 From dholmes at openjdk.java.net Mon Sep 28 05:19:15 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Sep 2020 05:19:15 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote: > For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report > thread's state after thread terminated. > But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid > thread stack, etc. > This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated > thread from live thread. > Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). > > Note: Java thread does not have such issue, its thread object is deleted before thread terminates. Changes requested by dholmes (Reviewer). 
-------------

PR: https://git.openjdk.java.net/jdk/pull/341

From dholmes at openjdk.java.net Mon Sep 28 05:19:17 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 28 Sep 2020 05:19:17 GMT
Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads
In-Reply-To:
References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com>
Message-ID:

On Fri, 25 Sep 2020 15:03:29 GMT, Zhengyu Gu wrote:

>> src/hotspot/share/runtime/thread.cpp line 919:
>>
>>> 917: osthread()->print_on(st);
>>> 918:
>>> 919: if (osthread()->get_state() != ZOMBIE) {
>>
>> I'm not sure print_on(), as opposed to print_on_error() can ever be called with a ZOMBIE thread. I didn't expect any
>> change in this method.
>
> For thread, e.g. G1ConcurrentMarkThread, there is nothing to prevent calling _cm_thread->print_on(tty) after it
> terminated, although, I can not find a case right now.
> You prefer an assertion instead?

I prefer no change to this method. I don't see that we need to do anything special even if a ZOMBIE could be encountered.

>> src/hotspot/share/runtime/thread.cpp line 955:
>>
>>> 953: }
>>> 954: } else {
>>> 955: st->print(" Aborted");
>>
>> Not sure this is reachable and if it is then I'm not sure what state the thread is actually in. If a Thread never gets
>> an osThread() it isn't started so shouldn't be locatable by any means.
>
> so, you prefer "ShouldNotReachHere()" ?

There's no point putting a ShouldNotReachHere() in error handling code as we will just trip a secondary error. If we want to print something then perhaps "unknown state (no osThread)"?

Also I only wanted the ThreadSMRSupport::print_info_on to be excluded for Zombies, but you've excluded it for the no-osThread case as well. I think based on what Dan said we can just put that back and call it unconditionally.

Thanks.
------------- PR: https://git.openjdk.java.net/jdk/pull/341 From wzhuo at openjdk.java.net Mon Sep 28 07:04:36 2020 From: wzhuo at openjdk.java.net (Wang Zhuo) Date: Mon, 28 Sep 2020 07:04:36 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 Message-ID: Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some runtime issues. ------------- Commit messages: - 8252835:Revert fix for JDK-8246051 Changes: https://git.openjdk.java.net/jdk/pull/297/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=297&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252835 Stats: 83 lines in 2 files changed: 0 ins; 83 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/297.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/297/head:pull/297 PR: https://git.openjdk.java.net/jdk/pull/297 From psandoz at openjdk.java.net Mon Sep 28 07:04:36 2020 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Mon, 28 Sep 2020 07:04:36 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:03:31 GMT, Wang Zhuo wrote: > Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some > runtime issues. Marked as reviewed by psandoz (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/297 From dholmes at openjdk.java.net Mon Sep 28 07:04:37 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Sep 2020 07:04:37 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:03:31 GMT, Wang Zhuo wrote: > Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some > runtime issues. Reverting of changes looks good. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/297 From mikael at openjdk.java.net Mon Sep 28 07:04:37 2020 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Mon, 28 Sep 2020 07:04:37 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:03:31 GMT, Wang Zhuo wrote: > Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some > runtime issues. Marked as reviewed by mikael (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/297 From wzhuo at openjdk.java.net Mon Sep 28 07:04:37 2020 From: wzhuo at openjdk.java.net (Wang Zhuo) Date: Mon, 28 Sep 2020 07:04:37 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: <9Moi2K5IJweaCfnOFFbvph2omf9S5w9R68iOs7Uw0hw=.cc0aec05-d182-4f06-baef-048b29db6156@github.com> On Tue, 22 Sep 2020 12:03:31 GMT, Wang Zhuo wrote: > Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some > runtime issues. https://bugs.openjdk.java.net/browse/JDK-8252835 ------------- PR: https://git.openjdk.java.net/jdk/pull/297 From dholmes at openjdk.java.net Mon Sep 28 07:04:37 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 28 Sep 2020 07:04:37 GMT Subject: RFR: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 16:14:52 GMT, Paul Sandoz wrote: >> Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some >> runtime issues. > > Marked as reviewed by psandoz (Reviewer). 
The PR title needs to be edited to add a space before "Revert" in "8252835:Revert" ------------- PR: https://git.openjdk.java.net/jdk/pull/297 From wzhuo at openjdk.java.net Mon Sep 28 07:36:35 2020 From: wzhuo at openjdk.java.net (Wang Zhuo) Date: Mon, 28 Sep 2020 07:36:35 GMT Subject: Integrated: 8252835: Revert fix for JDK-8246051 In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 12:03:31 GMT, Wang Zhuo wrote: > Reverting JDK-8246051(SIGBUS by unaligned Unsafe compare_and_swap) because it may cause some > runtime issues. This pull request has now been integrated. Changeset: 276fcee7 Author: Wang Zhuo Committer: David Holmes URL: https://git.openjdk.java.net/jdk/commit/276fcee7 Stats: 83 lines in 2 files changed: 0 ins; 83 del; 0 mod 8252835: Revert fix for JDK-8246051 Reviewed-by: psandoz, dholmes, mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/297 From simonis at openjdk.java.net Mon Sep 28 08:58:08 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Sep 2020 08:58:08 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2] In-Reply-To: <18mmirKkzSCZx-CPAZFAMDDkIIh0oVPyqwVklq2QuMc=.64e46097-ced2-4207-b164-f006ac5c0d42@github.com> References: <7OVdqQaa1-ZcnKzV7UcNDupnoOmmwNEjVBeea00TwlM=.3406a0c4-c440-4468-b9ab-9ab2f11d564d@github.com> <18mmirKkzSCZx-CPAZFAMDDkIIh0oVPyqwVklq2QuMc=.64e46097-ced2-4207-b164-f006ac5c0d42@github.com> Message-ID: On Fri, 25 Sep 2020 14:45:16 GMT, Bob Vandette wrote: >>> In my suggestion, it doesn't matter which entry is first. If we see the manual one first, record it. When the second one >>> comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second >>> non-manual one, I think we still want to record the manual mount point since there could be a cpuset limit setup which >>> we should respect.
As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not >>> better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be >>> subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset. >> >> Like this perhaps? >> https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066 >> >> I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a >> sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be >> worth considering. > >> > In my suggestion, it doesn't matter which entry is first. If we see the manual one first, record it. When the second one >> > comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second >> > non-manual one, I think we still want to record the manual mount point since there could be a cpuset limit setup which >> > we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not >> > better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be >> > subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset. >> >> Like this perhaps? >> [jerboaa at 45608cf](https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066) >> >> I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a >> sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be >> worth considering. > > Yes, that's exactly what I was thinking.
In the manual case from your example, we'd record "/" as the mount point if > it showed up first and then overwrite it if /sys/fs/cgroup came along. OK, I just want to get this done so I can finally use debug builds again. I've picked your version now in the hope that you'll review that :) I only changed the condition `strcmp(cg_infos[CPUSET_IDX]._mount_path, "/sys/fs/cgroup") < 0` to `strstr(cg_infos[CPUSET_IDX]._mount_path, "/sys/fs/cgroup") != cg_infos[CPUSET_IDX]._mount_path`, because otherwise you'd still choose the alternative cpuset if that was mounted on a mount point which is lexicographically after `/sys/fs/cgroup` (e.g. `/tmp/cgroups`). I've also adapted the logs to reflect the fact that the current solution will simply choose the cpusets from the second mount in the (unusual?) case where the "*normal*" Cgroups are not mounted to `/sys/fs/cgroup`. Hope that's still fine. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From mdoerr at openjdk.java.net Mon Sep 28 08:59:39 2020 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 28 Sep 2020 08:59:39 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: <_36GxJpHXE99YIuPuffzje3An9FPe1zGqZqdfMnioJs=.150fffa3-ca6f-4590-b216-eb1c94c58fb6@github.com> On Mon, 28 Sep 2020 04:37:03 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Add AIX specific SA code" >> >> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > > Hi Gerard, > Thank you for tackling this long overdue cleanup and conciliation of the signal management code! Great work on a > tedious task. > Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared > version.
I'm glad the AIX folk have taken an in-depth look there. > I have a few minor comments in specific files below. > > I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. > > Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few > ifdefs or newly introduced os::pd_* functions. > Thanks, > David Tests on AIX look good so far. Maybe we should re-check after conflict resolution, but basically, I think it can get pushed when reviews are done. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From simonis at openjdk.java.net Mon Sep 28 09:14:21 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Sep 2020 09:14:21 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be > interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ...
> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. I'm open to better solutions for detecting cpusets which don't don't belong to a Cgroup. > Thanks, > Volker Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/295/files - new: https://git.openjdk.java.net/jdk/pull/295/files/7753bc5d..b3d0f28a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=295&range=02-03 Stats: 40 lines in 1 file changed: 8 ins; 25 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/295.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295 PR: https://git.openjdk.java.net/jdk/pull/295 From sgehwolf at openjdk.java.net Mon Sep 28 10:44:01 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 10:44:01 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 09:14:21 GMT, Volker Simonis wrote: >> Hi, >> >> can I please have a review (or an idea for a better fix) for this PR? >> >> If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage >> [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a >> debug build or behave unexpectedly in a product build. 
The problem is that the additionally mounted cpuset will be >> interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo >> 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 >> 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory >> 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma >> ... >> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset >> 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset >> The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet >> another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup. >> Thanks, >> Volker > > Volker Simonis has refreshed the contents of this pull request, and previous commits have been removed. The incremental > views will show differences compared to the previous content of the PR. The pull request contains one new commit since > the last revision: > 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist Looks fine to me. Thanks for your patience! ------------- Marked as reviewed by sgehwolf (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/295 From lkorinth at openjdk.java.net Mon Sep 28 11:18:36 2020 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Mon, 28 Sep 2020 11:18:36 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 11:00:20 GMT, Thomas Stuefe wrote: >> Hi all, >> >> this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the >> delay, had vacation then the entrance of Skara delayed things a bit. >> For the delta diff please see [3].
>> >> This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all >> feedback individually in this PR body, but I incorporated almost all into the new revision. >> What changed since the last version: >> >> - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the >> group consent. >> >> - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for >> overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth >> class loader which made it rather pointless. Now I run it always. >> >> - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. >> >> - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) >> >> Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. >> >> - In ChunkManager::purge() I improved the comments after discussions with Leo. >> >> - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due >> to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run >> every time. >> >> - Fixed indentation issues as Leo requested >> >> - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested >> >> - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes >> and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more >> flexible since it allows us to have list with purge-able and non-purge-able nodes. >> >> - and various smaller fixes, mainly on request of Leo. 
>> >> @lkorinth: >> >>> VirtualSpaceNode.hpp >>> >>>102 // Start pointer of the area. >>>103 MetaWord* const _base; >>> >>>How does this differ from _rs._base? Really needed? >>> >>>105 // Size, in words, of the whole node >>>106 const size_t _word_size; >>> >>>Can we not calculate this from _rs.size()? >> >> You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it >> is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly >> improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at >> some point and make those members const - as they should be - then I would rewrite this as you suggest. >> Thanks, again, for all your review work! >> >> ------ >> >> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html >> [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html >> [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Remove empty lines from include sections I just have a few cosmetic comments. Otherwise it looks good to me. If you sort by ASCII order, I get a few unordered includes: ("." 
< "/" < "a-z,A-Z") In file b/src/hotspot/share/gc/shared/genCollectedHeap.cpp: [not sorted]: #include "memory/metaspaceCounters.hpp" (Context) #include "memory/metaspace/metaspaceSizesSnapshot.hpp" (Addition) #include "memory/resourceArea.hpp" (Context) In file b/src/hotspot/share/memory/metaspace.cpp: [not sorted]: #include "memory/metaspace/virtualSpaceList.hpp" (Context) #include "memory/metaspace.hpp" (Addition) #include "memory/metaspaceShared.hpp" (Context) In file b/src/hotspot/share/memory/metaspace/chunkManager.cpp: [not sorted]: #include "memory/metaspace/metaspaceCommon.hpp" (Context) #include "memory/metaspace/metaspaceContext.hpp" (Addition) #include "memory/metaspace/metachunk.hpp" (Addition) In file b/src/hotspot/share/memory/metaspace/chunkManager.cpp: [not sorted]: #include "memory/metaspace/metaspaceContext.hpp" (Addition) #include "memory/metaspace/metachunk.hpp" (Addition) #include "memory/metaspace/metaspaceSettings.hpp" (Addition) blockTree.hpp add a space after loop keyword "for(;;) {" -> "for (;;) {" blockTree.cpp add a space after loop keyword "} while(0)" -> "} while (0)" (twice in file) I still think we should try to get the initializer list indented somewhat consistently. (This is really boring, and hard as we do not have precise indentation rules and no mechanical indenter). Sorry for mentioning this, but now might be the time to get indentation consistent at least in metaspace. The indentation level seems to most often be 2 or 4. Sometimes "Haskell" indentation with "," at the beginning of each line is used (I like this, but I think I am in a small minority) and most often it is not used. Sometimes several field members are initialised on multi line initializer lists, sometimes only one per line (I prefer one per line, if not all fits on a single line). Open curly braces seem to mostly (but not always) be on a new line. 
If you could normalize the style I would be happy, I yield my own preferences to you and the other reviewers on what style to use, so that we have a possibility to agree :-) Thanks, Leo ------------- Changes requested by lkorinth (Committer). PR: https://git.openjdk.java.net/jdk/pull/336 From simonis at openjdk.java.net Mon Sep 28 12:14:05 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Sep 2020 12:14:05 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 10:41:19 GMT, Severin Gehwolf wrote: > Looks fine to me. Thanks for your patience! Thanks Severin. I'll wait one more day to also give Bob a chance to look at the final version and push after that. Best regards, Volker ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From coleenp at openjdk.java.net Mon Sep 28 12:15:07 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 28 Sep 2020 12:15:07 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 04:37:03 GMT, David Holmes wrote: > Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few > ifdefs or newly introduced os::pd_* functions. The small bit of code that I looked at last week defied refactoring unfortunately, but yes, we should try to do this. 
------------- PR: https://git.openjdk.java.net/jdk/pull/157 From coleenp at openjdk.java.net Mon Sep 28 12:20:27 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 28 Sep 2020 12:20:27 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Thu, 24 Sep 2020 15:04:01 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this change that refactors common POSIX code into a separate >> file. >> >> Currently there appears to be quite a bit of duplicated code among POSIX >> platforms, which makes it difficult to apply a single fix to the signal code. >> With this fix, we will only need to touch a single file for common POSIX >> code fixes from now on. >> >> ---------------------------------------------------------------------------- >> The APIs which moved from os/bsd/os_bsd.cpp to os/posix/PosixSignals.cpp: >> >> //////////////////////////////////////////////////////////////////////////////// >> // signal support >> void os::Bsd::signal_sets_init() >> sigset_t* os::Bsd::unblocked_signals() >> sigset_t* os::Bsd::vm_signals() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> //////////////////////////////////////////////////////////////////////////////// >> // sun.misc.Signal support >> static void UserHandler(int sig, void *siginfo, void *context) >> void* os::user_handler() >> void* os::signal(int signal_number, void* handler) >> void os::signal_raise(int signal_number) >> int os::sigexitnum_pd() >> static void jdk_misc_signal_init() >> void os::signal_notify(int sig) >> static int check_pending_signals() >> int os::signal_wait() >> //////////////////////////////////////////////////////////////////////////////// >> // suspend/resume support >> static void resume_clear_context(OSThread
*osthread) >> static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) >> static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) >> static int SR_initialize() >> static int sr_notify(OSThread* osthread) >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> /////////////////////////////////////////////////////////////////////////////////// >> // signal handling (except suspend/resume) >> static void signalHandler(int sig, siginfo_t* info, void* uc) >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> static bool call_chained_handler(struct sigaction *actp, int sig, >> siginfo_t *siginfo, void *context) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, bool set_installed) >> void os::Bsd::install_signal_handlers() >> static const char* get_signal_handler_name(address handler, >> char* buf, int buflen) >> static void print_signal_handler(outputStream* st, int sig, >> char* buf, size_t buflen) >> void os::run_periodic_checks() >> void os::Bsd::check_signal_handler(int sig) >> >> ----------------------------------------------------------------------------- >> The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: >> >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> int os::Posix::get_signal_number(const char* signal_name) >> int os::get_signal_number(const char* signal_name) >> bool os::Posix::is_valid_signal(int sig) >> bool os::Posix::is_sig_ignored(int sig) >> const char* os::exception_name(int sig, char* buf, size_t size) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* 
os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> void os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::print_siginfo(outputStream* os, const void* si0) >> bool os::signal_thread(Thread* thread, int sig, const char* reason) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> struct sigaction* os::Posix::get_preinstalled_handler(int sig) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> >> -------------------------------------------------------- >> -------------------------------------------------------- >> >> DETAILS: >> >> -------------------------------------------------------- >> Public APIs which are now internal static PosixSignals:: >> >> sigset_t* os::Bsd::vm_signals() >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, bool set_installed) >> void os::Bsd::check_signal_handler(int sig) >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> bool os::Posix::is_valid_signal(int sig) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> void os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> ------------------------------------------------ >> Public APIs which moved to public PosixSignals:: >> >>
void os::Bsd::signal_sets_init() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> void os::Bsd::install_signal_handlers() >> bool os::Posix::is_sig_ignored(int sig) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> >> ---------------------------------------------------- >> Internal APIs which are now public in PosixSignals:: >> >> static void jdk_misc_signal_init() >> static int SR_initialize() >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) >> >> -------------------------- >> New APIs in PosixSignals:: >> >> static bool are_signal_handlers_installed(); > > Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Add AIX specific SA code" > > This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. Gerard, thank you for doing this! Thank you to Martin and Thomas for looking at the AIX code. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/157 From bob.vandette at oracle.com Mon Sep 28 13:28:21 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Mon, 28 Sep 2020 09:28:21 -0400 Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: <98C37F46-0C22-4C60-8E77-C381C932CCA1@oracle.com> Looks good Volker. Bob. > On Sep 28, 2020, at 8:14 AM, Volker Simonis wrote: > > On Mon, 28 Sep 2020 10:41:19 GMT, Severin Gehwolf wrote: > >> Looks fine to me. Thanks for your patience! > > Thanks Severin. > > I'll wait one more day to also give Bob a chance to look at the final version and push after that. 
> > Best regards, > Volker > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/295 From bobv at openjdk.java.net Mon Sep 28 13:48:01 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Mon, 28 Sep 2020 13:48:01 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 11:35:04 GMT, Volker Simonis wrote: >> Looks fine to me. Thanks for your patience! > >> Looks fine to me. Thanks for your patience! > > Thanks Severin. > > I'll wait one more day to also give Bob a chance to look at the final version and push after that. > > Best regards, > Volker Looks good Volker. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From lfoltan at openjdk.java.net Mon Sep 28 14:04:31 2020 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Mon, 28 Sep 2020 14:04:31 GMT Subject: RFR: 8253548: jvmFlagAccess.cpp: clang 9.0.0 format specifier error In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 18:19:26 GMT, Ioi Lam wrote: > Please review this trivial fix for older clang compiler printf format warning. As mentioned on > [JDK-8253548](https://bugs.openjdk.java.net/browse/JDK-8253548), these's an existing typecast in [logFileOutput.cpp]( > https://github.com/openjdk/jdk/blame/efd10546865998028aa6d34cf939ca0de67a90fc/hotspot/src/share/vm/logging/logFileOutput.cpp#L194) > for the same reason that SIZE_MAX is not actually of the size_t type for older clang compilers. Testing with mach5 > tier1. Looks good & trivial. ------------- Marked as reviewed by lfoltan (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/365 From thomas.stuefe at gmail.com Mon Sep 28 14:04:36 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 28 Sep 2020 16:04:36 +0200 Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: Hi Leo, > > I just have a few cosmetic comments. 
Otherwise it looks good to me. > > If you sort by ASCII order, I get a few unordered includes: > ("." < "/" < "a-z,A-Z") > I prefer that order too, but unless I misunderstood one of your earlier mails you preferred the reverse: Also, some of the includes could change order, I prefer the directory metaspace being ordered before header names starting with metaspace. which would mean "/" < ".", eg: #include "memory/metaspace/chunkManager.hpp" #include "memory/metaspace.hpp" #include "memory/metaspaceShared.hpp" I wrote myself a little python script to autofix the include sections of source files. It uses plain python sort, which uses ascii sort. So above section would be rewritten as: #include "memory/metaspace.hpp" #include "memory/metaspace/chunkManager.hpp" #include "memory/metaspaceShared.hpp" If that is fine by you, I'll use this technique. Cheers, Thomas > From pchilanomate at openjdk.java.net Mon Sep 28 14:24:29 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Mon, 28 Sep 2020 14:24:29 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() Message-ID: Hi all, Please review the following patch. Current ThreadCrashProtection() implementation uses static members which requires the use of Thread::muxAcquire() to allow only one user at a time. We can avoid this synchronization requirement if each thread has its own ThreadCrashProtection *data. I tested it builds on Linux, macOS and Windows. Since the JfrThreadSampler is the only one using this I run all the tests from test/jdk/jdk/jfr/. I also run some tests with JFR enabled while forcing a crash in OSThreadSampler::protected_task() and tests passed with several "Thread method sampler crashed" UL output. Also run tiers1-3. 
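For readers following along, a rough standalone sketch of the per-thread idea described above. This is not the patch itself: `CrashProtection`, `call`, `bail_out` and `is_active` are hypothetical names invented for the example, and the "crash" is simulated with a direct call instead of a signal handler. It only shows why a per-thread pointer removes the need for a global lock: each thread sees its own `_current`, so two threads can be inside `call()` at the same time.

```cpp
#include <csetjmp>

// Illustrative stand-in only, not HotSpot's ThreadCrashProtection.
class CrashProtection {
 public:
  typedef void (*Task)(void* arg);

  // Runs task(arg) under protection. Returns true if the task completed,
  // false if it was abandoned through bail_out().
  bool call(Task task, void* arg) {
    CrashProtection* const outer = _current;  // unchanged after setjmp
    _current = this;
    bool completed;
    if (setjmp(_jmpbuf) == 0) {
      task(arg);
      completed = true;
    } else {
      completed = false;  // reached via longjmp from bail_out()
    }
    _current = outer;
    return completed;
  }

  // A real VM would reach this from its crash/signal handling path;
  // in this sketch the task calls it directly.
  static void bail_out() {
    if (_current != nullptr) {
      std::longjmp(_current->_jmpbuf, 1);
    }
  }

  static bool is_active() { return _current != nullptr; }

 private:
  std::jmp_buf _jmpbuf;
  // One pointer per thread instead of one shared static guarded by a lock.
  static thread_local CrashProtection* _current;
};

thread_local CrashProtection* CrashProtection::_current = nullptr;

// Example tasks: one completes, one simulates a crash.
static void good_task(void* arg) { *static_cast<int*>(arg) = 42; }
static void bad_task(void*) { CrashProtection::bail_out(); }
```

The tasks are plain function pointers on purpose: jumping out with `longjmp` is only well defined when no objects with non-trivial destructors live in the frames being skipped.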
Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.java.net/jdk/pull/376/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=376&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253694 Stats: 115 lines in 13 files changed: 36 ins; 58 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/376.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/376/head:pull/376 PR: https://git.openjdk.java.net/jdk/pull/376 From lkorinth at openjdk.java.net Mon Sep 28 14:27:40 2020 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Mon, 28 Sep 2020 14:27:40 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 11:15:30 GMT, Leo Korinth wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove empty lines from include sections > > I just have a few cosmetic comments. Otherwise it looks good to me. > > If you sort by ASCII order, I get a few unordered includes: > ("." 
< "/" < "a-z,A-Z") > > > In file b/src/hotspot/share/gc/shared/genCollectedHeap.cpp: > [not sorted]: > #include "memory/metaspaceCounters.hpp" (Context) > #include "memory/metaspace/metaspaceSizesSnapshot.hpp" (Addition) > #include "memory/resourceArea.hpp" (Context) > > In file b/src/hotspot/share/memory/metaspace.cpp: > [not sorted]: > #include "memory/metaspace/virtualSpaceList.hpp" (Context) > #include "memory/metaspace.hpp" (Addition) > #include "memory/metaspaceShared.hpp" (Context) > > In file b/src/hotspot/share/memory/metaspace/chunkManager.cpp: > [not sorted]: > #include "memory/metaspace/metaspaceCommon.hpp" (Context) > #include "memory/metaspace/metaspaceContext.hpp" (Addition) > #include "memory/metaspace/metachunk.hpp" (Addition) > > In file b/src/hotspot/share/memory/metaspace/chunkManager.cpp: > [not sorted]: > #include "memory/metaspace/metaspaceContext.hpp" (Addition) > #include "memory/metaspace/metachunk.hpp" (Addition) > #include "memory/metaspace/metaspaceSettings.hpp" (Addition) > > > blockTree.hpp > add a space after loop keyword "for(;;) {" -> "for (;;) {" > > blockTree.cpp > add a space after loop keyword "} while(0)" -> "} while (0)" (twice in file) > > I still think we should try to get the initializer list indented > somewhat consistently. (This is really boring, and hard as we do not > have precise indentation rules and no mechanical indenter). Sorry for > mentioning this, but now might be the time to get indentation > consistent at least in metaspace. The indentation level seems to most > often be 2 or 4. Sometimes "Haskell" indentation with "," at the > beginning of each line is used (I like this, but I think I am in a > small minority) and most often it is not used. Sometimes several field > members are initialised on multi line initializer lists, sometimes > only one per line (I prefer one per line, if not all fits on a single > line). Open curly braces seem to mostly (but not always) be on a new line. 
> > If you could normalize the style I would be happy, I yield my own > preferences to you and the other reviewers on what style to use, so > that we have a possibility to agree :-) > > Thanks, Leo Regarding the order of "." and "/" take what you prefer. ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From coleenp at openjdk.java.net Mon Sep 28 14:39:31 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 28 Sep 2020 14:39:31 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 11:00:20 GMT, Thomas Stuefe wrote: >> Hi all, >> >> this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the >> delay, had vacation then the entrance of Skara delayed things a bit. >> For the delta diff please see [3]. >> >> This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all >> feedback individually in this PR body, but I incorporated almost all into the new revision. >> What changed since the last version: >> >> - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the >> group consent. >> >> - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for >> overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth >> class loader which made it rather pointless. Now I run it always. >> >> - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. >> >> - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) >> >> Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. 
>> >> - In ChunkManager::purge() I improved the comments after discussions with Leo. >> >> - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due >> to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run >> every time. >> >> - Fixed indentation issues as Leo requested >> >> - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested >> >> - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes >> and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more >> flexible since it allows us to have list with purge-able and non-purge-able nodes. >> >> - and various smaller fixes, mainly on request of Leo. >> >> @lkorinth: >> >>> VirtualSpaceNode.hpp >>> >>>102 // Start pointer of the area. >>>103 MetaWord* const _base; >>> >>>How does this differ from _rs._base? Really needed? >>> >>>105 // Size, in words, of the whole node >>>106 const size_t _word_size; >>> >>>Can we not calculate this from _rs.size()? >> >> You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it >> is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly >> improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at >> some point and make those members const - as they should be - then I would rewrite this as you suggest. >> Thanks, again, for all your review work! 
>> >> ------ >> >> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html >> [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html >> [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Remove empty lines from include sections Besides my comment about globals.hpp option MetaspaceGuardAllocations, my comments are minor things and I approve this change. There might be some additional things we find that we'll want to change once this code is integrated. This is a significant improvement to metaspace memory management. Great work, @tstuefe ! src/hotspot/share/memory/metaspace/blockTree.cpp line 72: > 70: }; > 71: > 72: void BlockTree::verify() const { I think this sort of deep verification should be in a gtest instead, at least for blockTree. Note, this can be fixed/discussed in a future RFE. src/hotspot/share/memory/metaspace/chunkHeaderPool.hpp line 107: > 105: return c; > 106: > 107: } I don't usually comment on style, but there are so many blank lines in this function. src/hotspot/share/memory/metaspace/classLoaderMetaspace.cpp line 64: > 62: , _space_type(space_type) > 63: , _non_class_space_arena(NULL) > 64: , _class_space_arena(NULL) Ok, I see what @lkorinth was commenting on. In the coding standard, the normal indentation level is 2, but we don't specify it for initializers. Generally it seems what looks good, maybe 4, maybe aligned somewhat with the arguments. That said to me it doesn't matter that much, BUT most of the code has the punctuation at the end of the line, not the beginning. I think it looks weird at the beginning. Can you change these? src/hotspot/share/memory/metaspace/commitMask.hpp line 92: > 90: check_pointer(start + word_size - 1); > 91: } > 92: #endif If these are for asserts, they can be defined in the .cpp file. 
src/hotspot/share/memory/metaspace/metachunk.hpp line 272: > 270: _vsnode = node; _base = base; _level = lvl; > 271: _used_words = _committed_words = 0; _state = State::Free; > 272: _next = _prev = _next_in_vs = _prev_in_vs = NULL; Is this the same as clear() ? src/hotspot/share/memory/metaspace/metachunkList.hpp line 60: > 58: void add(Metachunk* c) { > 59: // Note: contains is expensive (linear search). > 60: ASSERT_SOMETIMES(contains(c) == false, "Chunk already in this list"); Can you make this something like: DEBUG_ONLY(verify_contains();) and hide ASSERT_SOMETIMES in the .cpp file? src/hotspot/share/memory/metaspace/metachunkList.hpp line 29: > 27: #define SHARE_MEMORY_METASPACE_METACHUNKLIST_HPP > 28: > 29: #include "memory/metaspace/counters.hpp" Is this header file needed now? src/hotspot/share/runtime/globals.hpp line 1589: > 1587: \ > 1588: product(bool, MetaspaceGuardAllocations, false, \ > 1589: "Metapace allocations are guarded.") \ This should be DIAGNOSTIC or develop() but not product. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/336 From coleenp at openjdk.java.net Mon Sep 28 15:06:21 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 28 Sep 2020 15:06:21 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 06:07:58 GMT, Patricio Chilano Mateo wrote: > Hi all, > > Please review the following patch. Current ThreadCrashProtection() implementation uses static members which requires > the use of Thread::muxAcquire() to allow only one user at a time. We can avoid this synchronization requirement if each > thread has its own ThreadCrashProtection *data. I tested it builds on Linux, macOS and Windows. Since the > JfrThreadSampler is the only one using this I run all the tests from test/jdk/jdk/jfr/. 
I also run some tests with JFR > enabled while forcing a crash in OSThreadSampler::protected_task() and tests passed with several "Thread method sampler > crashed" UL output. Also run tiers1-3. Thanks, Patricio src/hotspot/share/runtime/thread.hpp line 756: > 754: } > 755: #endif > 756: Could this be pushed down into osThread ? ------------- PR: https://git.openjdk.java.net/jdk/pull/376 From simonis at openjdk.java.net Mon Sep 28 15:41:23 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Sep 2020 15:41:23 GMT Subject: RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v4] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 13:34:13 GMT, Bob Vandette wrote: > Looks good Volker. Thanks Bob. ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From simonis at openjdk.java.net Mon Sep 28 15:45:58 2020 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 28 Sep 2020 15:45:58 GMT Subject: Integrated: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 11:11:36 GMT, Volker Simonis wrote: > Hi, > > can I please have a review (or an idea for a better fix) for this PR? > > If a tool like [cpuset](https://github.com/lpechacek/cpuset) is used to manually create and manage > [cpusets](https://man7.org/linux/man-pages/man7/cpuset.7.html) the cgroups detections will be confused and crash in a > debug build or behave unexpectedly in a product build. The problem is that the additionally mounted cpuset will be > interpreted as if it was belonging to Cgroup controller: $ grep cgroup /proc/self/mountinfo > 36 25 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755 > 49 36 0:43 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,memory > 50 36 0:44 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,rdma > ... 
> 43 36 0:37 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset > 121 32 0:37 / /cpusets rw,relatime shared:69 - cgroup none rw,cpuset > The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet > another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup. > Thanks, > Volker This pull request has now been integrated. Changeset: 0054c15f Author: Volker Simonis URL: https://git.openjdk.java.net/jdk/commit/0054c15f Stats: 40 lines in 2 files changed: 37 ins; 0 del; 3 mod 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist Reviewed-by: sgehwolf, bobv ------------- PR: https://git.openjdk.java.net/jdk/pull/295 From sgehwolf at openjdk.java.net Mon Sep 28 16:09:52 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 16:09:52 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high Message-ID: Tests using `--memory-reservation` started to fail with a newer `crun` cgroups v2-capable runtime. It turns out it was incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change accounts for that.
------------- Commit messages: - 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high Changes: https://git.openjdk.java.net/jdk/pull/381/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=381&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253714 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/381.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/381/head:pull/381 PR: https://git.openjdk.java.net/jdk/pull/381 From sgehwolf at openjdk.java.net Mon Sep 28 16:09:52 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 16:09:52 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 15:59:35 GMT, Severin Gehwolf wrote: > Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was > incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change > accounts for that. @bobvandette Could you please take a look? Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From sgehwolf at openjdk.java.net Mon Sep 28 16:09:52 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 16:09:52 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:00:57 GMT, Severin Gehwolf wrote: >> Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was >> incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change >> accounts for that. > > @bobvandette Could you please take a look? Thanks! Note that due to JDK-8253727 not all tests are passing with newer crun runtime (0.8 was last working for me). 
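For readers following the `memory.high` versus `memory.low` discussion, here is a minimal sketch of how a soft memory limit could be read from a cgroups v2 interface file. It is not the JDK change under review; the function names and the idea of passing the cgroup directory as a parameter are assumptions made for illustration. The only fact relied on is that cgroups v2 limit files hold either a byte count or the literal string `max`.

```python
from pathlib import Path


def parse_cgroup_v2_limit(text):
    """Return the limit in bytes, or None when the file says "max" (no limit)."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_soft_memory_limit(cgroup_dir):
    """Read the soft limit from memory.low inside the given cgroup directory."""
    # memory.low is the interface file named in this thread; the directory
    # passed in (for example /sys/fs/cgroup) is an assumption of this sketch.
    return parse_cgroup_v2_limit((Path(cgroup_dir) / "memory.low").read_text())
```

Keeping the parsing separate from the file access makes the `max` handling easy to check without a container runtime.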
------------- PR: https://git.openjdk.java.net/jdk/pull/381 From dfuchs at openjdk.java.net Mon Sep 28 16:21:08 2020 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Mon, 28 Sep 2020 16:21:08 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:05:48 GMT, Daniel D. Daugherty wrote: > 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 Marked as reviewed by dfuchs (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:21:07 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:21:07 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 Message-ID: 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 ------------- Commit messages: - Merge branch 'master' into JDK-8253667 - Also ProblemList tools/jlink/JLinkReproducible3Test.java due to 8253688. - 8253667: ProblemList tools/jlink/JLinkReproducibleTest.java on linux-aarch64 Changes: https://git.openjdk.java.net/jdk/pull/382/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=382&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253667 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/382.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/382/head:pull/382 PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:21:09 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:21:09 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:05:48 GMT, Daniel D. 
Daugherty wrote: > 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 @AlanBateman - not sure if 'core-libs' was the right label to add so I figured I would ping you directly for a review. ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From iignatyev at openjdk.java.net Mon Sep 28 16:21:08 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 28 Sep 2020 16:21:08 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:05:48 GMT, Daniel D. Daugherty wrote: > 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 Marked as reviewed by iignatyev (Reviewer). test/jdk/ProblemList.txt line 859: > 857: tools/jlink/JLinkReproducibleTest.java 8217166 windows-all > 858: tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all > 859: tools/jlink/JLinkReproducibleTest.java 8217166 linux-aarch64 wouldn't it be better to "join" JLinkReproducibleTest entries? 
```suggestion tools/jlink/JLinkReproducibleTest.java 8217166 windows-all,linux-aarch64 tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:27:10 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:27:10 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: <-gAU0LOjZ46cZQiWUIM4mvrUDKdliwR6iK2cK1zRW6I=.0884b1ea-4d58-497c-93e7-40e64c780adc@github.com> References: <-gAU0LOjZ46cZQiWUIM4mvrUDKdliwR6iK2cK1zRW6I=.0884b1ea-4d58-497c-93e7-40e64c780adc@github.com> Message-ID: <7nF31lfzjQ8TEtH4jPzKCxXYtYPhk-Kl0j4amivaJIo=.8c611556-9958-4145-870a-d24d9b470d47@github.com> On Mon, 28 Sep 2020 16:22:17 GMT, Igor Ignatyev wrote: >> test/jdk/ProblemList.txt line 859: >> >>> 857: tools/jlink/JLinkReproducibleTest.java 8217166 windows-all >>> 858: tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all >>> 859: tools/jlink/JLinkReproducibleTest.java 8217166 linux-aarch64 >> >> wouldn't it be better to "join" JLinkReproducibleTest entries? >> ```suggestion >> tools/jlink/JLinkReproducibleTest.java 8217166 windows-all,linux-aarch64 >> tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all > > actually, you have to join them, as only jtreg honors only the last entry for a test -- > [CODETOOLS-7902481](https://bugs.openjdk.java.net/browse/CODETOOLS-7902481) Yes, I should use the existing entry. I didn't think to check for that. 
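Because only the last entry for a test is honored, duplicate ProblemList entries need to be folded into one line. The following is a rough sketch of that folding, not a tool used in this PR: `merge_problem_list` is a hypothetical helper, and it assumes each entry has exactly three whitespace-separated fields (test path, bug ids, platforms), with commas separating multiple bug ids or platforms.

```python
def merge_problem_list(lines):
    """Merge duplicate ProblemList entries so each test appears exactly once."""
    merged = {}  # test path -> (bug ids, platforms); dicts keep insertion order
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # this sketch simply drops blank lines and comments
        test, bugs, platforms = stripped.split()
        bug_list, platform_list = merged.setdefault(test, ([], []))
        for bug in bugs.split(","):
            if bug not in bug_list:
                bug_list.append(bug)
        for platform in platforms.split(","):
            if platform not in platform_list:
                platform_list.append(platform)
    return [
        f"{test} {','.join(bug_list)} {','.join(platform_list)}"
        for test, (bug_list, platform_list) in merged.items()
    ]


entries = [
    "tools/jlink/JLinkReproducibleTest.java 8217166 windows-all",
    "tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all",
    "tools/jlink/JLinkReproducibleTest.java 8217166 linux-aarch64",
]
for entry in merge_problem_list(entries):
    print(entry)
```

For the three entries quoted in the review, this yields the same two lines as the suggestion: one JLinkReproducibleTest entry covering `windows-all,linux-aarch64` and the unchanged CompressorPluginTest entry.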
------------- PR: https://git.openjdk.java.net/jdk/pull/382 From iignatyev at openjdk.java.net Mon Sep 28 16:27:10 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 28 Sep 2020 16:27:10 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: References: Message-ID: <-gAU0LOjZ46cZQiWUIM4mvrUDKdliwR6iK2cK1zRW6I=.0884b1ea-4d58-497c-93e7-40e64c780adc@github.com> On Mon, 28 Sep 2020 16:16:52 GMT, Igor Ignatyev wrote: >> 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 > > test/jdk/ProblemList.txt line 859: > >> 857: tools/jlink/JLinkReproducibleTest.java 8217166 windows-all >> 858: tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all >> 859: tools/jlink/JLinkReproducibleTest.java 8217166 linux-aarch64 > > wouldn't it be better to "join" JLinkReproducibleTest entries? > ```suggestion > tools/jlink/JLinkReproducibleTest.java 8217166 windows-all,linux-aarch64 > tools/jlink/plugins/CompressorPluginTest.java 8247407 generic-all actually, you have to join them, as jtreg honors only the last entry for a test -- [CODETOOLS-7902481](https://bugs.openjdk.java.net/browse/CODETOOLS-7902481) ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From minqi at openjdk.java.net Mon Sep 28 16:32:52 2020 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 28 Sep 2020 16:32:52 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:17:21 GMT, Igor Ignatyev wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Update existing entry for tools/jlink/JLinkReproducibleTest.java instead of creating a new one. > > Marked as reviewed by iignatyev (Reviewer). Should we put this for all platforms?
------------- PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:32:52 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:32:52 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 [v2] In-Reply-To: References: Message-ID: > 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: Update existing entry for tools/jlink/JLinkReproducibleTest.java instead of creating a new one. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/382/files - new: https://git.openjdk.java.net/jdk/pull/382/files/38997e41..1574c820 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=382&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=382&range=00-01 Stats: 4 lines in 1 file changed: 1 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/382.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/382/head:pull/382 PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:32:52 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:32:52 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:28:48 GMT, Yumin Qi wrote: >> Marked as reviewed by iignatyev (Reviewer). > > Should we put this for all platforms? Nope. We're currently seeing these failures on linux-aarch64 so that's what started this ProblemListing adventure... 
------------- PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:36:33 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:36:33 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:16:08 GMT, Daniel Fuchs wrote: >> Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: >> >> Update existing entry for tools/jlink/JLinkReproducibleTest.java instead of creating a new one. > > Marked as reviewed by dfuchs (Reviewer). @dfuch and @iignatev - please re-review when you get the chance... Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From iignatyev at openjdk.java.net Mon Sep 28 16:36:33 2020 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 28 Sep 2020 16:36:33 GMT Subject: RFR: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:32:52 GMT, Daniel D. Daugherty wrote: >> 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 > > Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: > > Update existing entry for tools/jlink/JLinkReproducibleTest.java instead of creating a new one. Marked as reviewed by iignatyev (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From dcubed at openjdk.java.net Mon Sep 28 16:44:11 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 28 Sep 2020 16:44:11 GMT Subject: Integrated: 8253667: ProblemList tools/jlink/JLinkReproducible{, 3}Test.java on linux-aarch64 In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:05:48 GMT, Daniel D.
Daugherty wrote: > 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 This pull request has now been integrated. Changeset: 821bd08c Author: Daniel D. Daugherty URL: https://git.openjdk.java.net/jdk/commit/821bd08c Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8253667: ProblemList tools/jlink/JLinkReproducible{,3}Test.java on linux-aarch64 Reviewed-by: dfuchs, iignatyev ------------- PR: https://git.openjdk.java.net/jdk/pull/382 From sgehwolf at openjdk.java.net Mon Sep 28 17:28:40 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 17:28:40 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly Message-ID: Account for interface files for swap and memory being reported independently. The cgroup v1-like value is now reported by adding the memory.max value to the memory.swap.max value. Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 ------------- Commit messages: - 8253727: [cgroups v2] Memory and swap limits reported incorrectly Changes: https://git.openjdk.java.net/jdk/pull/384/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253727 Stats: 33 lines in 3 files changed: 30 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/384.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/384/head:pull/384 PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Mon Sep 28 17:34:54 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 17:34:54 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high [v2] In-Reply-To: References: Message-ID: > Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was > incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change > accounts for that. 
Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/381/files - new: https://git.openjdk.java.net/jdk/pull/381/files/c3337bc6..25d131d8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=381&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=381&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/381.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/381/head:pull/381 PR: https://git.openjdk.java.net/jdk/pull/381 From sgehwolf at openjdk.java.net Mon Sep 28 17:34:54 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Mon, 28 Sep 2020 17:34:54 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:06:22 GMT, Severin Gehwolf wrote: > Note that due to JDK-8253727 not all tests are passing with newer crun runtime (0.8 was last working for me). All cgroups v2 container tests pass with #384 and this one on `crun 0.15`. ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From bobv at openjdk.java.net Mon Sep 28 17:44:36 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Mon, 28 Sep 2020 17:44:36 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 17:34:54 GMT, Severin Gehwolf wrote: >> Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was >> incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. 
This change >> accounts for that. > > Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. Marked as reviewed by bobv (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From bobv at openjdk.java.net Mon Sep 28 17:44:36 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Mon, 28 Sep 2020 17:44:36 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 16:00:57 GMT, Severin Gehwolf wrote: > @bobvandette Could you please take a look? Thanks! I just verified that using --memory-reservation using my crun version 0.11.42 does result in memory.low being set to the reservation value. ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From gziemski at openjdk.java.net Mon Sep 28 18:18:40 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 18:18:40 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 02:19:08 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Add AIX specific SA code" >> >> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > > src/hotspot/os/linux/os_linux.cpp line 944: > >> 942: // is no gap between the last two virtual memory regions. >> 943: >> 944: JavaThread *jt = (JavaThread *)thread; > > thread is already a JavaThread* - see line 903 Wow, good catch! I think that change came from a merge - I will have to carefully review the code again... 
------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 19:38:10 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 19:38:10 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 02:20:00 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Add AIX specific SA code" >> >> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > > src/hotspot/os/linux/os_linux.cpp line 1685: > >> 1683: filename); >> 1684: >> 1685: assert(Thread::current()->is_Java_thread(), "must be Java thread"); > > This assertion is already inside JavaThread::current(). Merge issue, fixed... > src/hotspot/os/posix/os_posix.cpp line 1456: > >> 1454: Thread* thread = Thread::current(); >> 1455: assert(thread->is_Java_thread(), "Must be JavaThread"); >> 1456: JavaThread *jt = (JavaThread *)thread; > > This change is unnecessary. Please restore the original line JavaThread::current() call. Merge issue, fixed. > src/hotspot/os/posix/os_posix.cpp line 1501: > >> 1499: } >> 1500: >> 1501: OSThreadWaitState osts(thread->osthread(), false /* not Object.wait() */); > > Unnecessary change - just use jt Merge issue, fixed. > src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 232: > >> 230: if (t != NULL) { >> 231: if(t->is_Java_thread()) { >> 232: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp line 461: > >> 459: if (t != NULL ){ >> 460: if(t->is_Java_thread()) { >> 461: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. 
> src/hotspot/os_cpu/bsd_zero/os_bsd_zero.cpp line 160: > >> 158: if (t != NULL ){ >> 159: if(t->is_Java_thread()) { >> 160: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp line 241: > >> 239: if (t != NULL ){ >> 240: if(t->is_Java_thread()) { >> 241: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_arm/os_linux_arm.cpp line 300: > >> 298: if (t != NULL ){ >> 299: if(t->is_Java_thread()) { >> 300: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_ppc/os_linux_ppc.cpp line 284: > >> 282: if (t != NULL) { >> 283: if(t->is_Java_thread()) { >> 284: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_s390/os_linux_s390.cpp line 284: > >> 282: if (t != NULL) { >> 283: if(t->is_Java_thread()) { >> 284: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp line 280: > >> 278: if (t != NULL ){ >> 279: if(t->is_Java_thread()) { >> 280: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. > src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp line 156: > >> 154: if (t != NULL ){ >> 155: if(t->is_Java_thread()) { >> 156: thread = (JavaThread*)t; > > Mis-merge? Please put this back to t->as_Java_thread() Merge issue, fixed. 
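For readers following the "(JavaThread*)t" versus "t->as_Java_thread()" comments above, here is a minimal sketch of why the checked form is preferred. It is not code from the patch: the Thread and JavaThread classes below are hypothetical stand-ins that keep only the type test, and java_thread_or_null() is an invented helper name that mirrors the shape of the quoted handler prologue.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for HotSpot's Thread and JavaThread. The real
// classes carry far more state; only the type test is modelled here.
class JavaThread;

class Thread {
 public:
  virtual ~Thread() {}
  virtual bool is_Java_thread() const { return false; }
  // Checked downcast: verifies the dynamic type before casting, which a
  // plain C-style cast such as (JavaThread*)t never does.
  JavaThread* as_Java_thread();
};

class JavaThread : public Thread {
 public:
  bool is_Java_thread() const override { return true; }
};

inline JavaThread* Thread::as_Java_thread() {
  assert(is_Java_thread() && "incorrect cast to JavaThread");
  return static_cast<JavaThread*>(this);
}

// Invented helper with the same shape as the quoted signal handler code:
// only a Java thread yields a non-null JavaThread*.
inline JavaThread* java_thread_or_null(Thread* t) {
  if (t != nullptr && t->is_Java_thread()) {
    return t->as_Java_thread();
  }
  return nullptr;
}
```

With the checked form, a wrong assumption about the thread type fails an assertion in debug builds instead of silently producing a bad pointer.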
------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:02:29 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:02:29 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 03:40:46 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Add AIX specific SA code" >> >> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > > src/hotspot/os/posix/signals_posix.cpp line 442: > >> 440: // >> 441: >> 442: #if defined(__APPLE__) > > This should be checking for BSD, which would include macOS. Fixed. > src/hotspot/os/posix/signals_posix.cpp line 456: > >> 454: #endif >> 455: >> 456: // Set thread signal mask (for some reason on AIX sigthreadmask() seems > > This comment block seems inappropriate for shared code. Are you suggesting this code might be wrong for AIX? If so it > shouldn't be pushed in this form. This is just existing AIX code, nothing new. I assume it's fine since that's how it was before. In our POSIX impl it's guarded by #if def(AIX) > src/hotspot/os/posix/signals_posix.cpp line 462: > >> 460: const int rc = ::pthread_sigmask(how, set, oset); >> 461: // return value semantics differ slightly for error case: >> 462: // pthread_sigmask returns error number, sigthreadmask -1 and sets global errno > > There is no sigthreadmask in use. That's how it was in the original AIX code, I will just add #if def(AIX) around set_thread_signal_mask() and unblock_program_error_signals() for now. I was hoping to figure out later, whether the other POSIX platforms can use that code as well... 
Added to the followup JDK issue > src/hotspot/os/posix/signals_posix.cpp line 471: > >> 469: // to POSIX, typical program error signals. If they happen while being blocked, >> 470: // they typically will bring down the process immediately. >> 471: bool unblock_program_error_signals() { > > This is AIX specific and should just be a file local function for AIX only. Yes, same with set_thread_signal_mask(), fixed. > src/hotspot/os/posix/signals_posix.cpp line 494: > >> 492: >> 493: int orig_errno = errno; // Preserve errno value over signal handler. >> 494: #if defined(__APPLE__) > > Again this should be checking for BSD, which will include macOS. > > There should be a better way to dispatch here. If the handler has a platform-independent name, and is declared in the > platform specific files, then it will link to that one definition. Right, will leave for future cleanup - here I am just putting the code together, no other fixes. Added to the followup JDK issue > src/hotspot/os/posix/signals_posix.cpp line 571: > >> 569: { SA_NODEFER, "SA_NODEFER" }, >> 570: #if defined(AIX) >> 571: { SA_ONSTACK, "SA_ONSTACK" }, > > Existing issue but we already have an entry for SA_ONSTACK. Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:08:54 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:08:54 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 12:15:49 GMT, Coleen Phillimore wrote: > Gerard, thank you for doing this! Thank you to Martin and Thomas for looking at the AIX code. Thank you for the review!
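On the SA_ONSTACK comment in the signals_posix.cpp review above: a table-driven flag printer only needs each flag listed once, and a duplicate row is easy to detect mechanically. The following is a rough sketch under stated assumptions, not the HotSpot implementation. SaFlagName, kSaFlagNames, describe_sa_flags_sketch() and sa_flag_table_has_duplicates() are names invented for illustration; only the SA_* constants come from <signal.h>.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <string>

// Hypothetical table entry: one sigaction flag and its printable name.
struct SaFlagName {
  unsigned int flag;
  const char*  name;
};

// Each flag appears exactly once. The explicit casts avoid narrowing
// errors on platforms where a flag is defined as an unsigned literal.
static const SaFlagName kSaFlagNames[] = {
  { static_cast<unsigned int>(SA_NOCLDSTOP), "SA_NOCLDSTOP" },
  { static_cast<unsigned int>(SA_ONSTACK),   "SA_ONSTACK"   },
  { static_cast<unsigned int>(SA_RESETHAND), "SA_RESETHAND" },
  { static_cast<unsigned int>(SA_RESTART),   "SA_RESTART"   },
  { static_cast<unsigned int>(SA_SIGINFO),   "SA_SIGINFO"   },
  { static_cast<unsigned int>(SA_NOCLDWAIT), "SA_NOCLDWAIT" },
  { static_cast<unsigned int>(SA_NODEFER),   "SA_NODEFER"   },
};

static const std::size_t kSaFlagCount =
    sizeof(kSaFlagNames) / sizeof(kSaFlagNames[0]);

// Returns the names of all set flags, in table order, joined by '|'.
// Bits that are not in the table are ignored in this sketch.
inline std::string describe_sa_flags_sketch(int flags) {
  const unsigned int bits = static_cast<unsigned int>(flags);
  std::string out;
  for (std::size_t i = 0; i < kSaFlagCount; i++) {
    if ((bits & kSaFlagNames[i].flag) != 0) {
      if (!out.empty()) out += "|";
      out += kSaFlagNames[i].name;
    }
  }
  return out;
}

// True if any flag value is listed more than once in the table.
inline bool sa_flag_table_has_duplicates() {
  for (std::size_t i = 0; i < kSaFlagCount; i++) {
    for (std::size_t j = i + 1; j < kSaFlagCount; j++) {
      if (kSaFlagNames[i].flag == kSaFlagNames[j].flag) return true;
    }
  }
  return false;
}
```

A duplicate row would make the same name print twice for one set bit, which is why a platform-guarded second SA_ONSTACK entry is redundant once the flag is already in the common part of the table.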
------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:08:55 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:08:55 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 04:10:15 GMT, David Holmes wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert "Add AIX specific SA code" >> >> This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > > src/hotspot/os/posix/signals_posix.cpp line 1310: > >> 1308: >> 1309: address PosixSignals::ucontext_get_pc(const ucontext_t* ctx) { >> 1310: #if defined(AIX) > > Again there must be a better way to do this dispatch. If the target were os::ucontext_get_pc, defined is os.cpp > then we would link to the current platforms version. Will be addressed in JDK-8253742 > src/hotspot/os/posix/signals_posix.cpp line 46: > >> 44: // suspend/resume >> 45: >> 46: // glibc on Bsd platform uses non-documented flag > > This was added for Linux and then copied to the BSD port, so the comment is inaccurate. This seems to be referring to > the SA_RESTORER flag which seems to be a linux kernel flag. But I'm unsure why we would need to even be aware of this. > I think future cleanup could be done here. Will be addressed in JDK-8253742 ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:22:56 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:22:56 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v5] In-Reply-To: References: Message-ID: > hi all, > > Please review this change that refactors common POSIX code into a separate > file. 
> > Currently there appears to be quite a bit of duplicated code among POSIX > platforms, which makes it difficult to apply a single fix to the signal code. > With this fix, we will only need to touch a single file for common POSIX > code fixes from now on. > > ---------------------------------------------------------------------------- > The APIs which moved from os/bsd/os_bsd.cpp to os/posix/PosixSignals.cpp: > > //////////////////////////////////////////////////////////////////////////////// > // signal support > void os::Bsd::signal_sets_init() > sigset_t* os::Bsd::unblocked_signals() > sigset_t* os::Bsd::vm_signals() > void os::Bsd::hotspot_sigmask(Thread* thread) > //////////////////////////////////////////////////////////////////////////////// > // sun.misc.Signal support > static void UserHandler(int sig, void *siginfo, void *context) > void* os::user_handler() > void* os::signal(int signal_number, void* handler) > void os::signal_raise(int signal_number) > int os::sigexitnum_pd() > static void jdk_misc_signal_init() > void os::signal_notify(int sig) > static int check_pending_signals() > int os::signal_wait() > //////////////////////////////////////////////////////////////////////////////// > // suspend/resume support > static void resume_clear_context(OSThread *osthread) > static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) > static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) > static int SR_initialize() > static int sr_notify(OSThread* osthread) > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > /////////////////////////////////////////////////////////////////////////////////// > // signal handling (except suspend/resume) > static void signalHandler(int sig, siginfo_t* info, void* uc) > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > static bool call_chained_handler(struct sigaction *actp, int sig, > siginfo_t *siginfo, void *context) > 
bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::install_signal_handlers() > static const char* get_signal_handler_name(address handler, > char* buf, int buflen) > static void print_signal_handler(outputStream* st, int sig, > char* buf, size_t buflen) > void os::run_periodic_checks() > void os::Bsd::check_signal_handler(int sig) > > ----------------------------------------------------------------------------- > The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: > > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > int os::Posix::get_signal_number(const char* signal_name) > int os::get_signal_number(const char* signal_name) > bool os::Posix::is_valid_signal(int sig) > bool os::Posix::is_sig_ignored(int sig) > const char* os::exception_name(int sig, char* buf, size_t size) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::print_siginfo(outputStream* os, const void* si0) > bool os::signal_thread(Thread* thread, int sig, const char* reason) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > struct sigaction* os::Posix::get_preinstalled_handler(int sig) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > > -------------------------------------------------------- > 
-------------------------------------------------------- > > DETAILS: > > -------------------------------------------------------- > Public APIs which are now internal static PosixSignals:: > > sigset_t* os::Bsd::vm_signals() > struct sigaction* os::Bsd::get_chained_signal_action(int sig) > int os::Bsd::get_our_sigflags(int sig) > void os::Bsd::set_our_sigflags(int sig, int flags) > void os::Bsd::set_signal_handler(int sig, bool set_installed) > void os::Bsd::check_signal_handler(int sig) > const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) > bool os::Posix::is_valid_signal(int sig) > const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) > void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) > const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) > void os::Posix::print_sa_flags(outputStream* st, int flags) > static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) > void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) > > ------------------------------------------------ > Public APIs which moved to public PosixSignals:: > > void os::Bsd::signal_sets_init() > void os::Bsd::hotspot_sigmask(Thread* thread) > bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) > void os::Bsd::install_signal_handlers() > bool os::Posix::is_sig_ignored(int sig) > int os::Posix::unblock_thread_signal_mask(const sigset_t *set) > address os::Posix::ucontext_get_pc(const ucontext_t* ctx) > void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) > > ---------------------------------------------------- > Internal APIs which are now public in PosixSignals:: > > static void jdk_misc_signal_init() > static int SR_initialize() > static bool do_suspend(OSThread* osthread) > static void do_resume(OSThread* osthread) > static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) 
> > -------------------------- > New APIs in PosixSignals:: > > static bool are_signal_handlers_installed(); Gerard Ziemski has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge branch 'master' into JDK-8252324 - Revert "Add AIX specific SA code" This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. - Add AIX specific SA code - Remove leftover AIX signal code - removed white spaces - Refactored common POSIX signal code into seperate file ------------- Changes: https://git.openjdk.java.net/jdk/pull/157/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=157&range=04 Stats: 5376 lines in 21 files changed: 1740 ins; 3529 del; 107 mod Patch: https://git.openjdk.java.net/jdk/pull/157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/157/head:pull/157 PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:22:56 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:22:56 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 04:37:03 GMT, David Holmes wrote: > Hi Gerard, > Thank you for tackling this long overdue cleanup and conciliation of the the signal management code! Great work on a > tedious task. > Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared > version. Sorry about that, I do understand how hard it was to review it and really appreciate it! > I'm glad the AIX folk have take an indepth look there. Me too! > I have a few minor comments in specific files below. > > I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. 
I based the name on the other files in that directory, i.e.: jvm_posix.cpp os_posix.cpp semaphore_posix.cpp threadLocalStorage_posix.cpp vmError_posix.cpp so I introduced: signals_posix.cpp > Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few > ifdefs or newly introduced os::pd_* functions. > Thanks, > David Thank you David very much for catching the merge issues and review! ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:22:56 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:22:56 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 20:12:36 GMT, Gerard Ziemski wrote: >> Hi Gerard, >> Thank you for tackling this long overdue cleanup and conciliation of the the signal management code! Great work on a >> tedious task. >> Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared >> version. I'm glad the AIX folk have take an indepth look there. >> I have a few minor comments in specific files below. >> >> I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. >> >> Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few >> ifdefs or newly introduced os::pd_* functions. >> Thanks, >> David > >> Hi Gerard, >> Thank you for tackling this long overdue cleanup and conciliation of the the signal management code! Great work on a >> tedious task. >> Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared >> version. > > Sorry about that, I do understand how hard it was to review it and really appreciate it! 
> > >> I'm glad the AIX folk have take an indepth look there. > > Me too! > >> I have a few minor comments in specific files below. >> >> I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. > > I based the name on the other files in that directory, i.e.: > > jvm_posix.cpp > os_posix.cpp > semaphore_posix.cpp > threadLocalStorage_posix.cpp > vmError_posix.cpp > > so I introduced: > > signals_posix.cpp > >> Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few >> ifdefs or newly introduced os::pd_* functions. >> Thanks, >> David > > Thank you David very much for catching the merge issues and review! I will merge again with current, review the changes again looking for weird merge issue, push here, then re-test... ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Mon Sep 28 20:22:56 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Mon, 28 Sep 2020 20:22:56 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 20:14:12 GMT, Gerard Ziemski wrote: >>> Hi Gerard, >>> Thank you for tackling this long overdue cleanup and conciliation of the the signal management code! Great work on a >>> tedious task. >>> Overall this looks okay but I confess it is very hard to compare each previous platform version with the new shared >>> version. >> >> Sorry about that, I do understand how hard it was to review it and really appreciate it! >> >> >>> I'm glad the AIX folk have take an indepth look there. >> >> Me too! >> >>> I have a few minor comments in specific files below. >>> >>> I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. 
>> >> I based the name on the other files in that directory, i.e.: >> >> jvm_posix.cpp >> os_posix.cpp >> semaphore_posix.cpp >> threadLocalStorage_posix.cpp >> vmError_posix.cpp >> >> so I introduced: >> >> signals_posix.cpp >> >>> Future cleanup: do we really need JVM_handle__signal to be cpu specific? I can imagine one copy of this with a few >>> ifdefs or newly introduced os::pd_* functions. >>> Thanks, >>> David >> >> Thank you David very much for catching the merge issues and review! > > I will merge again with current, review the changes again looking for weird merge issue, push here, then re-test... Darn, I closed it by mistake (I thought I was closing a comment), reopening. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From pchilanomate at openjdk.java.net Mon Sep 28 21:08:24 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Mon, 28 Sep 2020 21:08:24 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 15:00:20 GMT, Coleen Phillimore wrote: >> Hi all, >> >> Please review the following patch. Current ThreadCrashProtection() implementation uses static members which requires >> the use of Thread::muxAcquire() to allow only one user at a time. We can avoid this synchronization requirement if each >> thread has its own ThreadCrashProtection *data. I tested it builds on Linux, macOS and Windows. Since the >> JfrThreadSampler is the only one using this I run all the tests from test/jdk/jdk/jfr/. I also run some tests with JFR >> enabled while forcing a crash in OSThreadSampler::protected_task() and tests passed with several "Thread method sampler >> crashed" UL output. Also run tiers1-3. Thanks, Patricio > > src/hotspot/share/runtime/thread.hpp line 756: > >> 754: } >> 755: #endif >> 756: > > Could this be pushed down into osThread ? Yes, it can but it's not that clean I think given the osthread indirections. 
Here's how it looks: https://github.com/pchilano/jdk/commit/73a41ad867b4b7466cdddc87173828b4e80f8179 I think another alternative could be to remove the "#ifndef _WINDOWS" clause and define an empty check_crash_protection() method in os_windows.hpp ------------- PR: https://git.openjdk.java.net/jdk/pull/376 From stuefe at openjdk.java.net Tue Sep 29 05:10:15 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 05:10:15 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 14:36:42 GMT, Coleen Phillimore wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove empty lines from include sections > > Besides my comment about globals.hpp option MetaspaceGuardAllocations, my comments are minor things and I approve this > change. There might be some additional things we find that we'll want to change once this code is integrated. This is > a significant improvement to metaspace memory management. Great work, @tstuefe ! Hi Leo, > I just have a few cosmetic comments. Otherwise it looks good to me. > > If you sort by ASCII order, I get a few unordered includes: > ("." < "/" < "a-z,A-Z") > .... I will fix all those. > > blockTree.hpp > add a space after loop keyword "for(;;) {" -> "for (;;) {" > > blockTree.cpp > add a space after loop keyword "} while(0)" -> "} while (0)" (twice in file) > Okay > I still think we should try to get the initializer list indented > somewhat consistently. I agree, will do. > I think this sort of deep verification should be in a gtest instead, at least for blockTree. Note, this can be > fixed/discussed in a future RFE. :( This would affect a lot of my code since I do a lot of in-file verifications. Yeah, lets discuss this separately. 
> src/hotspot/share/memory/metaspace/classLoaderMetaspace.cpp line 64: > >> 62: , _space_type(space_type) >> 63: , _non_class_space_arena(NULL) >> 64: , _class_space_arena(NULL) > > Ok, I see what @lkorinth was commenting on. In the coding standard, the normal indentation level is 2, but we don't > specify it for initializers. Generally it seems what looks good, maybe 4, maybe aligned somewhat with the arguments. > That said to me it doesn't matter that much, BUT most of the code has the punctuation at the end of the line, not the > beginning. I think it looks weird at the beginning. Can you change these? Yes, I agree with you two. I will find a common form and fix all my initializations. > src/hotspot/share/memory/metaspace/commitMask.hpp line 92: > >> 90: check_pointer(start + word_size - 1); >> 91: } >> 92: #endif > > If these are for asserts, they can be defined in the .cpp file. Ok > src/hotspot/share/memory/metaspace/metachunk.hpp line 272: > >> 270: _vsnode = node; _base = base; _level = lvl; >> 271: _used_words = _committed_words = 0; _state = State::Free; >> 272: _next = _prev = _next_in_vs = _prev_in_vs = NULL; > > Is this the same as clear() ? Yes, I'll reuse clear() instead. > src/hotspot/share/memory/metaspace/metachunkList.hpp line 60: > >> 58: void add(Metachunk* c) { >> 59: // Note: contains is expensive (linear search). >> 60: ASSERT_SOMETIMES(contains(c) == false, "Chunk already in this list"); > > Can you make this something like: DEBUG_ONLY(verify_contains();) and hide ASSERT_SOMETIMES in the .cpp file? Ok > src/hotspot/share/memory/metaspace/metachunkList.hpp line 29: > >> 27: #define SHARE_MEMORY_METASPACE_METACHUNKLIST_HPP >> 28: >> 29: #include "memory/metaspace/counters.hpp" > > Is this header file needed now? Yes, since we still use IntCounter. 
> src/hotspot/share/runtime/globals.hpp line 1589: > >> 1587: \ >> 1588: product(bool, MetaspaceGuardAllocations, false, \ >> 1589: "Metapace allocations are guarded.") \ > > This should be DIAGNOSTIC or develop() but not product. Ok! ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Tue Sep 29 07:46:52 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 07:46:52 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v3] In-Reply-To: References: Message-ID: > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. > What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. 
> > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! 
> > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Make MetaspaceGuardAllocations a diagnostic flag - Style fixes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/9f68bab7..0e414946 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=01-02 Stats: 401 lines in 55 files changed: 122 ins; 91 del; 188 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From dholmes at openjdk.java.net Tue Sep 29 07:49:09 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 29 Sep 2020 07:49:09 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v4] In-Reply-To: References: <2ynfprWXACEqmw547JbfuTzIhMtref4P9tzNjVxagYs=.810d0024-d572-4d7e-83a5-4b8fc97e5b10@github.com> Message-ID: On Mon, 28 Sep 2020 20:12:36 GMT, Gerard Ziemski wrote: > > I would also suggest naming the file posixSignal.cpp/hpp as that is the more common naming pattern. > > I based the name on the other files in that directory, i.e.: > > jvm_posix.cpp Yes my bad. I was thinking of the class->file naming convention, but in this directory we're following the foo_ convention. 
------------- PR: https://git.openjdk.java.net/jdk/pull/157 From stuefe at openjdk.java.net Tue Sep 29 07:58:06 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 07:58:06 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v4] In-Reply-To: References: Message-ID: Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Make MetaspaceGuardAllocations a diagnostic flag (2) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/0e414946..e64d8f02 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Tue Sep 29 07:58:06 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 07:58:06 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 05:03:11 GMT, Thomas Stuefe wrote: >> Besides my comment about globals.hpp option MetaspaceGuardAllocations, my comments are minor things and I approve this >> change. There might be some additional things we find that we'll want to change once this code is integrated. This is >> a significant improvement to metaspace memory management. Great work, @tstuefe ! > > Hi Leo, > >> I just have a few cosmetic comments. Otherwise it looks good to me. >> >> If you sort by ASCII order, I get a few unordered includes: >> ("." < "/" < "a-z,A-Z") >> .... > > I will fix all those.
> >> >> blockTree.hpp >> add a space after loop keyword "for(;;) {" -> "for (;;) {" >> >> blockTree.cpp >> add a space after loop keyword "} while(0)" -> "} while (0)" (twice in file) >> > > Okay > >> I still think we should try to get the initializer list indented >> somewhat consistently. > > I agree, will do. Hi, thanks for the reviews! New version: - Hotspot style fixes - sorted all include sections, and removed empty lines from them; fixed whitespace issues in loop constructs; fixed include guard names; squashed empty lines - Moved verification code from commitmask.hpp to commitmask.cpp - Squashed multiple empty lines to one and condensed code a bit more - Changed ctor initializer lists: I decided on two formats. Two space indentation, colon follows prototype name, punctuation at the end, all in one line or one line per member: XXX() : _m1(xx), _m2(xx) { blub } XXX() : _m1(xx), _m2(xx), _m3(xx), _m4(xx) { blub } - Made MetaspaceGuardAllocations diagnostic ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Tue Sep 29 07:58:07 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 07:58:07 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 14:36:42 GMT, Coleen Phillimore wrote: > Besides my comment about globals.hpp option MetaspaceGuardAllocations, my comments are minor things and I approve this > change. There might be some additional things we find that we'll want to change once this code is integrated. This is a > significant improvement to metaspace memory management. Great work, @tstuefe ! Thank you Coleen! 
:) ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From sgehwolf at openjdk.java.net Tue Sep 29 11:16:56 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 11:16:56 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value. > > Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: Account for unlimited values in the test for memory.swap.max/memory.max ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/384/files - new: https://git.openjdk.java.net/jdk/pull/384/files/7dd62cfd..37a5979b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=00-01 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/384.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/384/head:pull/384 PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 11:19:45 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 11:19:45 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly In-Reply-To: References: Message-ID: <5BbK1XDcEv4UExbyrtrAXbmMna6iLJ9K2vt_u_UP1R0=.60ad58fb-5ecf-47bd-a9e5-6222e237561b@github.com> On Mon, 28 Sep 2020 17:22:21 GMT, Severin Gehwolf wrote: > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value. 
> > Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From jlahoda at openjdk.java.net Tue Sep 29 14:07:57 2020 From: jlahoda at openjdk.java.net (Jan Lahoda) Date: Tue, 29 Sep 2020 14:07:57 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v6] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Fri, 25 Sep 2020 00:41:47 GMT, Vicente Romero wrote: >> Co-authored-by: Vicente Romero >> Co-authored-by: Harold Seigel >> Co-authored-by: Jonathan Gibbons >> Co-authored-by: Brian Goetz >> Co-authored-by: Maurizio Cimadamore >> Co-authored-by: Joe Darcy >> Co-authored-by: Chris Hegarty >> Co-authored-by: Jan Lahoda > > Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: > > adding missing changes to some tests Marked as reviewed by jlahoda (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/290 From jlahoda at openjdk.java.net Tue Sep 29 14:07:57 2020 From: jlahoda at openjdk.java.net (Jan Lahoda) Date: Tue, 29 Sep 2020 14:07:57 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> <0H-sMIm0mwGW2f2OxwAwMGBtxDf2BUf7ds3tGEgYXrc=.849aab2b-4cf3-49f7-8e49-ad1df75cb0bb@github.com> Message-ID: On Fri, 25 Sep 2020 15:47:30 GMT, Peter Levart wrote: >> [CSR: Record Classes](https://bugs.openjdk.java.net/browse/JDK-8253605) > > Hi @vicente-romero-oracle , note that besides tests, there is also a JMH benchmark that measures the performance of > records deserialization (org.openjdk.bench.java.io.RecordDeserialization) which forced us to modify the build procedure > for all benchmarks to include --enable-preview option in JDK 15 and backports (see > https://bugs.openjdk.java.net/browse/JDK-8248135). If you undo this change in JDK 16 then also the problem described in > https://bugs.openjdk.java.net/browse/JDK-8250669 and https://bugs.openjdk.java.net/browse/JDK-8248429 will disappear. > After that, perhaps undoing the same for JDK 15 and backports together with removing the benchmark is also possible to > resolve the issues in older releases as most development will probably happen in JDK 16 then and so the need for > performance testing will mostly be needed in there. We still have to figure out how to enable having benchmarks for > preview features as in the future there most probably will be a need for that. Langtools code looks good to me. We probably can also remove the RECORDS entry from PreviewFeature.Feature.
------------- PR: https://git.openjdk.java.net/jdk/pull/290 From adinn at openjdk.java.net Tue Sep 29 14:13:12 2020 From: adinn at openjdk.java.net (Andrew Dinn) Date: Tue, 29 Sep 2020 14:13:12 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high [v2] In-Reply-To: References: Message-ID: <6J36GkHW2lcFezGcVVgLwHJJeOveUsoJdPgqExAFw40=.94a29ae2-bad5-4b8c-9e6e-67af2e60eb66@github.com> On Mon, 28 Sep 2020 17:34:54 GMT, Severin Gehwolf wrote: >> Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was >> incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change >> accounts for that. > > Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. Looks fine to me, especially if Bob says it's ok. ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/381 From sgehwolf at openjdk.java.net Tue Sep 29 14:42:15 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 14:42:15 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high [v2] In-Reply-To: <6J36GkHW2lcFezGcVVgLwHJJeOveUsoJdPgqExAFw40=.94a29ae2-bad5-4b8c-9e6e-67af2e60eb66@github.com> References: <6J36GkHW2lcFezGcVVgLwHJJeOveUsoJdPgqExAFw40=.94a29ae2-bad5-4b8c-9e6e-67af2e60eb66@github.com> Message-ID: <_VuSfXAc97_Xbyi1cJw8NC4HHniysvJfMog8M4LSOgU=.5fd671a4-4e0b-4bb1-9c4f-a4672479c388@github.com> On Tue, 29 Sep 2020 14:10:42 GMT, Andrew Dinn wrote: > Looks fine to me, especially if Bob says it's ok. Thanks for the review, Andrew! 
------------- PR: https://git.openjdk.java.net/jdk/pull/381 From bobv at openjdk.java.net Tue Sep 29 14:43:58 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 14:43:58 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 11:16:56 GMT, Severin Gehwolf wrote: >> Account for interface files for swap and memory being reported independently. >> The cgroup v1-like value is now reported by adding the memory.max value to >> the memory.swap.max value. >> >> Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > Account for unlimited values in the test for memory.swap.max/memory.max > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! I made some comments before this most recent change. Did you see them? src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 168: > 166: char* mem_swp_limit_str = mem_swp_limit_val(); > 167: jlong swap_limit = limit_from_str(mem_swp_limit_str); > 168: if (swap_limit >= 0) { In the recent fix for JDK-8250984, we added support for systems that don't have swap accounting enabled. CgroupV2 currently has a bug where you can't even set memory limits in this config but assuming they will fix that, we might want to add that support for v2 now. If the memory.swap.max file is not avail, then we return whatever memory limit is. src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 170: > 168: if (swap_limit >= 0) { > 169: jlong memory_limit = read_memory_limit_in_bytes(); > 170: if (memory_limit == -1) { Shouldn't this be an assert. If swap limit is set, you need a memory limit set. 
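A minimal sketch of the calculation under review here, assuming the cgroup v2 interface files memory.max and memory.swap.max each hold either the literal "max" or a byte count. The names parse_limit, memory_and_swap_limit, kUnlimited and kError are hypothetical and chosen for this sketch only; this is not the HotSpot code. Where the comment above suggests an assert for a swap limit without a memory limit, the sketch simply passes the memory result through.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical sentinel values, used only in this sketch.
constexpr int64_t kUnlimited = -1;  // the file contained "max"
constexpr int64_t kError     = -2;  // the file was missing or unparsable

// Parse the text of a cgroup v2 limit file: "max" or a non-negative byte count.
int64_t parse_limit(const char* text) {
  if (text == nullptr) return kError;
  if (std::strncmp(text, "max", 3) == 0) return kUnlimited;
  char* end = nullptr;
  long long value = std::strtoll(text, &end, 10);
  if (end == text || value < 0) return kError;
  return static_cast<int64_t>(value);
}

// cgroup v1 reported memory and swap as one combined limit; v2 reports them
// separately, so the v1-like value is the sum of the two v2 values.
int64_t memory_and_swap_limit(const char* memory_max, const char* swap_max) {
  int64_t swap = parse_limit(swap_max);
  if (swap < 0) return swap;    // unlimited swap or read error: pass through
  int64_t memory = parse_limit(memory_max);
  if (memory < 0) return memory;  // swap bounded but memory not: pass through
  return memory + swap;
}
```

With a 200 MiB memory.max and a 100 MiB memory.swap.max the sketch reports 300 MiB as the combined limit; an unlimited memory.swap.max yields the unlimited sentinel.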
src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java line 297: > 295: return swapLimit; > 296: } > 297: Do you have the same issue with getMemoryAndSwapUsage "memory.swap.current", where you have to add memory usage + swap usage? I did a quick test and found that the memory.swap.current file does not appear to include memory usage. I set a memory and swap limit and memory.current was 3M but memory.swap.current was 0. If you apply the JDK-8250984 fix for v2, getMemoryAndSwapUsage will have to be fixed as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 14:50:40 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 14:50:40 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 14:41:08 GMT, Bob Vandette wrote: > > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! > > I made some comments before this most recent change. Did you see them? @bobvandette No, unfortunately not. There should be a mailing list reply on the lists if there was anything public, but I don't see any here: http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069262.html Not sure what went wrong :-( ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From bobv at openjdk.java.net Tue Sep 29 14:53:43 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 14:53:43 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 14:47:50 GMT, Severin Gehwolf wrote: > > > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! > > > > > > I made some comments before this most recent change. Did you see them? > > @bobvandette No, unfortunately not.
There should be a mailing list reply on the lists if there was anything public, but > I don't see any here: http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069262.html > > Not sure what went wrong :-( I responded directly in the github page under "Files Changed" and annotated the source changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 15:07:32 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 15:07:32 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 18:01:28 GMT, Bob Vandette wrote: >> Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: >> >> Account for unlimited values in the test for memory.swap.max/memory.max > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 168: > >> 166: char* mem_swp_limit_str = mem_swp_limit_val(); >> 167: jlong swap_limit = limit_from_str(mem_swp_limit_str); >> 168: if (swap_limit >= 0) { > > In the recent fix for JDK-8250984, we added support for systems that don't have swap accounting enabled. CgroupV2 > currently has a bug where you can't even set memory limits in this config but assuming they will fix that, we might > want to add that support for v2 now. If the memory.swap.max file is not avail, then we return whatever memory limit is. Hmm, I'd prefer if we kept those two changes separate. This makes it easier to backport and makes intent more obvious. FWIW, I'd like to bring this to 15u. I've filed https://bugs.openjdk.java.net/browse/JDK-8253797 to track this. > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 170: > >> 168: if (swap_limit >= 0) { >> 169: jlong memory_limit = read_memory_limit_in_bytes(); >> 170: if (memory_limit == -1) { > > Shouldn't this be an assert. If swap limit is set, you need a memory limit set. OK, fine with me. I'll fix it. 
> Do you have the same issue with getMemoryAndSwapUsage "memory.swap.current", where you have a to add memory usage + > swap usage? That's a good point. I'll fix that as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 15:07:31 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 15:07:31 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 14:51:25 GMT, Bob Vandette wrote: > > > > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! > > > > > > > > > I made some comments before this most recent change. Did you see them? > > > > > > @bobvandette No, unfortunately not. There should be a mailing list reply on the lists if there was anything public, but > > I don't see any here: http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069262.html > > Not sure what went wrong :-( > > I responded directly in the github page under "Files Changed" and annotated the source changes. I see. If you did it as part of a "Review" and didn't finish the review, then it would only be visible to yourself (as a draft). "Add a single comment" option (or whatever it's called) shows up right away, AFAIK. 
------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 15:14:24 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 15:14:24 GMT Subject: RFR: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high [v2] In-Reply-To: <_VuSfXAc97_Xbyi1cJw8NC4HHniysvJfMog8M4LSOgU=.5fd671a4-4e0b-4bb1-9c4f-a4672479c388@github.com> References: <6J36GkHW2lcFezGcVVgLwHJJeOveUsoJdPgqExAFw40=.94a29ae2-bad5-4b8c-9e6e-67af2e60eb66@github.com> <_VuSfXAc97_Xbyi1cJw8NC4HHniysvJfMog8M4LSOgU=.5fd671a4-4e0b-4bb1-9c4f-a4672479c388@github.com> Message-ID: On Tue, 29 Sep 2020 14:39:50 GMT, Severin Gehwolf wrote: >> Looks fine to me, especially if Bob says it's ok. > >> Looks fine to me, especially if Bob says it's ok. > > Thanks for the review, Andrew! The early implementation of cgroups v2 support was done with crun 0.8 and it contained a bug which set memory.high over memory.low when --memory-reservation was being used as a CLI option. This bug has been fixed in later crun versions, starting with crun 0.11. Use memory.low in OpenJDK as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From bobv at openjdk.java.net Tue Sep 29 15:15:53 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 15:15:53 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v2] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 15:00:43 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 168: >> >>> 166: char* mem_swp_limit_str = mem_swp_limit_val(); >>> 167: jlong swap_limit = limit_from_str(mem_swp_limit_str); >>> 168: if (swap_limit >= 0) { >> >> In the recent fix for JDK-8250984, we added support for systems that don't have swap accounting enabled. CgroupV2 >> currently has a bug where you can't even set memory limits in this config but assuming they will fix that, we might >> want to add that support for v2 now. 
If the memory.swap.max file is not avail, then we return whatever memory limit is. > > Hmm, I'd prefer if we kept those two changes separate. This makes it easier to backport and makes intent more obvious. > FWIW, I'd like to bring this to 15u. I've filed https://bugs.openjdk.java.net/browse/JDK-8253797 to track this. Ok. ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 15:18:58 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 15:18:58 GMT Subject: Integrated: 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 15:59:35 GMT, Severin Gehwolf wrote: > Tests using `--memory-reservation` started to fail with newer `crun` cgroups v2-capable runtime. It turns out it was > incorrectly setting `memory.high` in an early version and got fixed to set `memory.low` now instead. This change > accounts for that. This pull request has now been integrated. Changeset: ff6843ca Author: Severin Gehwolf URL: https://git.openjdk.java.net/jdk/commit/ff6843ca Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod 8253714: [cgroups v2] Soft memory limit incorrectly using memory.high The early implementation of cgroups v2 support was done with crun 0.8 and it contained a bug which set memory.high over memory.low when --memory-reservation was being used as a CLI option. This bug has been fixed in later crun versions, starting with crun 0.11. Use memory.low in OpenJDK as well. 
Reviewed-by: bobv, adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/381 From iklam at openjdk.java.net Tue Sep 29 15:21:06 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 29 Sep 2020 15:21:06 GMT Subject: Integrated: 8253548: jvmFlagAccess.cpp: clang 9.0.0 format specifier error In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 18:19:26 GMT, Ioi Lam wrote: > Please review this trivial fix for an older clang compiler printf format warning. As mentioned on > [JDK-8253548](https://bugs.openjdk.java.net/browse/JDK-8253548), there's an existing typecast in [logFileOutput.cpp]( > https://github.com/openjdk/jdk/blame/efd10546865998028aa6d34cf939ca0de67a90fc/hotspot/src/share/vm/logging/logFileOutput.cpp#L194) > for the same reason that SIZE_MAX is not actually of the size_t type for older clang compilers. Testing with mach5 > tier1. This pull request has now been integrated. Changeset: b1ce6bdb Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/b1ce6bdb Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8253548: jvmFlagAccess.cpp: clang 9.0.0 format specifier error Reviewed-by: lfoltan ------------- PR: https://git.openjdk.java.net/jdk/pull/365 From stuefe at openjdk.java.net Tue Sep 29 15:21:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 15:21:13 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote: > For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report > thread's state after thread terminated. > But the query/report currently returns wrong state. E.g.
a terminated thread appears to be alive and seemingly has a valid > thread stack, etc. > This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated > thread from live thread. > Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). > > Note: Java thread does not have such issue, its thread object is deleted before thread terminates. Hi Zhengyu, I'm updating my review after reading through your conversation with David. Save for small nits this seems fine. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From stuefe at openjdk.java.net Tue Sep 29 15:21:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 29 Sep 2020 15:21:13 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> Message-ID: On Mon, 28 Sep 2020 04:47:47 GMT, David Holmes wrote: >> For thread, e.g. G1ConcurrentMarkThread, there is nothing to prevent calling _cm_thread->print_on(tty) after it >> terminated, although, I can not find a case right now. >> You prefer an assertion instead? > > I prefer no change to this method. I don't see that we need to do anything special even if a ZOMBIE could be > encountered. If Thread::print_on() (and ThreadsSMRSupport::print_info_on(Thread..)) cannot be called for a Zombie thread, I'd prefer an assertion testing that. >> so, you prefer "ShouldNotReachHere()" ? > > There's no point putting a ShouldNotReachHere() in error handling code as we will just trip a secondary error. > If we want to print something then perhaps "unknown state (no osThread)" ?
> Also I only wanted the ThreadSMRSupport::print_info_on to be excluded for Zombies, but you've excluded it for the > no-osThread case as well. I think based on what Dan said we can just put that back and call it unconditionally. Thanks. +1 for "unknown state". "Aborted" is misleading since we do not know. Could have crashed right at thread creation (or is that what "Aborted" means?) ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From vromero at openjdk.java.net Tue Sep 29 16:49:36 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Tue, 29 Sep 2020 16:49:36 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v7] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 15 commits: - Merge branch 'JDK-8246774' of https://github.com/vicente-romero-oracle/jdk into JDK-8246774 - adding missing changes to some tests - Merge branch 'master' into JDK-8246774 - modifiying @since from 14 to 16 - Merge pull request #1 from ChrisHegarty/record-serial-tests Remove preview args from JDK tests - Remove preview args from ObjectMethodsTest - Remove preview args from record serialization tests - removing the javax.lang.model related code to be moved to a separate bug - 8246774: Record Classes (final) implementation Co-authored-by: Vicente Romero Co-authored-by: Harold Seigel Co-authored-by: Jonathan Gibbons Co-authored-by: Brian Goetz Co-authored-by: Maurizio Cimadamore Co-authored-by: Joe Darcy Co-authored-by: Chris Hegarty Co-authored-by: Jan Lahoda - adding missing changes to some tests - ... and 5 more: https://git.openjdk.java.net/jdk/compare/d5be8294...482fedec ------------- Changes: https://git.openjdk.java.net/jdk/pull/290/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=06 Stats: 535 lines in 109 files changed: 32 ins; 278 del; 225 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From sgehwolf at openjdk.java.net Tue Sep 29 18:22:40 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 18:22:40 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v3] In-Reply-To: References: Message-ID: > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value. > > Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: 8253727: [cgroups v2] Memory and swap limits reported incorrectly ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/384/files - new: https://git.openjdk.java.net/jdk/pull/384/files/37a5979b..380cd49e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=01-02 Stats: 2342 lines in 104 files changed: 1251 ins; 577 del; 514 mod Patch: https://git.openjdk.java.net/jdk/pull/384.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/384/head:pull/384 PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 18:22:40 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 18:22:40 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v3] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 14:51:25 GMT, Bob Vandette wrote: >>> > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! >>> >>> I made some comments before this most recent change. Did you see them? >> >> @bobvandette No, unfortunately not. There should be a mailing list reply on the lists if there was anything public, but >> I don't see any here: http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069262.html >> >> Not sure what went wrong :-( > >> > > @bobvandette Could you please look at this as well? It would be much appreciated. Thanks! >> > >> > >> > I made some comments before this most recent change. Did you see them? >> >> @bobvandette No, unfortunately not. 
There should be a mailing list reply on the lists if there was anything public, but >> I don't see any here: http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069262.html >> >> Not sure what went wrong :-( > > I responded directly in the github page under "Files Changed" and annotated the source changes. @bobvandette How does the updated patch look to you? ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Tue Sep 29 18:29:47 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Tue, 29 Sep 2020 18:29:47 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value. > > Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: 8253727: [cgroups v2] Memory and swap limits reported incorrectly ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/384/files - new: https://git.openjdk.java.net/jdk/pull/384/files/380cd49e..cc9b1087 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/384.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/384/head:pull/384 PR: https://git.openjdk.java.net/jdk/pull/384 From vromero at openjdk.java.net Tue Sep 29 18:34:07 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Tue, 29 Sep 2020 18:34:07 GMT Subject: RFR: 8246774: Record Classes (final) implementation [v8] In-Reply-To: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: > Co-authored-by: Vicente Romero > Co-authored-by: Harold Seigel > Co-authored-by: Jonathan Gibbons > Co-authored-by: Brian Goetz > Co-authored-by: Maurizio Cimadamore > Co-authored-by: Joe Darcy > Co-authored-by: Chris Hegarty > Co-authored-by: Jan Lahoda Vicente Romero has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/290/files - new: https://git.openjdk.java.net/jdk/pull/290/files/482fedec..76e3d278 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=290&range=06-07 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/290.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/290/head:pull/290 PR: https://git.openjdk.java.net/jdk/pull/290 From bobv at openjdk.java.net Tue Sep 29 19:03:11 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 19:03:11 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 18:29:47 GMT, Severin Gehwolf wrote: >> Account for interface files for swap and memory being reported independently. >> The cgroup v1-like value is now reported by adding the memory.max value to >> the memory.swap.max value. >> >> Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 > > Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The > incremental views will show differences compared to the previous content of the PR. The pull request contains one new > commit since the last revision: > 8253727: [cgroups v2] Memory and swap limits reported incorrectly Marked as reviewed by bobv (Committer). test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java line 248: > 246: newVal = valSwap; > 247: } else { > 248: // ignore error values for valMemory, since the container runtime Add an assert?? 
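A minimal sketch of the arithmetic described in the PR summary quoted above (the cgroup v1-like value is memory.max plus memory.swap.max), offered only as an illustration and not as the code under review. It assumes each cgroup v2 interface file holds either the literal `max` or a byte count; `kUnlimited`, `parse_limit` and `memory_and_swap_limit` are hypothetical names, and treating the sum as unlimited when either input is unlimited is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sentinel meaning "no limit configured"; not a JDK constant.
constexpr std::int64_t kUnlimited = -1;

// Parse the text of a cgroup v2 limit file (e.g. memory.max or
// memory.swap.max): either the literal "max" or a decimal byte count,
// usually followed by a newline.
std::int64_t parse_limit(const std::string& text) {
    const std::size_t end = text.find_last_not_of(" \t\r\n");
    const std::string trimmed =
        (end == std::string::npos) ? std::string() : text.substr(0, end + 1);
    if (trimmed == "max") {
        return kUnlimited;
    }
    return std::stoll(trimmed);  // throws std::invalid_argument on garbage
}

// Combine the two independently reported limits into one v1-style
// "memory plus swap" value. Assumption of this sketch: if either side
// is unlimited, the combined value is reported as unlimited too.
std::int64_t memory_and_swap_limit(std::int64_t mem, std::int64_t swap) {
    assert(mem >= kUnlimited && swap >= kUnlimited);  // reject error values
    if (mem == kUnlimited || swap == kUnlimited) {
        return kUnlimited;
    }
    return mem + swap;
}
```

For example, a 200 MiB memory.max together with a 100 MiB memory.swap.max would be reported as a 300 MiB combined limit under these assumptions.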
------------- PR: https://git.openjdk.java.net/jdk/pull/384 From bobv at openjdk.java.net Tue Sep 29 19:03:12 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 19:03:12 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 19:00:15 GMT, Bob Vandette wrote: >> Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The >> incremental views will show differences compared to the previous content of the PR. The pull request contains one new >> commit since the last revision: >> 8253727: [cgroups v2] Memory and swap limits reported incorrectly > > test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java line 248: > >> 246: newVal = valSwap; >> 247: } else { >> 248: // ignore error values for valMemory, since the container runtime > > Add an assert?? Other than that nit, it looks fine. ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From bobv at openjdk.java.net Tue Sep 29 19:50:58 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 19:50:58 GMT Subject: RFR: 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o… In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 15:52:43 GMT, Harold Seigel wrote: > Please review this small change to remove "--memory 200m" option from TestUseContainerSupport.java. This can cause > test failures on systems where swap accounting is disabled. Marked as reviewed by bobv (Committer).
------------- PR: https://git.openjdk.java.net/jdk/pull/303 From bobv at openjdk.java.net Tue Sep 29 20:02:13 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 20:02:13 GMT Subject: RFR: 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o swap limit capabilities In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 15:52:43 GMT, Harold Seigel wrote: > Please review this small change to remove "--memory 200m" option from TestUseContainerSupport.java. This can cause > test failures on systems where swap accounting is disabled. Marked as reviewed by bobv (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/303 From coleenp at openjdk.java.net Tue Sep 29 20:02:13 2020 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 29 Sep 2020 20:02:13 GMT Subject: RFR: 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o swap limit capabilities In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 15:52:43 GMT, Harold Seigel wrote: > Please review this small change to remove "--memory 200m" option from TestUseContainerSupport.java. This can cause > test failures on systems where swap accounting is disabled. Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/303 From hseigel at openjdk.java.net Tue Sep 29 20:02:13 2020 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 29 Sep 2020 20:02:13 GMT Subject: Integrated: 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o swap limit capabilities In-Reply-To: References: Message-ID: <98-B5_UGIrZgMaHOXvhStdjWYvkg4o-3TmUuDkMabns=.7f1a6ceb-7c2a-4d84-8c1c-f84a5692f59f@github.com> On Tue, 22 Sep 2020 15:52:43 GMT, Harold Seigel wrote: > Please review this small change to remove "--memory 200m" option from TestUseContainerSupport.java. This can cause > test failures on systems where swap accounting is disabled. This pull request has now been integrated. 
Changeset: 2fe0a5d7 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/2fe0a5d7 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o swap limit capabilities Reviewed-by: bobv, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/303 From bobv at openjdk.java.net Tue Sep 29 20:02:13 2020 From: bobv at openjdk.java.net (Bob Vandette) Date: Tue, 29 Sep 2020 20:02:13 GMT Subject: RFR: 8253476: TestUseContainerSupport.java fails on some Linux kernels w/o swap limit capabilities In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 19:57:14 GMT, Coleen Phillimore wrote: >> Please review this small change to remove "--memory 200m" option from TestUseContainerSupport.java. This can cause >> test failures on systems where swap accounting is disabled. > > Marked as reviewed by coleenp (Reviewer). Setting a 200MB container limit when swap accounting is not enabled in the kernel can cause a SEGV, depending on how much RAM is available on the host. When we test with UseContainerSupport disabled and a container limit imposed by docker, the VM assumes it has all host memory available to it. The VM ergonomics will then assume it has a lot of memory and will potentially select a very large heap. When the heap is initialized, it will exceed the container's limit, causing the SEGV.
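The reasoning above can be illustrated with a rough sketch. It assumes a heavily simplified model of heap ergonomics in which the default maximum heap is 25% of the memory the VM believes it has (the default of MaxRAMPercentage); the real sizing logic has more inputs. `assumed_max_heap` and `heap_exceeds_limit` are hypothetical helpers, not HotSpot functions.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;
constexpr std::uint64_t GiB = 1024ull * MiB;

// Simplified model: the maximum heap is a fixed percentage of the memory
// the VM believes is available. With container support disabled, that
// "visible" memory is the host's RAM, not the container limit.
std::uint64_t assumed_max_heap(std::uint64_t visible_memory,
                               unsigned percent = 25) {
    assert(percent <= 100);
    return visible_memory * percent / 100;
}

// True when a heap sized from host memory would not fit inside the
// limit that the container runtime enforces.
bool heap_exceeds_limit(std::uint64_t host_memory,
                        std::uint64_t container_limit) {
    return assumed_max_heap(host_memory) > container_limit;
}
```

Under this model a 16 GiB host yields a 4 GiB heap, far above a 200 MiB limit, while a 512 MiB host yields 128 MiB and stays below it, which is consistent with the failure depending on how much RAM the host has.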
------------- PR: https://git.openjdk.java.net/jdk/pull/303 From sspitsyn at openjdk.java.net Wed Sep 30 00:08:44 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 30 Sep 2020 00:08:44 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v6] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Tue, 29 Sep 2020 14:05:27 GMT, Jan Lahoda wrote: >> Vicente Romero has updated the pull request incrementally with one additional commit since the last revision: >> >> adding missing changes to some tests > > Marked as reviewed by jlahoda (Reviewer). The instrument tests update looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Wed Sep 30 00:21:20 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Wed, 30 Sep 2020 00:21:20 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v6] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Wed, 30 Sep 2020 00:06:19 GMT, Serguei Spitsyn wrote: >> Marked as reviewed by jlahoda (Reviewer). > > The instrument tests update looks good to me. @sspitsyn thanks for the review. Please add yourself as a reviewer ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From vromero at openjdk.java.net Wed Sep 30 00:24:29 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Wed, 30 Sep 2020 00:24:29 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v6] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: <9lpq7-8Hjtvui3jtyOxfKKRCGa1yObk5po4zn92Waog=.93761bb3-a0cc-4cd6-95a5-28d95c11012d@github.com> On Wed, 30 Sep 2020 00:18:17 GMT, Vicente Romero wrote: >> The instrument tests update looks good to me. 
> > @sspitsyn thanks for the review. Please add yourself as a reviewer @plevart interesting, but I think that I prefer to update those benchmarks in a follow-up patch in order to keep this patch as simple as possible. I will file a follow-up issue in JIRA and link it to JDK-8246774 ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From sspitsyn at openjdk.java.net Wed Sep 30 00:30:49 2020 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 30 Sep 2020 00:30:49 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v8] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Tue, 29 Sep 2020 18:34:07 GMT, Vicente Romero wrote: >> Co-authored-by: Vicente Romero >> Co-authored-by: Harold Seigel >> Co-authored-by: Jonathan Gibbons >> Co-authored-by: Brian Goetz >> Co-authored-by: Maurizio Cimadamore >> Co-authored-by: Joe Darcy >> Co-authored-by: Chris Hegarty >> Co-authored-by: Jan Lahoda > > Vicente Romero has refreshed the contents of this pull request, and previous commits have been removed. The incremental > views will show differences compared to the previous content of the PR. Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From david.holmes at oracle.com Wed Sep 30 00:48:37 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Sep 2020 10:48:37 +1000 Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> Message-ID: <0761e39c-72b0-d058-c65b-7c392e62551e@oracle.com> Hi Thomas, On 30/09/2020 1:21 am, Thomas Stuefe wrote: > On Mon, 28 Sep 2020 04:47:47 GMT, David Holmes wrote: > >>> For thread, e.g. 
G1ConcurrentMarkThread, there is nothing to prevent calling _cm_thread->print_on(tty) after it >>> terminated, although, I can not find a case right now. >>> You prefer an assertion instead? >> >> I prefer no change to this method. I don't see that we need to do anything special even if a ZOMBIE could be >> encountered. > > If Thread::print_on() (and ThreadsSMRSupport::print_info_on(Thread..)) cannot be called for a Zombie thread, I'd prefer > an assertion testing that. There is nothing to say they can't be called (even if not presently), hence no reason for an assert or any change in these methods. >>> so, you prefer "ShouldNotReachHere()" ? >> >> There's no point putting a ShouldNotReachHere() in error handling code as we will just trip a secondary error. >> If we want to print something then perhaps "unknown state (no osThread)" ? >> Also I only wanted the ThreadSMRSupport::print_info_on to be excluded for Zombies, but you've excluded it for the >> no-osThread case as well. I think based on what Dan said we can just put that back and call it unconditionally. Thanks. > > +1 for "unknown state". "Aborted" is misleading since we do not know. Could have crashed right at thread creation (or > is that what "Aborted" means?) "aborted" will mean different things to different people. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/341 > From iklam at openjdk.java.net Wed Sep 30 01:06:51 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 30 Sep 2020 01:06:51 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v4] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 07:58:06 GMT, Thomas Stuefe wrote: >> Hi all, >> >> this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the >> delay, had vacation then the entrance of Skara delayed things a bit. >> For the delta diff please see [3].
>> >> This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all >> feedback individually in this PR body, but I incorporated almost all into the new revision. >> What changed since the last version: >> >> - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the >> group consent. >> >> - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for >> overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth >> class loader which made it rather pointless. Now I run it always. >> >> - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. >> >> - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) >> >> Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. >> >> - In ChunkManager::purge() I improved the comments after discussions with Leo. >> >> - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due >> to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run >> every time. >> >> - Fixed indentation issues as Leo requested >> >> - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested >> >> - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes >> and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more >> flexible since it allows us to have list with purge-able and non-purge-able nodes. >> >> - and various smaller fixes, mainly on request of Leo. 
>> >> @lkorinth: >> >>> VirtualSpaceNode.hpp >>> >>>102 // Start pointer of the area. >>>103 MetaWord* const _base; >>> >>>How does this differ from _rs._base? Really needed? >>> >>>105 // Size, in words, of the whole node >>>106 const size_t _word_size; >>> >>>Can we not calculate this from _rs.size()? >> >> You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it >> is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly >> improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at >> some point and make those members const - as they should be - then I would rewrite this as you suggest. >> Thanks, again, for all your review work! >> >> ------ >> >> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html >> [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html >> [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Make MetaspaceGuardAllocations a diagnostic flag (2) So far I have only reviewed the C code that uses the metaspace, but not the metaspace implementation itself (memory/metaspace*, memory/metaspace/*). I also reviewed a bit of classes. 
src/hotspot/share/jvmci/jvmciCompilerToVM.hpp line 28: > 26: > 27: #include "gc/shared/cardTable.hpp" > 28: #include "gc/shared/collectedHeap.hpp" This should be changed to a forward declaration to reduce the number of headers included by headers: `class CollectedHeap;` src/hotspot/share/runtime/globals.hpp line 1584: > 1582: product(uintx, ForceCompressedClassSpaceStartAddress, 0, EXPERIMENTAL, \ > 1583: "Force class space start address to a given value.") \ > 1584: \ ForceCompressedClassSpaceStartAddress doesn't seem to be used src/hotspot/share/services/memReporter.cpp line 26: > 24: #include "precompiled.hpp" > 25: > 26: Unneeded new line src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1061: > 1059: // Delete metaspaces for unloaded class loaders and clean up loader_data graph > 1060: ClassLoaderDataGraph::purge(/*at_safepoint*/true); > 1061: DEBUG_ONLY(MetaspaceUtils::verify();) I think it will be cleaner to declare MetaspaceUtils::verify() as void verify() NOT_DEBUG_RETURN; then you can omit the `DEBUG_ONLY` at every caller. test/hotspot/jtreg/runtime/cds/MaxMetaspaceSize.java line 47: > 45: > 46: if (Platform.is64bit()) { > 47: processArgs.add("-XX:MaxMetaspaceSize=8m"); Does this mean the absolute minimal size is larger than before? I just want to confirm this. I think 3M -> 8M doesn't really matter, unless other (larger) minimums also scale up by a factor of 2.66 :-) test/hotspot/jtreg/runtime/cds/appcds/sharedStrings/LargePages.java line 49: > 47: SharedStringsUtils.dump(TestCommon.list("HelloString"), > 48: "SharedStringsBasic.txt", CDS_LOGGING, > 49: "-XX:+UseLargePages"); You can delete code from line 47 onwards as they are the same as the test cases above. src/hotspot/share/memory/metaspace.hpp line 164: > 162: // > 163: // ClassLoaderMetaspace only exists to hide this logic from upper layers: > 164: // I would suggest rewriting the comments to // A ClassLoaderMetaspace manages MetaspaceArena(s) for a CLD. 
// // A CLD owns one MetaspaceArena if UseCompressedClassPointers is false. Otherwise // it owns two -- one for the Klass* objects from the class space, one for the other types // of MetaspaceObjs from the non-class space. (I think "hide this logic ...." can be omitted. We have lots of abstractions so there's no need to explicitly call it out). ------------- Changes requested by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/336 From iklam at openjdk.java.net Wed Sep 30 04:09:10 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 30 Sep 2020 04:09:10 GMT Subject: RFR: 8247666: Support Lambda proxy classes in static CDS archive In-Reply-To: References: Message-ID: On Fri, 25 Sep 2020 17:52:24 GMT, Calvin Cheung wrote: > Following up on archiving lambda proxy classes in dynamic CDS archive > ([JDK-8198698](https://bugs.openjdk.java.net/browse/JDK-8198698)), this RFE adds the functionality of archiving of > lambda proxy classes in static CDS archive. > When the -XX:DumpLoadedClassList is enabled, the constant pool index related to LambdaMetafactory that are resolved > during application execution will be included in the classlist. The entry for a lambda proxy class in a class list will > be of the following format: > `@lambda-proxy: ` > > e.g. > `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 233` > `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 355` > > When dumping a CDS archive using the -Xshare:dump and -XX:ExtraSharedClassListFile options, when the above > `@lambda-proxy` entry is encountered while parsing the classlist, we will resolve the corresponding constant pool > indices (233 and 355 in the above example). As a result, lambda proxy classes will be generated for the constant pool > entries, and will be cached using a similar mechanism to JDK-8198698. During dumping, there is check on the cp index > and on the created BootstrapInfo using the cp index. 
VM will exit with an error message if the check has failed. > During runtime when looking up a lambda proxy class, the lookup will be performed on the static CDS archive and, if not > found, then on the dynamic archive if one is specified. (Only name change (IsDynamicDumpingEnabled -> > IsCDSDumpingEnabled) is involved in the core-libs code.) > Testing: tiers 1,2,3,4. > > Performance results (javac on HelloWorld on linux-x64): > Results of " perf stat -r 40 bin/javac -J-Xshare:on -J-XX:SharedArchiveFile=javac2.jsa Bench_HelloWorld.java " > 1: 2228016795 2067752708 (-160264087) ----- 377.760 349.110 (-28.650) ----- > 2: 2223051476 2063016483 (-160034993) ----- 374.580 350.620 (-23.960) ---- > 3: 2225908334 2067673847 (-158234487) ----- 375.220 350.990 (-24.230) ---- > 4: 2225835999 2064596883 (-161239116) ----- 374.670 349.840 (-24.830) ---- > 5: 2226005510 2061694332 (-164311178) ----- 373.512 351.120 (-22.392) ---- > 6: 2225574949 2062657482 (-162917467) ----- 374.710 348.380 (-26.330) ----- > 7: 2224702424 2064634122 (-160068302) ----- 373.670 349.510 (-24.160) ---- > 8: 2226662277 2066301134 (-160361143) ----- 375.350 349.790 (-25.560) ---- > 9: 2226761470 2063162795 (-163598675) ----- 374.260 351.290 (-22.970) ---- > 10: 2230149089 2066203307 (-163945782) ----- 374.760 350.620 (-24.140) ---- > ============================================================ > 2226266109 2064768307 (-161497801) ----- 374.848 350.126 (-24.722) ---- > instr delta = -161497801 -7.2542% > time delta = -24.722 ms -6.5951% I have reviewed a small part of the changes. Here are my initial comments src/hotspot/share/classfile/systemDictionaryShared.cpp line 336: > 334: > 335: static unsigned int dumptime_hash(Symbol* sym) { > 336: if (sym == NULL) { How about adding a comment "_invoked_name may be NULL"?
src/hotspot/share/classfile/systemDictionaryShared.cpp line 2300: > 2298: > 2299: class ArchivedLambdaMirrorPatcher { > 2300: static void update(Klass* k) { I think ArchivedLambdaMirrorPatcher can be subclass of ArchivedMirrorPatcher. That way you can share the same code for ArchivedMirrorPatcher::update (you can make this a protected method). src/hotspot/share/classfile/systemDictionaryShared.cpp line 2242: > 2240: st->print_cr("Dynamic Shared Lambda Dictionary"); > 2241: SharedLambdaDictionaryPrinter ldp(st); > 2242: _dynamic_lambda_proxy_class_dictionary.iterate(&ldp); I think this function can be refactored, something like: SystemDictionaryShared::print_on(outputStream* st) { if (UseSharedSpaces) { print_on("", &_builtin_dictionary, &_unregistered_dictionary, &_lambda_proxy_class_dictionary); if (DynamicArchive::is_mapped()) { print_on("Dynamic ", &dynamic__builtin_dictionary, &_dynamic_unregistered_dictionary, &_dynamic_lambda_proxy_class_dictionary); } } } src/hotspot/share/classfile/systemDictionaryShared.cpp line 1984: > 1982: int compare_runtime_lambda_proxy_class_info(const RunTimeLambdaProxyClassNode& r1, > 1983: const RunTimeLambdaProxyClassNode& r2) { > 1984: ResourceMark rm; Is this function being used? ------------- PR: https://git.openjdk.java.net/jdk/pull/364 From stuefe at openjdk.java.net Wed Sep 30 06:09:03 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 06:09:03 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v4] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 00:48:21 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Make MetaspaceGuardAllocations a diagnostic flag (2) > > test/hotspot/jtreg/runtime/cds/MaxMetaspaceSize.java line 47: > >> 45: >> 46: if (Platform.is64bit()) { >> 47: processArgs.add("-XX:MaxMetaspaceSize=8m"); > > Does this mean the absolute minimal size is larger than before? 
I just want to confirm this. I think 3M -> 8M doesn't > really matter, unless other (larger) minimums also scale up by a factor of 2.66 :-) On the contrary. Minimum metaspace usage to run a simple HelloWorld went way down. Before: 10,00 MB reserved, 4,75 MB ( 48%) committed Now: 12,00 MB reserved, 448,00 KB ( 3%) committed. (CDS enabled in both cases) mainly because now we commit the large chunk for the boot loader only on demand. Reservation (vsize) numbers are somewhat chunkier now since the default size for a VirtualSpaceNode went up from 4 to 8MB. You now can get away with -XX:MaxMetaspaceSize=~500K to start a simple program. While writing this, I wondered why I made this change to your test. It is not needed for the current version, probably a remnant from some earlier prototype. I will revert this line and check the other tests too. > src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1061: > >> 1059: // Delete metaspaces for unloaded class loaders and clean up loader_data graph >> 1060: ClassLoaderDataGraph::purge(/*at_safepoint*/true); >> 1061: DEBUG_ONLY(MetaspaceUtils::verify();) > > I think it will be cleaner to declare MetaspaceUtils::verify() as > > void verify() NOT_DEBUG_RETURN; > > then you can omit the `DEBUG_ONLY` at every caller. Unless you really want me to, or other Reviewers chime in, I'd rather leave it as it is. I prefer this style since it is clear at the callsites that this is only ASSERT code. > src/hotspot/share/runtime/globals.hpp line 1584: > >> 1582: product(uintx, ForceCompressedClassSpaceStartAddress, 0, EXPERIMENTAL, \ >> 1583: "Force class space start address to a given value.") \ >> 1584: \ > > ForceCompressedClassSpaceStartAddress doesn't seem to be used Good catch, will remove.
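The two styles weighed in the DEBUG_ONLY versus NOT_DEBUG_RETURN exchange above can be compared in a small stand-alone sketch. The macros below only imitate the behaviour being discussed; `SKETCH_DEBUG`, `SKETCH_DEBUG_ONLY`, `SKETCH_NOT_DEBUG_RETURN`, `StyleA`, `StyleB` and the `purge_*` functions are hypothetical names, not HotSpot code.

```cpp
#include <cassert>

// Stand-ins for the two macros, keyed on a hypothetical SKETCH_DEBUG flag.
#ifdef SKETCH_DEBUG
  #define SKETCH_DEBUG_ONLY(code) code
  #define SKETCH_NOT_DEBUG_RETURN        /* declaration only; body is elsewhere */
#else
  #define SKETCH_DEBUG_ONLY(code)        /* expands to nothing */
  #define SKETCH_NOT_DEBUG_RETURN {}     /* empty inline body in product builds */
#endif

static int g_verify_calls = 0;  // counts how often verification really ran

// Style A: the callee is ordinary; every call site is wrapped explicitly,
// so readers of the caller can see the call is debug-only.
struct StyleA {
    static void verify() { ++g_verify_calls; }
};

// Style B: the declaration carries the guard; call sites stay plain and the
// call compiles to an empty function in product builds.
struct StyleB {
    static void verify() SKETCH_NOT_DEBUG_RETURN;
};

#ifdef SKETCH_DEBUG
void StyleB::verify() { ++g_verify_calls; }
#endif

void purge_style_a() {
    SKETCH_DEBUG_ONLY(StyleA::verify();)
}

void purge_style_b() {
    StyleB::verify();
}
```

Both variants do no verification work when `SKETCH_DEBUG` is undefined; the difference is only whether that fact is visible at the call site (style A) or at the declaration (style B), which is the trade-off the two reviewers are discussing.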
> src/hotspot/share/memory/metaspace.hpp line 164: > >> 162: // >> 163: // ClassLoaderMetaspace only exists to hide this logic from upper layers: >> 164: // > > I would suggest rewriting the comments to > > // A ClassLoaderMetaspace manages MetaspaceArena(s) for a CLD. > // > // A CLD owns one MetaspaceArena if UseCompressedClassPointers is false. Otherwise > // it owns two -- one for the Klass* objects from the class space, one for the other types > // of MetaspaceObjs from the non-class space. > > (I think "hide this logic ...." can be omitted. We have lots of abstractions so there's no need to explicitly call it > out). Okay, sounds good. ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From dholmes at openjdk.java.net Wed Sep 30 06:45:41 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 30 Sep 2020 06:45:41 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 06:07:58 GMT, Patricio Chilano Mateo wrote: > Hi all, > > Please review the following patch. Current ThreadCrashProtection() implementation uses static members which requires > the use of Thread::muxAcquire() to allow only one user at a time. We can avoid this synchronization requirement if each > thread has its own ThreadCrashProtection *data. I tested it builds on Linux, macOS and Windows. Since the > JfrThreadSampler is the only one using this I run all the tests from test/jdk/jdk/jfr/. I also run some tests with JFR > enabled while forcing a crash in OSThreadSampler::protected_task() and tests passed with several "Thread method sampler > crashed" UL output. Also run tiers1-3. Thanks, Patricio Hi Patricio, This seems fine, but I'm wondering what the motivation for this change was? Adding more per-thread state is arguably just adding to the clutter of per-thread state. I don't know if this approach was considered when @robehn fixed JDK-8183925. 
Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/376 From rehn at openjdk.java.net Wed Sep 30 06:55:58 2020 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 30 Sep 2020 06:55:58 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 06:43:12 GMT, David Holmes wrote: >> Hi all, >> >> Please review the following patch. Current ThreadCrashProtection() implementation uses static members which requires >> the use of Thread::muxAcquire() to allow only one user at a time. We can avoid this synchronization requirement if each >> thread has its own ThreadCrashProtection *data. I tested it builds on Linux, macOS and Windows. Since the >> JfrThreadSampler is the only one using this I run all the tests from test/jdk/jdk/jfr/. I also run some tests with JFR >> enabled while forcing a crash in OSThreadSampler::protected_task() and tests passed with several "Thread method sampler >> crashed" UL output. Also run tiers1-3. Thanks, Patricio > > Hi Patricio, > > This seems fine, but I'm wondering what the motivation for this change was? Adding more per-thread state is arguably > just adding to the clutter of per-thread state. I don't know if this approach was considered when @robehn fixed > JDK-8183925. Thanks, > David I don't think so. I have not seen crash protection catching anything since pre-JDK 9. (we did a lot of fixes to the stack-walking code) I would remove it completely instead :) Not sure what JFR team says... ------------- PR: https://git.openjdk.java.net/jdk/pull/376 From stuefe at openjdk.java.net Wed Sep 30 07:19:04 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 07:19:04 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v5] In-Reply-To: References: Message-ID: > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). 
Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. > What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. > > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. 
> > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! > > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: cds MaxMetaspaceSize test does not need MaxMetaspaceSize increase ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/e64d8f02..20048f9d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Wed Sep 30 07:19:05 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 07:19:05 GMT Subject: RFR: 
8251158: Implementation of JEP 387: Elastic Metaspace [v4] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 00:22:19 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Make MetaspaceGuardAllocations a diagnostic flag (2) > > src/hotspot/share/jvmci/jvmciCompilerToVM.hpp line 28: > >> 26: >> 27: #include "gc/shared/cardTable.hpp" >> 28: #include "gc/shared/collectedHeap.hpp" > > This should be changed to a forward declaration to reduce the number of headers included by headers: > `class CollectedHeap;` Sure. ------------- PR: https://git.openjdk.java.net/jdk/pull/336 From shade at openjdk.java.net Wed Sep 30 08:16:40 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 30 Sep 2020 08:16:40 GMT Subject: RFR: 8253824: Revert JDK-8253089 since VS warning C4307 has been disabled Message-ID: This reverts commit 3f455f09dc738b7c76210c1080df2aaa8e19a19d. Testing: - [x] Linux x86_64 {fastdebug,release,slowdebug} builds - [ ] Windows MSVC 2017 build (cannot test it locally) ------------- Commit messages: - 8253824: Revert JDK-8253089 since VS warning has been disabled C4307 Changes: https://git.openjdk.java.net/jdk/pull/426/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=426&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253824 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/426.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/426/head:pull/426 PR: https://git.openjdk.java.net/jdk/pull/426 From shade at openjdk.java.net Wed Sep 30 08:16:40 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 30 Sep 2020 08:16:40 GMT Subject: RFR: 8253824: Revert JDK-8253089 since VS warning C4307 has been disabled In-Reply-To: References: Message-ID: <4o9s8RMTPe6tzjKx8fvFga8KzmpgL9HS1gxw-Nd44k4=.fad19bef-3810-4432-9eaa-d2a70f4be2e2@github.com> On Wed, 30 Sep 2020 08:09:39 GMT, Aleksey 
Shipilev wrote: > This reverts commit 3f455f09dc738b7c76210c1080df2aaa8e19a19d. > > Testing: > - [x] Linux x86_64 {fastdebug,release,slowdebug} builds > - [ ] Windows MSVC 2017 build (cannot test it locally) @RealCLanger, @TheRealMDoerr: need your help to test Windows MSVC 2017 builds here. I (un)fortunately lost the build capability with MSVC 2017. ------------- PR: https://git.openjdk.java.net/jdk/pull/426 From stuefe at openjdk.java.net Wed Sep 30 08:17:57 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 08:17:57 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v4] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 00:50:11 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Make MetaspaceGuardAllocations a diagnostic flag (2) > > test/hotspot/jtreg/runtime/cds/appcds/sharedStrings/LargePages.java line 49: > >> 47: SharedStringsUtils.dump(TestCommon.list("HelloString"), >> 48: "SharedStringsBasic.txt", CDS_LOGGING, >> 49: "-XX:+UseLargePages"); > > You can delete code from line 47 onwards as they are the same as the test cases above. I missed that. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/336 From chegar at openjdk.java.net Wed Sep 30 08:56:38 2020 From: chegar at openjdk.java.net (Chris Hegarty) Date: Wed, 30 Sep 2020 08:56:38 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Wed, 23 Sep 2020 08:54:57 GMT, Chris Hegarty wrote: >> Vicente Romero has updated the pull request incrementally with three additional commits since the last revision: >> >> - Merge pull request #1 from ChrisHegarty/record-serial-tests >> >> Remove preview args from JDK tests >> - Remove preview args from ObjectMethodsTest >> - Remove preview args from record serialization tests > > test/langtools/tools/javac/records/LocalStaticDeclarations.java line 33: > >> 31: * jdk.compiler/com.sun.tools.javac.util >> 32: * @build combo.ComboTestHelper >> 33: * @compile LocalStaticDeclarations.java > > This, and other, explicit @compile tags could be elided, no? The test source file will be implicitly compiled by the > @run tag. I believe that the explicit @compile tag was added originally so that the enable preview and source version > options could be passed to javac - neither of which are needed any more. Does this test need an @run tag rather than an @compile tag?
( the @run will implicitly compile the source file, before running it ) ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From chegar at openjdk.java.net Wed Sep 30 08:56:37 2020 From: chegar at openjdk.java.net (Chris Hegarty) Date: Wed, 30 Sep 2020 08:56:37 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: On Wed, 23 Sep 2020 03:34:29 GMT, Vicente Romero wrote: >> Co-authored-by: Vicente Romero >> Co-authored-by: Harold Seigel >> Co-authored-by: Jonathan Gibbons >> Co-authored-by: Brian Goetz >> Co-authored-by: Maurizio Cimadamore >> Co-authored-by: Joe Darcy >> Co-authored-by: Chris Hegarty >> Co-authored-by: Jan Lahoda > > Vicente Romero has updated the pull request incrementally with three additional commits since the last revision: > > - Merge pull request #1 from ChrisHegarty/record-serial-tests > > Remove preview args from JDK tests > - Remove preview args from ObjectMethodsTest > - Remove preview args from record serialization tests Marked as reviewed by chegar (Reviewer). test/langtools/tools/javac/records/LocalStaticDeclarations.java line 33: > 31: * jdk.compiler/com.sun.tools.javac.util > 32: * @build combo.ComboTestHelper > 33: * @compile LocalStaticDeclarations.java This, and other, explicit @compile tags could be elided, no? The test source file will be implicitly compiled by the @run tag. I believe that the explicit @compile tag was added originally so that the enable preview and source version options could be passed to javac - neither of which are needed any more.
------------- PR: https://git.openjdk.java.net/jdk/pull/290 From sgehwolf at openjdk.java.net Wed Sep 30 09:37:41 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 30 Sep 2020 09:37:41 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v5] In-Reply-To: References: Message-ID: > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value. > > Testing: Container tests on Linux x86_64 on cgroups v2 with crun 0.15 Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8253727: [cgroups v2] Memory and swap limits reported incorrectly ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/384/files - new: https://git.openjdk.java.net/jdk/pull/384/files/cc9b1087..f4c8678f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=384&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/384.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/384/head:pull/384 PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Wed Sep 30 09:37:42 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 30 Sep 2020 09:37:42 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 19:00:43 GMT, Bob Vandette wrote: >> test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java line 248: >> >>> 246: newVal = valSwap; >>> 247: } else { >>> 248: // ignore error values for valMemory, since the container runtime >> >> Add an assert?? 
> > Other than that nit, it looks fine. It's now an assert. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From mgronlun at openjdk.java.net Wed Sep 30 09:56:24 2020 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Wed, 30 Sep 2020 09:56:24 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 06:51:41 GMT, Robbin Ehn wrote: >> Hi Patricio, >> >> This seems fine, but I'm wondering what the motivation for this change was? Adding more per-thread state is arguably >> just adding to the clutter of per-thread state. I don't know if this approach was considered when @robehn fixed >> JDK-8183925. Thanks, >> David > > I don't think so. > > I have not seen crash protection catching anything since pre-JDK 9. (we did a lot of fixes to the stack-walking code) > I would remove it completely instead :) Not sure what JFR team says... Like David I would also like to know more about the motivation. Is the feature expected to be used by a larger number of threads? If so, there might be concerns about scalability that was not considered initially. I agree that we have seen less, and for a long time almost no, asserts related to thread sampling in our testing with fastdebug builds (only product builds run with the protection). At the same time, I am not sure how representative that is considering all the code that is out there. We should also keep in mind that we have upcoming features that will have slightly different stack layouts which will affect how stackwalking is achieved, so I would recommend keeping the established safety mechanism.
------------- PR: https://git.openjdk.java.net/jdk/pull/376 From stuefe at openjdk.java.net Wed Sep 30 09:59:52 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 09:59:52 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v6] In-Reply-To: References: Message-ID: <-yICiFFBE3Hmvsob4Yj9cmb8u5x_NUv6y4s--NHLH7o=.f3a57005-2eb4-40f1-aa6c-339b31577a41@github.com> > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. > What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. > > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. 
Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! 
> > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: - Merge branch 'jep387' of github.com:tstuefe/jdk into jep387 - Create submit.yml - Review feedback Ioi ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/20048f9d..ec9f7d3e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=04-05 Stats: 904 lines in 6 files changed: 886 ins; 14 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From sgehwolf at openjdk.java.net Wed Sep 30 10:10:38 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 30 Sep 2020 10:10:38 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: On Tue, 29 Sep 2020 19:00:57 GMT, Bob Vandette wrote: >> Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The >> incremental views will show differences compared to the previous content of the PR. > > Marked as reviewed by bobv (Committer). Account for interface files for swap and memory being reported independently. The cgroup v1-like value is now reported by adding the memory.max value to the memory.swap.max value, and memory.current and memory.swap.current respectively. 
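The calculation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the function names, the use of -1 as an "unlimited" sentinel, and the rule that an unlimited component makes the combined limit unlimited are all made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: -1 stands for "no limit" (cgroup v2 reports
// the literal string "max" in memory.max and memory.swap.max in that case).
constexpr int64_t kUnlimited = -1;

// Hypothetical helper: derive a cgroup v1-like "memory + swap" limit from
// the two values that cgroup v2 reports independently.
int64_t memory_and_swap_limit(int64_t memory_max, int64_t swap_max) {
  assert(memory_max >= kUnlimited && swap_max >= kUnlimited);
  // Assumed policy: if either component is unlimited, so is the sum.
  if (memory_max == kUnlimited || swap_max == kUnlimited) {
    return kUnlimited;
  }
  return memory_max + swap_max;
}

// Hypothetical helper: the usage counterpart, i.e.
// memory.current + memory.swap.current.
int64_t memory_and_swap_usage(int64_t memory_current, int64_t swap_current) {
  assert(memory_current >= 0 && swap_current >= 0);
  return memory_current + swap_current;
}
```

With these helpers, a container with memory.max of 100 and memory.swap.max of 50 would report a combined limit of 150, while an unlimited swap setting would yield an unlimited combined value.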
------------- PR: https://git.openjdk.java.net/jdk/pull/384 From sgehwolf at openjdk.java.net Wed Sep 30 10:14:15 2020 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Wed, 30 Sep 2020 10:14:15 GMT Subject: RFR: 8253727: [cgroups v2] Memory and swap limits reported incorrectly [v4] In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 10:07:40 GMT, Severin Gehwolf wrote: >> Marked as reviewed by bobv (Committer). > > Account for interface files for swap and memory being reported independently. > The cgroup v1-like value is now reported by adding the memory.max value to > the memory.swap.max value, and memory.current and memory.swap.current > respectively. @adinn Could I ask you to review this one as well, please? Thanks very much! ------------- PR: https://git.openjdk.java.net/jdk/pull/384 From stuefe at openjdk.java.net Wed Sep 30 12:42:17 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 12:42:17 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v5] In-Reply-To: References: Message-ID: On Mon, 28 Sep 2020 20:22:56 GMT, Gerard Ziemski wrote: >> hi all, >> >> Please review this change that refactors common POSIX code into a separate >> file. >> >> Currently there appears to be quite a bit of duplicated code among POSIX >> platforms, which makes it difficult to apply a single fix to the signal code. >> With this fix, we will only need to touch a single file for common POSIX >> code fixes from now on.
>> >> ---------------------------------------------------------------------------- >> The APIs which moved from os/bsd/os_bsd.cpp to to os/posix/PosixSignals.cpp: >> >> //////////////////////////////////////////////////////////////////////////////// >> // signal support >> void os::Bsd::signal_sets_init() >> sigset_t* os::Bsd::unblocked_signals() >> sigset_t* os::Bsd::vm_signals() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> //////////////////////////////////////////////////////////////////////////////// >> // sun.misc.Signal support >> static void UserHandler(int sig, void *siginfo, void *context) >> void* os::user_handler() >> void* os::signal(int signal_number, void* handler) >> void os::signal_raise(int signal_number) >> int os::sigexitnum_pd() >> static void jdk_misc_signal_init() >> void os::signal_notify(int sig) >> static int check_pending_signals() >> int os::signal_wait() >> //////////////////////////////////////////////////////////////////////////////// >> // suspend/resume support >> static void resume_clear_context(OSThread *osthread) >> static void suspend_save_context(OSThread *osthread, siginfo_t* siginfo, ucontext_t* context) >> static void SR_handler(int sig, siginfo_t* siginfo, ucontext_t* context) >> static int SR_initialize() >> static int sr_notify(OSThread* osthread) >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> /////////////////////////////////////////////////////////////////////////////////// >> // signal handling (except suspend/resume) >> static void signalHandler(int sig, siginfo_t* info, void* uc) >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> static bool call_chained_handler(struct sigaction *actp, int sig, >> siginfo_t *siginfo, void *context) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, 
bool set_installed) >> void os::Bsd::install_signal_handlers() >> static const char* get_signal_handler_name(address handler, >> char* buf, int buflen) >> static void print_signal_handler(outputStream* st, int sig, >> char* buf, size_t buflen) >> void os::run_periodic_checks() >> void os::Bsd::check_signal_handler(int sig) >> >> ----------------------------------------------------------------------------- >> The APIs which moved from os/posix/os_posix.cpp to os/posix/PosixSignals.cpp: >> >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> int os::Posix::get_signal_number(const char* signal_name) >> int os::get_signal_number(const char* signal_name) >> bool os::Posix::is_valid_signal(int sig) >> bool os::Posix::is_sig_ignored(int sig) >> const char* os::exception_name(int sig, char* buf, size_t size) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> void os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::print_siginfo(outputStream* os, const void* si0) >> bool os::signal_thread(Thread* thread, int sig, const char* reason) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> struct sigaction* os::Posix::get_preinstalled_handler(int sig) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> >> -------------------------------------------------------- >> -------------------------------------------------------- >> >> DETAILS: >> >> -------------------------------------------------------- >> Public APIs which are now internal static PosixSignals:: >> >>
sigset_t* os::Bsd::vm_signals() >> struct sigaction* os::Bsd::get_chained_signal_action(int sig) >> int os::Bsd::get_our_sigflags(int sig) >> void os::Bsd::set_our_sigflags(int sig, int flags) >> void os::Bsd::set_signal_handler(int sig, bool set_installed) >> void os::Bsd::check_signal_handler(int sig) >> const char* os::Posix::get_signal_name(int sig, char* out, size_t outlen) >> bool os::Posix::is_valid_signal(int sig) >> const char* os::Posix::describe_signal_set_short(const sigset_t* set, char* buffer, size_t buf_size) >> void os::Posix::print_signal_set_short(outputStream* st, const sigset_t* set) >> const char* os::Posix::describe_sa_flags(int flags, char* buffer, size_t size) >> void os::Posix::print_sa_flags(outputStream* st, int flags) >> static bool get_signal_code_description(const siginfo_t* si, enum_sigcode_desc_t* out) >> void os::Posix::save_preinstalled_handler(int sig, struct sigaction& oldAct) >> >> ------------------------------------------------ >> Public APIs which moved to public PosixSignals:: >> >> void os::Bsd::signal_sets_init() >> void os::Bsd::hotspot_sigmask(Thread* thread) >> bool os::Bsd::chained_handler(int sig, siginfo_t* siginfo, void* context) >> void os::Bsd::install_signal_handlers() >> bool os::Posix::is_sig_ignored(int sig) >> int os::Posix::unblock_thread_signal_mask(const sigset_t *set) >> address os::Posix::ucontext_get_pc(const ucontext_t* ctx) >> void os::Posix::ucontext_set_pc(ucontext_t* ctx, address pc) >> >> ---------------------------------------------------- >> Internal APIs which are now public in PosixSignals:: >> >> static void jdk_misc_signal_init() >> static int SR_initialize() >> static bool do_suspend(OSThread* osthread) >> static void do_resume(OSThread* osthread) >> static void print_signal_handler(outputStream* st, int sig, char* buf, size_t buflen) >> >> -------------------------- >> New APIs in PosixSignals:: >> >> static bool are_signal_handlers_installed(); > > Gerard Ziemski has updated the pull
request with a new target base due to a merge or a rebase. The pull request now > contains six commits: > - Merge branch 'master' into JDK-8252324 > - Revert "Add AIX specific SA code" > > This reverts commit cc13700d7d3f15927e22d92d9f5ec9a0739ef9a1. > - Add AIX specific SA code > - Remove leftover AIX signal code > - removed white spaces > - Refactored common POSIX signal code into seperate file Gerard, this is a great cleanup. I am fine with the changes. This will make maintenance of ports easier. As we see right in this patch, platform ports can diverge over time, especially if they are closed ports which only exist downstream (as the AIX port did for a long time at SAP). .. Thomas ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/157 From stuefe at openjdk.java.net Wed Sep 30 13:13:26 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 13:13:26 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v7] In-Reply-To: References: Message-ID: <0pXYG6UUMAT-qcInxdercC9rhSF_JRmnqoUg-nBLh8U=.871caf5f-3af1-4b2e-902f-7663afd147c8@github.com> > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. > What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. 
Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. > > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? >> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. 
It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! > > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Remove accidentally added file ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/336/files - new: https://git.openjdk.java.net/jdk/pull/336/files/ec9f7d3e..d1e413ba Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=05-06 Stats: 885 lines in 1 file changed: 0 ins; 885 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From stuefe at openjdk.java.net Wed Sep 30 13:17:27 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 30 Sep 2020 13:17:27 GMT Subject: RFR: 8251158: Implementation of JEP 387: Elastic Metaspace [v8] In-Reply-To: References: Message-ID: > Hi all, > > this is the continuation of the ongoing review for the JEP387 implementation (last rounds see [1] [2]). Sorry for the > delay, had vacation then the entrance of Skara delayed things a bit. > For the delta diff please see [3]. > > This is the first time I do a large PR after Skara, so if something is wrong please bear with me. I cannot answer all > feedback individually in this PR body, but I incorporated almost all into the new revision. 
> What changed since the last version: > > - I renamed most metaspace files back to the original naming scheme or to something similar, hopefully capturing the > group consent. > > - I changed the way allocation guards are checked if MetaspaceGuardAllocations is enabled. Before, I would test for > overwrites upon CLD destruction, but since that check was subject to VerifyMetaspaceInterval it only ran for every nth > class loader which made it rather pointless. Now I run it always. > > - I also improved the printout on block corruption, and log block corruption unconditionally before asserting. > > - I also fixed up and commented the death test which tests for allocation overwriters (test_allocationGuard.cpp) > > Side note, I find the corruption check very useful but if you guys think it is too much I still can remove the feature. > > - In ChunkManager::purge() I improved the comments after discussions with Leo. > > - I fixed a bug with VerifyMetaspaceInterval: if set to 1 the "SOMETIMES" sections were supposed to fire always, but due > to a one-off error they only fired every second time. Now, if -XX:VerifyMetaspaceInterval=1, the checks really run > every time. > > - Fixed indentation issues as Leo requested > > - Rewrote the condition and the assert in VirtualSpaceList::allocate_root_chunk() as Leo requested > > - I removed the "can_purge" logic from VirtualSpaceList. The list does not need to know. It just should iterate all nodes > and attempt purging, and if a node does not own its ReservedSpace, it refuses to be purged. That is simpler and more > flexible since it allows us to have list with purge-able and non-purge-able nodes. > > - and various smaller fixes, mainly on request of Leo. > > @lkorinth: > >> VirtualSpaceNode.hpp >> >>102 // Start pointer of the area. >>103 MetaWord* const _base; >> >>How does this differ from _rs._base? Really needed? 
>> >>105 // Size, in words, of the whole node >>106 const size_t _word_size; >> >>Can we not calculate this from _rs.size()? > > You are right, _base and _word_size are directly related to the underlying space. But I'd prefer to leave it the way it > is. Mainly because ReservedSpace::_base and ::_size are nonconst and theoretically can change under me. It is highly > improbable but I'd like to know. Note that VirtualSpaceNode::verify checks that. Should we clean up ReservedSpace at > some point and make those members const - as they should be - then I would rewrite this as you suggest. > Thanks, again, for all your review work! > > ------ > > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041162.html > [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-September/041628.html > [3] https://github.com/openjdk/jdk/commit/731f795bc0c1c502dc6cac8f866ff45a15bdd02d Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'master' into jep387 - Remove accidentally added file - Merge branch 'jep387' of github.com:tstuefe/jdk into jep387 - Create submit.yml - Review feedback Ioi - cds MaxMetaspaceSize test does not need MaxMetaspaceSize increase - Make MetaspaceGuardAllocations a diagnostic flag (2) - Make MetaspaceGuardAllocations a diagnostic flag - Style fixes - Remove empty lines from include sections - ... 
and 3 more: https://git.openjdk.java.net/jdk/compare/f80a6066...f5cf615b ------------- Changes: https://git.openjdk.java.net/jdk/pull/336/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=336&range=07 Stats: 23029 lines in 170 files changed: 14874 ins; 6918 del; 1237 mod Patch: https://git.openjdk.java.net/jdk/pull/336.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/336/head:pull/336 PR: https://git.openjdk.java.net/jdk/pull/336 From pchilanomate at openjdk.java.net Wed Sep 30 14:07:53 2020 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Wed, 30 Sep 2020 14:07:53 GMT Subject: RFR: 8253694: Remove Thread::muxAcquire() from ThreadCrashProtection() In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 09:53:52 GMT, Markus Grönlund wrote: >> I don't think so. >> >> I have not seen crash protection catching anything since pre-JDK 9. (we did a lot of fixes to the stack-walking code) >> I would remove it completely instead :) Not sure what JFR team says... > > Like David I would also like to know more about the motivation. Is the feature expected to be used by a larger number > of threads? If so, there might be concerns about scalability that was not considered initially. > I agree that we have seen less, and for a long time almost no, asserts related to thread sampling in our testing with > fastdebug builds (only product builds run with the protection). At the same time, I am not sure how representative that > is considering all the code that is out there. We should also keep in mind that we have upcoming features that will > have slightly different stack layouts which will affect how stackwalking is achieved, so I would recommend keeping the > established safety mechanism. I looked at the users of Thread::muxAcquire/muxRelease and this was one of the two places where it is used. If we are going to have a crash protection mechanism for general use then the fields should not be static.
Now, if we know only the JfrThreadSampler uses it and we want to optimize away that pointer in the thread object then that makes sense, but then we should remove _crash_mux. ------------- PR: https://git.openjdk.java.net/jdk/pull/376 From vromero at openjdk.java.net Wed Sep 30 14:50:40 2020 From: vromero at openjdk.java.net (Vicente Romero) Date: Wed, 30 Sep 2020 14:50:40 GMT Subject: RFR: 8246774: implementing Record Classes as a standard feature in Java [v3] In-Reply-To: References: <48S0UHUnWOQmJO6ErAIDgerNxM4Ibm9anIDZAdcKBp0=.32180f4d-1096-4645-8b23-54aa9f0300fb@github.com> Message-ID: <3HQG1K8tFl8GYpUnm_gbEH8MqTKMQ17fONGeb0m1RnE=.c74d767c-e371-471a-b963-944460e468f7@github.com> On Wed, 30 Sep 2020 08:54:14 GMT, Chris Hegarty wrote: >> test/langtools/tools/javac/records/LocalStaticDeclarations.java line 33: >> >>> 31: * jdk.compiler/com.sun.tools.javac.util >>> 32: * @build combo.ComboTestHelper >>> 33: * @compile LocalStaticDeclarations.java >> >> This, and other, explicit at compile tags could be elided, no? The test source file will be implicitly compiled by the >> at run tag. I believe that the explicit at compile tag was added original so that the enable preview and source version >> options could be passed to javac - neither of which are needed any more. > > Does this test need an @run tag rather than an @compile tag ? 
( the @run with implicitly compile the source file, > before running it ) yep u are right it could be removed, I will check all the instances of this, thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/290 From github.com+4146708+a74nh at openjdk.java.net Wed Sep 30 16:29:53 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 30 Sep 2020 16:29:53 GMT Subject: RFR: 8253843: AArch64: Use ishst for storestore barrier Message-ID: <4bvHnsfvpM8MDmGCQgOkn2ovdeeGLJO5Vh8tYR9p2mc=.4563c186-e164-4295-a49b-06ce715fef72@github.com> AArch64 orderAccess uses gcc built in atomic functions, which expand inline to DMB barrier instructions. Specifically, they call the following: FULL_MEM_BARRIER -> DMB ISH READ_MEM_BARRIER -> DMB ISHLD WRITE_MEM_BARRIER -> DMB ISH However, storestore should be optimised to use ISHST. In addition, __sync_synchronize is marked as legacy. See: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html In order for the code to match, I switched everything to call dmbs directly. Also, add AArch64 to the orderAccess documentation table. 
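The store-store ordering described above can be illustrated with a minimal C++ sketch. It is not the code from the patch: `storestore_barrier`, `Message`, `publish` and `try_consume` are hypothetical names chosen for illustration, and the non-AArch64 branch falls back to a release fence, which is stronger than a pure store-store barrier because standard C++ offers no store-store-only fence.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, not the HotSpot OrderAccess API.
inline void storestore_barrier() {
#if defined(__aarch64__) && defined(__GNUC__)
  // DMB ISHST: earlier stores are ordered before later stores
  // (inner shareable domain). The "memory" clobber also stops the
  // compiler from moving memory accesses across the barrier.
  __asm__ volatile("dmb ishst" : : : "memory");
#else
  // Assumption for other targets: use a release fence, which orders
  // more than store-store but is the closest portable substitute.
  std::atomic_thread_fence(std::memory_order_release);
#endif
}

// Hypothetical payload/flag pair used to show the publish pattern.
struct Message {
  std::atomic<int> payload{0};
  std::atomic<bool> ready{false};
};

// Writer: store the payload, then the flag, with a store-store
// barrier in between so the flag never becomes visible first.
inline void publish(Message& m, int value) {
  m.payload.store(value, std::memory_order_relaxed);
  storestore_barrier();
  m.ready.store(true, std::memory_order_relaxed);
}

// Reader: only read the payload after observing the flag.
inline bool try_consume(Message& m, int& out) {
  if (!m.ready.load(std::memory_order_acquire)) {
    return false;
  }
  out = m.payload.load(std::memory_order_relaxed);
  return true;
}
```

A writer thread would call `publish(m, 42)` and a reader would poll `try_consume(m, out)`. On the AArch64 branch the ordering guarantee comes from the hardware barrier rather than from the C++ memory model, which is why both fields are atomics in this sketch.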
------------- Commit messages: - AArch64: Use ishst for storestore barrier Changes: https://git.openjdk.java.net/jdk/pull/427/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=427&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8253843 Stats: 37 lines in 3 files changed: 11 ins; 0 del; 26 mod Patch: https://git.openjdk.java.net/jdk/pull/427.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/427/head:pull/427 PR: https://git.openjdk.java.net/jdk/pull/427 From github.com+4146708+a74nh at openjdk.java.net Wed Sep 30 16:29:53 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 30 Sep 2020 16:29:53 GMT Subject: RFR: 8253843: AArch64: Use ishst for storestore barrier In-Reply-To: <4bvHnsfvpM8MDmGCQgOkn2ovdeeGLJO5Vh8tYR9p2mc=.4563c186-e164-4295-a49b-06ce715fef72@github.com> References: <4bvHnsfvpM8MDmGCQgOkn2ovdeeGLJO5Vh8tYR9p2mc=.4563c186-e164-4295-a49b-06ce715fef72@github.com> Message-ID: On Wed, 30 Sep 2020 08:34:14 GMT, Alan Hayward wrote: > AArch64 orderAccess uses gcc built in atomic functions, which expand > inline to DMB barrier instructions. Specifically, they call the following: > > FULL_MEM_BARRIER -> DMB ISH > READ_MEM_BARRIER -> DMB ISHLD > WRITE_MEM_BARRIER -> DMB ISH > > However, storestore should be optimised to use ISHST. > > In addition, __sync_synchronize is marked as legacy. See: > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html > > In order for the code to match, I switched everything to call dmbs directly. > > Also, add AArch64 to the orderAccess documentation table. > /signed this was a mistake. 
Meant to do /covered ------------- PR: https://git.openjdk.java.net/jdk/pull/427 From github.com+4146708+a74nh at openjdk.java.net Wed Sep 30 16:29:54 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 30 Sep 2020 16:29:54 GMT Subject: RFR: 8253843: AArch64: Use ishst for storestore barrier In-Reply-To: References: <4bvHnsfvpM8MDmGCQgOkn2ovdeeGLJO5Vh8tYR9p2mc=.4563c186-e164-4295-a49b-06ce715fef72@github.com> Message-ID: On Wed, 30 Sep 2020 08:44:37 GMT, Alan Hayward wrote: >> AArch64 orderAccess uses gcc built in atomic functions, which expand >> inline to DMB barrier instructions. Specifically, they call the following: >> >> FULL_MEM_BARRIER -> DMB ISH >> READ_MEM_BARRIER -> DMB ISHLD >> WRITE_MEM_BARRIER -> DMB ISH >> >> However, storestore should be optimised to use ISHST. >> >> In addition, __sync_synchronize is marked as legacy. See: >> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html >> >> In order for the code to match, I switched everything to call dmbs directly. >> >> Also, add AArch64 to the orderAccess documentation table. > >> /signed > > this was a mistake. Meant to do /covered There doesn't seem to be an AArch64-port label. Is there any way to get AArch64 patches posted to the AArch64-port-dev mailing list? ------------- PR: https://git.openjdk.java.net/jdk/pull/427 From github.com+4146708+a74nh at openjdk.java.net Wed Sep 30 16:29:54 2020 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 30 Sep 2020 16:29:54 GMT Subject: RFR: 8253843: AArch64: Use ishst for storestore barrier In-Reply-To: References: <4bvHnsfvpM8MDmGCQgOkn2ovdeeGLJO5Vh8tYR9p2mc=.4563c186-e164-4295-a49b-06ce715fef72@github.com> Message-ID: On Wed, 30 Sep 2020 10:05:58 GMT, Alan Hayward wrote: >>> /signed >> >> this was a mistake. Meant to do /covered > > There doesn't seem to be an AArch64-port label. Is there any way to get AArch64 patches posted to the AArch64-port-dev > mailing list? 
I'm also getting CI failures unrelated to this patch, due to disks being full: https://github.com/a74nh/jdk/actions/runs/280072223 ------------- PR: https://git.openjdk.java.net/jdk/pull/427 From dcubed at openjdk.java.net Wed Sep 30 17:03:10 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 30 Sep 2020 17:03:10 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: On Tue, 29 Sep 2020 15:17:46 GMT, Thomas Stuefe wrote: >> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report >> thread's state after thread terminated. >> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid >> thread stack, etc. >> This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated >> thread from live thread. >> Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). >> >> Note: Java thread does not have such issue, its thread object is deleted before thread terminates. > > Hi Zhengyu, > > I'm updating my review after reading through your conversation with David. Save for small nits this seem fine. > > Cheers, Thomas I think we're approaching this problem incorrectly. David mentioned this in the bug report: > As the reporting is done by the thread closure of the target subsystem > this is not a runtime issue in this case but a GC issue. To me, the first part of that sentence is the important part. It is indeed a thread closure that causes us to reach the terminated thread. It is also a thread closure that is used by Thread-SMR to determine when a thread's ThreadsList protects JavaThreads. 
In particular:

src/hotspot/share/runtime/threadSMR.cpp:

    bool ThreadsSMRSupport::is_a_protected_JavaThread(JavaThread *thread) {

uses a ScanHazardPtrGatherProtectedThreadsClosure passed to ThreadsSMRSupport::threads_do() to gather all the protected JavaThread*. This threads_do() function applies the closure to all threads in the system: JavaThreads on 'list' and all the non-JavaThreads:

src/hotspot/share/runtime/threadSMR.cpp:

    void ThreadsSMRSupport::threads_do(ThreadClosure *tc, ThreadsList *list) {
      list->threads_do(tc);
      Threads::non_java_threads_do(tc);
    }

So if a particular non-JavaThread is still found via Threads::non_java_threads_do(), then any ThreadsList that it holds protects JavaThread*'s even if that non-JavaThread has terminated. That means that calling ThreadsSMRSupport::print_info_on() is a valid thing to do because the non-JavaThread is still participating in Thread-SMR related decisions.

I have no problem with the part where we set the ZOMBIE state as a marker for a terminated non-JavaThread, but we need to determine why that terminated thread is still being found by Threads::non_java_threads_do() and whether it is safe to remove that non-JavaThread from whatever list is holding it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/341

From dcubed at openjdk.java.net Wed Sep 30 17:05:55 2020
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Wed, 30 Sep 2020 17:05:55 GMT
Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads
In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com>
References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com>
Message-ID:

On Thu, 24 Sep 2020 18:14:10 GMT, Zhengyu Gu wrote:

> For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report
> thread's state after thread terminated.
> But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid > thread stack, etc. > This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated > thread from live thread. > Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). > > Note: Java thread does not have such issue, its thread object is deleted before thread terminates. Changes requested by dcubed (Reviewer). src/hotspot/share/runtime/thread.cpp line 1345: > 1343: // Ensure thread-local-storage is cleared before termination. > 1344: Thread::clear_thread_current(); > 1345: osthread()->set_state(ZOMBIE); I'm okay with this change as a debugging marker that the non-JavaThread has terminated. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From dcubed at openjdk.java.net Wed Sep 30 17:05:56 2020 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 30 Sep 2020 17:05:56 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> <-MGl9WfKGYhIp3dk96wRn86qovh1QItkxm6occqWpao=.87a44d0d-681e-4118-9994-23faf1703614@github.com> Message-ID: On Tue, 29 Sep 2020 15:14:32 GMT, Thomas Stuefe wrote: >> I prefer no change to this method. I don't see that we need to do anything special even if a ZOMBIE could be >> encountered. > > If Thread::print_on() (and ThreadsSMRSupport::print_info_on(Thread..)) cannot be called for a Zombie thread, I'd prefer > an assertion testing that. I think this part is not necessary. Please see my general comment about the thread closures. 
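The thread-closure argument made in this thread can be sketched in a few lines of standalone C++. Everything below is a simplified, hypothetical model (the `sketch` namespace, `Thread`, `State`, `CountingClosure` and the two-vector `threads_do` are illustrative names, not the HotSpot classes); it only shows that a closure applied to every registered thread still reaches a terminated non-JavaThread for as long as that thread stays on a list.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for the real thread state and thread object.
enum class State { RUNNABLE, ZOMBIE };

struct Thread {
  std::string name;
  State state;
};

// Closure interface: one callback per visited thread.
struct ThreadClosure {
  virtual ~ThreadClosure() = default;
  virtual void do_thread(Thread* t) = 0;
};

// Applies the closure to the Java threads first and then to all
// non-Java threads, mirroring the shape of the threads_do() quoted
// earlier in this thread (the two vectors are an assumption).
inline void threads_do(ThreadClosure* tc,
                       const std::vector<Thread*>& java_threads,
                       const std::vector<Thread*>& non_java_threads) {
  for (Thread* t : java_threads) {
    tc->do_thread(t);
  }
  for (Thread* t : non_java_threads) {
    tc->do_thread(t);
  }
}

// Example closure: counts every visited thread and, separately, the
// ones already marked as terminated.
struct CountingClosure : ThreadClosure {
  int visited = 0;
  int zombies = 0;
  void do_thread(Thread* t) override {
    ++visited;
    if (t->state == State::ZOMBIE) {
      ++zombies;
    }
  }
};

}  // namespace sketch
```

In this model a ZOMBIE marker changes what the closure can report about a thread but not whether the thread is visited; that only changes once the thread is removed from the list being iterated.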
------------- PR: https://git.openjdk.java.net/jdk/pull/341 From zgu at openjdk.java.net Wed Sep 30 20:17:46 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 30 Sep 2020 20:17:46 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads [v2] In-Reply-To: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: > For some non-JavaThread, their object instances can outlast threads' lifespan. For example, we still can query/report > thread's state after thread terminated. > But the query/report currently returns wrong state. E.g. a terminated thread appears to be alive and seemly has valid > thread stack, etc. > This patch sets non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish terminated > thread from live thread. > Also, thread should not report its SMR info, if it has terminated or it never started (thread->osthread() == NULL). > > Note: Java thread does not have such issue, its thread object is deleted before thread terminates. 
Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Update ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/341/files - new: https://git.openjdk.java.net/jdk/pull/341/files/36be40e9..dae7a0de Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=341&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=341&range=00-01 Stats: 9 lines in 1 file changed: 2 ins; 5 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/341.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/341/head:pull/341 PR: https://git.openjdk.java.net/jdk/pull/341 From zgu at openjdk.java.net Wed Sep 30 20:24:16 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 30 Sep 2020 20:24:16 GMT Subject: RFR: 8253429: Error reporting should report correct state of terminated/aborted threads [v2] In-Reply-To: References: <5n1FG7ZdpZblWlv7-4An1W-mUTFhw3ugf2b85X-ALeQ=.e8a38616-ca96-4d4e-83e8-a886bafa9f92@github.com> Message-ID: On Wed, 30 Sep 2020 17:03:24 GMT, Daniel D. Daugherty wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Update > > Changes requested by dcubed (Reviewer). Sorry for the delay. Hopefully, I understood and captured your comments correctly. ------------- PR: https://git.openjdk.java.net/jdk/pull/341 From gziemski at openjdk.java.net Wed Sep 30 22:44:38 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 30 Sep 2020 22:44:38 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v5] In-Reply-To: References: Message-ID: <0_pPVrBJAGbaVyHMnVhS2E0ByTkolg4tKpUH4N8fplg=.eb0fa0d9-44f8-4c63-8322-bd2320d3000a@github.com> On Wed, 30 Sep 2020 12:39:47 GMT, Thomas Stuefe wrote: > Gerard, this is a great cleanup. I am fine with the changes. > > This will make maintenance of ports easier. 
As we see right in this patch, platform ports can diverge over time, > especially if they are closed ports which only exist downstream (as the AIX port did for a long time at SAP). > .. Thomas Thank you Martin for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From gziemski at openjdk.java.net Wed Sep 30 22:52:44 2020 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 30 Sep 2020 22:52:44 GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v5] In-Reply-To: <0_pPVrBJAGbaVyHMnVhS2E0ByTkolg4tKpUH4N8fplg=.eb0fa0d9-44f8-4c63-8322-bd2320d3000a@github.com> References: <0_pPVrBJAGbaVyHMnVhS2E0ByTkolg4tKpUH4N8fplg=.eb0fa0d9-44f8-4c63-8322-bd2320d3000a@github.com> Message-ID: <_RJ0u-P40RQfJNTWyHNYS3uh4TCeHY8ljaS5CLGgnFY=.3bbd9ea2-0d27-4562-a1bb-f0eadb374c50@github.com> On Wed, 30 Sep 2020 22:42:16 GMT, Gerard Ziemski wrote: >> Gerard, this is a great cleanup. I am fine with the changes. >> >> This will make maintenance of ports easier. As we see right in this patch, platform ports can diverge over time, >> especially if they are closed ports which only exist downstream (as the AIX port did for a long time at SAP). >> .. Thomas > >> Gerard, this is a great cleanup. I am fine with the changes. >> >> This will make maintenance of ports easier. As we see right in this patch, platform ports can diverge over time, >> especially if they are closed ports which only exist downstream (as the AIX port did for a long time at SAP). >> .. Thomas > > Thank you Martin for the review! Sorry for taking so long, but this is my 1st github pr and I did not liked the merge issues that David discovered and tried to redo the merge, but I keep getting it wrong. I just forced pushed update removing the BIG merge and I keep working on it, hopefully without introducing another huge merge. If anyone has tips on how to apply local changes (i.e. rebase/stash?) without having to merge, please let me know. 
Once I figure this out, I will re-implement the few, non-merge, issues that David discovered and this will be ready for integration. ------------- PR: https://git.openjdk.java.net/jdk/pull/157 From ccheung at openjdk.java.net Wed Sep 30 23:17:18 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 30 Sep 2020 23:17:18 GMT Subject: RFR: 8247666: Support Lambda proxy classes in static CDS archive In-Reply-To: References: Message-ID: On Wed, 30 Sep 2020 03:43:51 GMT, Ioi Lam wrote: >> Following up on archiving lambda proxy classes in dynamic CDS archive >> ([JDK-8198698](https://bugs.openjdk.java.net/browse/JDK-8198698)), this RFE adds the functionality of archiving of >> lambda proxy classes in static CDS archive. >> When the -XX:DumpLoadedClassList is enabled, the constant pool index related to LambdaMetafactory that are resolved >> during application execution will be included in the classlist. The entry for a lambda proxy class in a class list will >> be of the following format: >> `@lambda-proxy: ` >> >> e.g. >> `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 233` >> `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 355` >> >> When dumping a CDS archive using the -Xshare:dump and -XX:ExtraSharedClassListFile options, when the above >> `@lambda-proxy` entry is encountered while parsing the classlist, we will resolve the corresponding constant pool >> indices (233 and 355 in the above example). As a result, lambda proxy classes will be generated for the constant pool >> entries, and will be cached using a similar mechanism to JDK-8198698. During dumping, there is check on the cp index >> and on the created BootstrapInfo using the cp index. VM will exit with an error message if the check has failed. >> During runtime when looking up a lambda proxy class, the lookup will be perform on the static CDS archive and if not >> found, then lookup from the dynamic archive if one is specified. 
(Only name change (IsDynamicDumpingEnabled ->
>> IsCDSDumpingEnabled) is involved in the core-libs code.)
>> Testing: tiers 1,2,3,4.
>>
>> Performance results (javac on HelloWorld on linux-x64):
>> Results of " perf stat -r 40 bin/javac -J-Xshare:on -J-XX:SharedArchiveFile=javac2.jsa Bench_HelloWorld.java "
>>    1:   2228016795   2067752708 (-160264087)  -----   377.760   349.110 (-28.650)  -----
>>    2:   2223051476   2063016483 (-160034993)  -----   374.580   350.620 (-23.960)  ----
>>    3:   2225908334   2067673847 (-158234487)  -----   375.220   350.990 (-24.230)  ----
>>    4:   2225835999   2064596883 (-161239116)  -----   374.670   349.840 (-24.830)  ----
>>    5:   2226005510   2061694332 (-164311178)  -----   373.512   351.120 (-22.392)  ----
>>    6:   2225574949   2062657482 (-162917467)  -----   374.710   348.380 (-26.330)  -----
>>    7:   2224702424   2064634122 (-160068302)  -----   373.670   349.510 (-24.160)  ----
>>    8:   2226662277   2066301134 (-160361143)  -----   375.350   349.790 (-25.560)  ----
>>    9:   2226761470   2063162795 (-163598675)  -----   374.260   351.290 (-22.970)  ----
>>   10:   2230149089   2066203307 (-163945782)  -----   374.760   350.620 (-24.140)  ----
>> ============================================================
>>         2226266109   2064768307 (-161497801)  -----   374.848   350.126 (-24.722)  ----
>> instr delta = -161497801 -7.2542%
>> time delta = -24.722 ms -6.5951%
>
> src/hotspot/share/classfile/systemDictionaryShared.cpp line 336:
>
>> 334:
>> 335: static unsigned int dumptime_hash(Symbol* sym) {
>> 336:   if (sym == NULL) {
>
> How about adding a comment "_invoked_name may be NULL"?

Comment added.

> src/hotspot/share/classfile/systemDictionaryShared.cpp line 2300:
>
>> 2298:
>> 2299: class ArchivedLambdaMirrorPatcher {
>> 2300:   static void update(Klass* k) {
>
> I think ArchivedLambdaMirrorPatcher can be subclass of ArchivedMirrorPatcher. That way you can share the same code for
> ArchivedMirrorPatcher::update (you can make this a protected method).

I've changed the ArchivedLambdaMirrorPatcher to the following:

    class ArchivedLambdaMirrorPatcher : public ArchivedMirrorPatcher {
    public:
      void do_value(const RunTimeLambdaProxyClassInfo* info) {
        InstanceKlass* ik = info->proxy_klass_head();
        while (ik != NULL) {
          update(ik);
          Klass* k = ik->next_link();
          ik = (k != NULL) ? InstanceKlass::cast(k) : NULL;
        }
      }
    };

> src/hotspot/share/classfile/systemDictionaryShared.cpp line 2242:
>
>> 2240:     st->print_cr("Dynamic Shared Lambda Dictionary");
>> 2241:     SharedLambdaDictionaryPrinter ldp(st);
>> 2242:     _dynamic_lambda_proxy_class_dictionary.iterate(&ldp);
>
> I think this function can be refactored, something like:
>
>     SystemDictionaryShared::print_on(outputStream* st) {
>       if (UseSharedSpaces) {
>         print_on("", &_builtin_dictionary, &_unregistered_dictionary, &_lambda_proxy_class_dictionary);
>         if (DynamicArchive::is_mapped()) {
>           print_on("Dynamic ", &dynamic__builtin_dictionary, &_dynamic_unregistered_dictionary,
>                    &_dynamic_lambda_proxy_class_dictionary);
>         }
>       }
>     }

I've added the following print_on function for code refactoring:

    void SystemDictionaryShared::print_on(const char* prefix,
                                          RunTimeSharedDictionary builtin_dictionary,
                                          RunTimeSharedDictionary unregistered_dictionary,
                                          LambdaProxyClassDictionary lambda_dictionary,
                                          outputStream* st) {
      st->print_cr("%sShared Dictionary", prefix);
      SharedDictionaryPrinter p(st);
      builtin_dictionary.iterate(&p);
      unregistered_dictionary.iterate(&p);
      if (!lambda_dictionary.empty()) {
        st->print_cr("%sShared Lambda Dictionary", prefix);
        SharedLambdaDictionaryPrinter ldp(st);
        lambda_dictionary.iterate(&ldp);
      }
    }

> src/hotspot/share/classfile/systemDictionaryShared.cpp line 1984:
>
>> 1982: int compare_runtime_lambda_proxy_class_info(const RunTimeLambdaProxyClassNode& r1,
>> 1983:                                             const RunTimeLambdaProxyClassNode& r2) {
>> 1984:   ResourceMark rm;
>
> Is this function being used?

No, it is an unused function. I've removed it.
------------- PR: https://git.openjdk.java.net/jdk/pull/364 From ccheung at openjdk.java.net Wed Sep 30 23:25:00 2020 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 30 Sep 2020 23:25:00 GMT Subject: RFR: 8247666: Support Lambda proxy classes in static CDS archive [v2] In-Reply-To: References: Message-ID: > Following up on archiving lambda proxy classes in dynamic CDS archive > ([JDK-8198698](https://bugs.openjdk.java.net/browse/JDK-8198698)), this RFE adds the functionality of archiving of > lambda proxy classes in static CDS archive. > When the -XX:DumpLoadedClassList is enabled, the constant pool index related to LambdaMetafactory that are resolved > during application execution will be included in the classlist. The entry for a lambda proxy class in a class list will > be of the following format: > `@lambda-proxy: ` > > e.g. > `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 233` > `@lambda-proxy: test/java/lang/invoke/MethodHandlesGeneralTest 355` > > When dumping a CDS archive using the -Xshare:dump and -XX:ExtraSharedClassListFile options, when the above > `@lambda-proxy` entry is encountered while parsing the classlist, we will resolve the corresponding constant pool > indices (233 and 355 in the above example). As a result, lambda proxy classes will be generated for the constant pool > entries, and will be cached using a similar mechanism to JDK-8198698. During dumping, there is check on the cp index > and on the created BootstrapInfo using the cp index. VM will exit with an error message if the check has failed. > During runtime when looking up a lambda proxy class, the lookup will be perform on the static CDS archive and if not > found, then lookup from the dynamic archive if one is specified. (Only name change (IsDynamicDumpingEnabled -> > IsCDSDumpingEnabled) is involved in the core-libs code.) > Testing: tiers 1,2,3,4. 
>
> Performance results (javac on HelloWorld on linux-x64):
> Results of " perf stat -r 40 bin/javac -J-Xshare:on -J-XX:SharedArchiveFile=javac2.jsa Bench_HelloWorld.java "
>    1:   2228016795   2067752708 (-160264087)  -----   377.760   349.110 (-28.650)  -----
>    2:   2223051476   2063016483 (-160034993)  -----   374.580   350.620 (-23.960)  ----
>    3:   2225908334   2067673847 (-158234487)  -----   375.220   350.990 (-24.230)  ----
>    4:   2225835999   2064596883 (-161239116)  -----   374.670   349.840 (-24.830)  ----
>    5:   2226005510   2061694332 (-164311178)  -----   373.512   351.120 (-22.392)  ----
>    6:   2225574949   2062657482 (-162917467)  -----   374.710   348.380 (-26.330)  -----
>    7:   2224702424   2064634122 (-160068302)  -----   373.670   349.510 (-24.160)  ----
>    8:   2226662277   2066301134 (-160361143)  -----   375.350   349.790 (-25.560)  ----
>    9:   2226761470   2063162795 (-163598675)  -----   374.260   351.290 (-22.970)  ----
>   10:   2230149089   2066203307 (-163945782)  -----   374.760   350.620 (-24.140)  ----
> ============================================================
>         2226266109   2064768307 (-161497801)  -----   374.848   350.126 (-24.722)  ----
> instr delta = -161497801 -7.2542%
> time delta = -24.722 ms -6.5951%

Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision:

  updated systemDictionaryShared.[c|h]pp based on suggestions from Ioi

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/364/files
  - new: https://git.openjdk.java.net/jdk/pull/364/files/4a55fddc..d66667d4

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=364&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=364&range=00-01

  Stats: 68 lines in 2 files changed: 23 ins; 41 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/364.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/364/head:pull/364

PR: https://git.openjdk.java.net/jdk/pull/364

From dholmes at openjdk.java.net Wed Sep 30 23:27:08 2020
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 30 Sep 2020 23:27:08
GMT Subject: RFR: 8252324: Signal related code should be shared among POSIX platforms [v5] In-Reply-To: <_RJ0u-P40RQfJNTWyHNYS3uh4TCeHY8ljaS5CLGgnFY=.3bbd9ea2-0d27-4562-a1bb-f0eadb374c50@github.com> References: <0_pPVrBJAGbaVyHMnVhS2E0ByTkolg4tKpUH4N8fplg=.eb0fa0d9-44f8-4c63-8322-bd2320d3000a@github.com> <_RJ0u-P40RQfJNTWyHNYS3uh4TCeHY8ljaS5CLGgnFY=.3bbd9ea2-0d27-4562-a1bb-f0eadb374c50@github.com> Message-ID: On Wed, 30 Sep 2020 22:47:59 GMT, Gerard Ziemski wrote: >>> Gerard, this is a great cleanup. I am fine with the changes. >>> >>> This will make maintenance of ports easier. As we see right in this patch, platform ports can diverge over time, >>> especially if they are closed ports which only exist downstream (as the AIX port did for a long time at SAP). >>> .. Thomas >> >> Thank you Martin for the review! > > Sorry for taking so long, but this is my 1st github pr and I did not liked the merge issues that David discovered and > tried to redo the merge, but I keep getting it wrong. > I just forced pushed update removing the BIG merge and I keep working on it, hopefully without introducing another huge > merge. If anyone has tips on how to apply local changes (i.e. rebase/stash?) without having to merge, please let me > know. Once I figure this out, I will re-implement the few, non-merge, issues that David discovered and this will be > ready for integration. @gerard-ziemski Please do not force-push any commits in an open PR as it breaks the commit history and the chain of review comments. I will not be able to see the merge issues actually fixed as a diff from the previous commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/157